Audio Voiceover Direction Pack

Category: creative
Subcategory: audio-production
Difficulty: beginner
Target models: gpt, gemini-pro, claude-opus
Variables: {{project_goal}}, {{audience}}, {{script_text}}, {{tone_targets}}, {{duration_target}}, {{constraints}}
Tags: audio, voiceover, narration, creative-ops, speech
Updated: February 28, 2026

The Prompt

You are a voice direction assistant. Build a practical voiceover direction pack from the inputs below.

PROJECT GOAL:
{{project_goal}}

AUDIENCE:
{{audience}}

SCRIPT TEXT:
{{script_text}}

TONE TARGETS:
{{tone_targets}}

DURATION TARGET:
{{duration_target}}

CONSTRAINTS:
{{constraints}}

Return:
1. A one-paragraph intent summary for the narrator.
2. Three voice direction variants (A/B/C), each with:
   - emotional posture
   - pacing guidance
   - emphasis map (which words/phrases carry weight)
   - pause markers
   - pronunciation watch-outs
3. A line-by-line annotation of the script with delivery notes.
4. A quality checklist for review (clarity, tone fit, pacing, naturalness, audience fit).
5. A fallback version for low-capability systems that support only plain-text narration guidance.

Rules:
- Keep guidance tool-agnostic.
- Avoid references to specific TTS vendors, voices, or proprietary parameters.
- Keep directions concrete and actionable.

When to Use

Use this when you already have a script and need clearer direction before recording a human narrator or generating synthetic narration. It suits product demos, explainers, campaign voiceovers, onboarding videos, and social clips, where tone and pacing largely determine quality.

Variables

  • project_goal: What the narration should accomplish.
  • audience: Who will listen and how familiar they are with the subject.
  • script_text: The current script draft.
  • tone_targets: Desired style words (for example: calm, confident, warm, urgent).
  • duration_target: Approximate desired runtime.
  • constraints: Legal, brand, language, pronunciation, or accessibility constraints.
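One way to fill the variables above before sending the prompt is a small substitution helper. The sketch below is a minimal example, assuming the {{name}} placeholder syntax used in The Prompt; the function name fill_template, the shortened template excerpt, and the sample values are hypothetical and only illustrate the idea.

```python
import re

# Matches placeholders of the form {{name}}, as used in the prompt template.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(template: str, values: dict) -> str:
    """Replace every {{name}} in template with values[name].

    Hypothetical helper: raises KeyError if a placeholder has no value,
    so an unfilled variable is never sent to the model.
    """
    names = set(PLACEHOLDER.findall(template))
    missing = sorted(names - set(values))
    if missing:
        raise KeyError("missing variables: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Shortened excerpt of the prompt, for illustration only.
template = "PROJECT GOAL:\n{{project_goal}}\n\nAUDIENCE:\n{{audience}}"

# Made-up sample values.
filled = fill_template(
    template,
    {
        "project_goal": "Introduce the new dashboard",
        "audience": "First-time users",
    },
)
print(filled)
```

Failing loudly on a missing variable is a design choice in this sketch; a pipeline that prefers defaults could substitute an empty string instead.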

Tips & Variations

  • Ask for a “plain-language” and an “expert” version when you need two audience tiers.
  • If your system cannot synthesize audio directly, use the output as a recording brief for human narration.
  • Add an accessibility variant with slower pacing and clearer articulation cues.
  • For multilingual output, ask the model to state its pronunciation assumptions and to list terms that must remain untranslated.
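Pacing guidance is easier to judge when you know the speaking rate that script_text and duration_target imply together. The following is a rough sketch, not part of the prompt itself: the function name implied_words_per_minute is hypothetical, words are counted by splitting on whitespace, and deciding what rate counts as comfortable for your audience is left to you.

```python
def implied_words_per_minute(script_text: str, duration_seconds: float) -> float:
    """Return the speaking rate needed to read script_text in duration_seconds.

    Hypothetical helper: counts words by whitespace splitting, which is
    only an approximation of spoken length.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(script_text.split())
    return word_count * 60 / duration_seconds


# Made-up sample script: 10 words to be read in 5 seconds.
sample_script = "Welcome to the new dashboard. Let's take a quick tour."
rate = implied_words_per_minute(sample_script, 5)
print(f"{rate:.0f} words per minute")
```

If the implied rate looks too fast for the accessibility variant, either trim the script or raise the duration target before asking for direction.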

Example Output

A typical output includes three voice direction variants, a line-by-line annotated script, and a review checklist that helps reviewers pick a direction quickly. Teams can then record or synthesize the preferred variant and keep the others for A/B testing.