Source: raw/Stop_the_Generic_AI_Look_Seedance_2.0_Prompt_Engineering_for_Cinematic_Video_HeyGen.md — YouTube tutorial h8e1GZBKm0k by an unidentified HeyGen-channel creator. The walkthrough is structured around HeyGen’s Avatar Shots feature with Seedance 2.0 as the motion model, layered on top of Avatar 5 for A-roll.

A prompt-engineering framework for HeyGen’s Avatar Shots feature that lifts AI video out of the “generic rubbery look.” Thesis: the model is fine — most users prompt like writers, not like directors of photography. The fix is a 5-element prompt structure (subject / action / environment / camera / style) with one camera move per shot, layered into multi-shot and multi-avatar choreography. Pairs Avatar 5 (long-form A-roll) with Seedance 2.0 (cinematic motion + storytelling).

Key Takeaways

Tool stack and roles

  • Avatar 5 — long-form A-roll. The creator’s entire A-roll in this tutorial is Avatar 5 output.
  • Seedance 2.0 — cinematic motion, environments, storytelling.
  • Avatar Shots (the feature under walkthrough) — combines both: digital twin + fully cinematic Seedance-generated scenes. Unlocks full-body motion, multi-shot cinematic sequences, and up to three avatars in one scene.

The 5-element prompt structure (every shot)

| Element | What to specify | Skip-it failure mode |
| --- | --- | --- |
| Subject | who/what is in the shot, specific identifying details (outfit, identity tags) | identity drift across cuts |
| Action | one intentional action: what they’re doing right now | inconsistent character motion |
| Environment | location + lighting (named lighting: golden hour, warm light, etc.) | output falls apart on environment |
| Camera | exactly ONE camera move per shot (slow push-in, dolly, pan, etc.) | over-busy / messy camera work |
| Style | overall aesthetic + constraints (shallow depth of field, film grain, stable framing, consistent identity) | generic AI look |

Hard rule: one camera move per shot. Combining moves degrades output.
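
The one-move rule is easy to check mechanically before spending a generation. A minimal sketch in Python, assuming a hypothetical keyword list (the tutorial names push-in, dolly, and pan; the other terms are common cinematography vocabulary added for illustration) and hypothetical helpers `find_camera_moves` and `check_one_move`. Nothing here is part of HeyGen:

```python
import re

# Hypothetical vocabulary of camera moves; extend it to match your own prompts.
CAMERA_MOVES = ["push-in", "pull-back", "dolly", "pan", "tilt", "orbit", "crane"]


def find_camera_moves(prompt: str) -> list[str]:
    """Return every known camera move mentioned in the prompt, in vocabulary order."""
    text = prompt.lower()
    found = []
    for move in CAMERA_MOVES:
        # Word boundaries keep "pan" from matching inside words like "expands".
        if re.search(rf"\b{re.escape(move)}\b", text):
            found.append(move)
    return found


def check_one_move(prompt: str) -> None:
    """Raise ValueError unless the prompt names exactly one known camera move."""
    moves = find_camera_moves(prompt)
    if len(moves) != 1:
        raise ValueError(f"expected exactly one camera move, found {moves}")


if __name__ == "__main__":
    draft = "A slow camera push-in, cinematic shallow depth of field"
    print(find_camera_moves(draft))  # ['push-in']
    check_one_move(draft)  # passes silently: one move only
```

A keyword match is crude (it cannot tell a described move from a negated one), so treat a failure as a cue to re-read the draft rather than as a verdict.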

Working example — single shot prompt

Professional man in a navy blazer stands in a modern glass office overlooking a city skyline at golden hour. He adjusts his sleeve and looks confidently toward the camera. A slow camera push-in, cinematic shallow depth of field, warm golden light, soft reflections on the glass, subtle film grain, smooth motion, stable framing, natural gestures, consistent identity.

Maps cleanly onto the 5 elements:

  • Subject: professional man in navy blazer
  • Action: adjusts sleeve + looks toward camera
  • Environment: modern glass office overlooking city skyline at golden hour, warm golden light, soft reflections
  • Camera: slow push-in
  • Style: cinematic shallow depth of field, film grain, smooth motion, stable framing, natural gestures, consistent identity
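
To make "fill all five slots" hard to skip, the mapping above can be kept as structured data and rendered to text at the end. A minimal sketch, assuming a hypothetical `ShotPrompt` class; the one-sentence-per-element output order is an illustrative choice, not a format the tutorial prescribes:

```python
from dataclasses import dataclass, fields


@dataclass
class ShotPrompt:
    """One shot, described by the five elements from the framework."""
    subject: str
    action: str
    environment: str
    camera: str
    style: str

    def render(self) -> str:
        # Refuse to render if any of the five slots is blank.
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"empty prompt elements: {missing}")
        parts = [self.subject, self.action, self.environment, self.camera, self.style]
        # One sentence per element, each ending in exactly one period.
        return " ".join(p.strip().rstrip(".") + "." for p in parts)


if __name__ == "__main__":
    shot = ShotPrompt(
        subject="Professional man in a navy blazer",
        action="He adjusts his sleeve and looks confidently toward the camera",
        environment="Modern glass office overlooking a city skyline at golden hour",
        camera="A slow camera push-in",
        style="Cinematic shallow depth of field, subtle film grain, stable framing",
    )
    print(shot.render())
```

Keeping the camera in its own field also makes it obvious when a second move has crept into the draft.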

Pre-shot decisions (before writing the prompt)

  • Pick the avatar.
  • Pick output ratio — vertical or horizontal.
  • Pick resolution — recommended 1080p.
  • Decide duration AFTER writing the prompt — re-read your prompt and pick the duration that matches its pacing. This is a non-obvious tip; don’t lock duration first.

Multi-shot sequences (“think in beats”)

  • Each beat = a discrete shot.
  • Give each beat its own:
    • Time window (timestamps inside the prompt)
    • One camera move
    • Purpose (why this beat exists in the story)
  • You can change eras, outfits, worlds in one generation. That’s “multi-clip prompting” — one prompt, multiple scenes, fully directed.

Example beat structure (paraphrased):

Location: 1800s British royal study.
Subject: avatar wears powdered wig, formal aristocrat outfit, holding a teacup.
Camera: medium shot with slow push-in.
Script: …

Then repeat with a new beat block for each scene in the sequence.
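
Beat blocks with time windows can be assembled programmatically, which also catches gaps and overlaps between beats. A minimal sketch, assuming a hypothetical `Beat` class and a `[0s-4s]` timestamp notation; the tutorial says to put timestamps in the prompt but these notes do not record the exact format, and the second beat in the usage example is invented for illustration. The script line is omitted because it is elided in the notes:

```python
from dataclasses import dataclass


@dataclass
class Beat:
    start_s: int   # window start, in seconds
    end_s: int     # window end, in seconds
    location: str
    subject: str
    camera: str    # exactly one camera move
    purpose: str   # planning note for the writer; not rendered into the prompt


def render_sequence(beats: list[Beat]) -> str:
    """Render beats as timestamped blocks, requiring back-to-back time windows."""
    blocks = []
    prev_end = 0
    for i, b in enumerate(beats, start=1):
        if b.start_s != prev_end or b.end_s <= b.start_s:
            raise ValueError(f"beat {i} has a gap, overlap, or empty time window")
        blocks.append(
            f"[{b.start_s}s-{b.end_s}s] Location: {b.location}. "
            f"Subject: {b.subject}. Camera: {b.camera}."
        )
        prev_end = b.end_s
    return "\n".join(blocks)


if __name__ == "__main__":
    beats = [
        Beat(0, 4, "1800s British royal study",
             "avatar wears powdered wig, formal aristocrat outfit, holding a teacup",
             "medium shot with slow push-in", "establish the historical era"),
        # Hypothetical second beat, not from the tutorial.
        Beat(4, 8, "futuristic city street at night",
             "avatar wears a reflective jacket",
             "slow pan", "contrast with the first era"),
    ]
    print(render_sequence(beats))
```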

Multi-avatar choreography (up to 3 avatars per scene)

  • With several avatars, the prompt becomes blocking + choreography, not just dialogue.
  • Define in the prompt:
    • Each avatar’s position in the frame
    • Each avatar’s movement relative to others
    • Camera support for the movement (how the camera tracks the action)
  • Skipping these turns the scene “random.” Specifying them makes it feel like a production.

Recommended prompt order for multi-avatar scenes:

  1. Describe the desired shot, feel, and look at the top.
  2. Then enumerate each character — position + wardrobe.
  3. Then camera motion.
  4. Then beat-by-beat scene description with character moves + camera moves.
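
The four-part ordering above can be enforced with a small template, along with the three-avatar cap. A minimal sketch, assuming hypothetical names (`AvatarBlocking`, `render_multi_avatar`) and a line-per-item layout chosen for readability; the tutorial fixes the order of the parts, not this exact wording:

```python
from dataclasses import dataclass

MAX_AVATARS = 3  # per-scene limit stated in the tutorial


@dataclass
class AvatarBlocking:
    name: str
    position: str   # where the avatar stands in the frame
    wardrobe: str


def render_multi_avatar(
    look: str,
    avatars: list[AvatarBlocking],
    camera: str,
    beats: list[str],
) -> str:
    """Assemble the prompt in order: look, characters, camera, then beats."""
    if not 1 <= len(avatars) <= MAX_AVATARS:
        raise ValueError(f"need 1 to {MAX_AVATARS} avatars, got {len(avatars)}")
    lines = [look]
    lines += [f"{a.name}: {a.position}, wearing {a.wardrobe}." for a in avatars]
    lines.append(f"Camera: {camera}.")
    lines += [f"Beat {i}: {beat}" for i, beat in enumerate(beats, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    # All scene details below are hypothetical placeholders.
    print(render_multi_avatar(
        look="Warm, cinematic two-person office scene with shallow depth of field.",
        avatars=[
            AvatarBlocking("Avatar A", "frame left, seated at the desk", "a navy blazer"),
            AvatarBlocking("Avatar B", "frame right, standing by the window", "a grey sweater"),
        ],
        camera="slow push-in that follows Avatar B toward the desk",
        beats=[
            "Avatar B walks from the window to the desk.",
            "Avatar A looks up and gestures to the empty chair.",
        ],
    ))
```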

Elements — physical references, locations, products, clothing

Once you start adding assets (locations, backgrounds, clothing, products), prompts get shorter:

  • Reference the asset by name in HeyGen.
  • Still use the 5-element structure, just less verbose because the asset injects the visual identity.
  • This is where continuity across shots becomes reliable.

Example shortened prompt (with elements pre-attached):

The avatar faces the camera in a modern office wearing a pastel outfit while holding a mug, speaking with confident and engaging energy.

Workflow recommendation (creator’s stack)

  • Avatar 5 for main message: founder videos, education, product explainers.
  • Avatar Shots + Seedance for: storytelling, transitions, visual depth.
  • Layer Avatar Shots inside an Avatar 5 A-roll project rather than building entire videos in Seedance.

Open Questions

  • Creator + channel identity — first-party HeyGen account or affiliate creator? Not clear from transcript.
  • Seedance 2.0 official documentation — referenced as a model name; primary docs URL not in transcript.
  • Cost model — generations are HeyGen-plan-gated; tier pricing not in walkthrough.
  • Aspect-ratio + resolution combinations — what’s supported per HeyGen plan tier.
  • Max sequence duration per multi-clip prompt — not stated; relevant for full-production workflows.

Try It

  1. Write your first 5-element prompt — pick a professional or personal scenario, fill all five slots explicitly. Don’t skip any.
  2. One camera move only — if your draft prompt has two camera moves, cut one. Always.
  3. Set ratio + resolution + avatar BEFORE writing the prompt; decide duration AFTER. Re-read your prompt and let pacing dictate length.
  4. Try a multi-shot sequence next — write 3 beats, each with timestamps, one camera move each. Generate as a single multi-clip prompt.
  5. Add multi-avatar choreography last — block positions + movements + camera tracking explicitly. Don’t introduce more than 3 avatars per scene.
  6. Move from prompts to elements — upload your branded product / outfit / location into HeyGen, then reference by name in prompts. Continuity becomes reliable.
  7. Stack with Avatar 5 — use Avatar Shots for storytelling beats inside an Avatar 5 A-roll project, not as the whole video.