Source: raw/Stop_the_Generic_AI_Look_Seedance_2.0_Prompt_Engineering_for_Cinematic_Video_HeyGen.md — YouTube tutorial h8e1GZBKm0k by an unidentified HeyGen-channel creator. The walkthrough is structured around HeyGen’s Avatar Shots feature with Seedance 2.0 as the motion model, layered on top of Avatar 5 for A-roll.
A prompt-engineering framework for HeyGen’s Avatar Shots feature that lifts AI video out of the “generic rubbery look.” Thesis: the model is fine — most users prompt like writers, not like directors of photography. The fix is a 5-element prompt structure (subject / action / environment / camera / style) with one camera move per shot, layered into multi-shot and multi-avatar choreography. Pairs Avatar 5 (long-form A-roll) with Seedance 2.0 (cinematic motion + storytelling).
Key Takeaways
Tool stack and roles
- Avatar 5 — long-form A-roll. The creator’s entire A-roll in this tutorial is Avatar 5 output.
- Seedance 2.0 — cinematic motion, environments, storytelling.
- Avatar Shots (the feature under walkthrough) — combines both: digital twin + fully cinematic Seedance-generated scenes. Unlocks full-body motion, multi-shot cinematic sequences, and up to three avatars in one scene.
The 5-element prompt structure (every shot)
| Element | What to specify | Skip-it failure mode |
|---|---|---|
| Subject | who/what is in the shot, specific identifying details (outfit, identity tags) | identity drift across cuts |
| Action | one intentional action — what they’re doing right now | inconsistent character motion |
| Environment | location + lighting (name the lighting: golden hour, warm light, etc.) | setting and lighting render incoherently |
| Camera | exactly ONE camera move per shot (slow push-in, dolly, pan, etc.) | over-busy / messy camera work |
| Style | overall aesthetic + constraints (shallow depth of field, film grain, stable framing, consistent identity) | generic AI look |
Hard rule: one camera move per shot. Combining moves degrades output.
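The one-move rule is easy to check mechanically before generating. The Python sketch below flags prompts that name more than one camera move. The `CAMERA_MOVES` vocabulary and both function names are assumptions made for illustration; they are not part of HeyGen or Seedance, and plain keyword matching will miss moves that are phrased differently.

```python
import re

# Hypothetical vocabulary of camera-move phrases; extend it for your own prompts.
CAMERA_MOVES = [
    "push-in", "pull-out", "dolly", "pan",
    "tilt", "orbit", "tracking shot", "zoom",
]


def find_camera_moves(prompt: str) -> list[str]:
    """Return the camera-move phrases found in the prompt, in vocabulary order."""
    text = prompt.lower()
    found = []
    for move in CAMERA_MOVES:
        # Word boundaries keep "pan" from matching inside words like "company".
        if re.search(rf"\b{re.escape(move)}\b", text):
            found.append(move)
    return found


def has_single_camera_move(prompt: str) -> bool:
    """True when exactly one known camera move is named in the prompt."""
    return len(find_camera_moves(prompt)) == 1
```

If `has_single_camera_move` returns `False` for a draft, `find_camera_moves` shows which moves were detected so one of them can be cut.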
Working example — single shot prompt
Professional man in a navy blazer stands in a modern glass office overlooking a city skyline at golden hour. He adjusts his sleeve and looks confidently toward the camera. A slow camera push-in, cinematic shallow depth of field, warm golden light, soft reflections on the glass, subtle film grain, smooth motion, stable framing, natural gestures, consistent identity.
Maps cleanly onto the 5 elements:
- Subject: professional man in navy blazer
- Action: adjusts sleeve + looks toward camera
- Environment: modern glass office overlooking city skyline at golden hour, warm golden light, soft reflections
- Camera: slow push-in
- Style: cinematic shallow depth of field, film grain, smooth motion, stable framing, natural gestures, consistent identity
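The mapping above can be treated as a fill-in template. The sketch below is one way to keep the five slots explicit and refuse to render a prompt when any slot is empty. `ShotPrompt`, `missing`, and `render` are hypothetical names for this note, not a HeyGen API; the output is plain text to paste into the prompt box.

```python
from dataclasses import dataclass, fields


@dataclass
class ShotPrompt:
    """One shot, described with the five elements from the table."""
    subject: str
    action: str
    environment: str
    camera: str
    style: str

    def missing(self) -> list[str]:
        """Names of the elements that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Join the five elements into one prompt, in table order."""
        empty = self.missing()
        if empty:
            raise ValueError(f"missing elements: {', '.join(empty)}")
        parts = [getattr(self, f.name).strip().rstrip(".") for f in fields(self)]
        return ". ".join(parts) + "."


# The working example from the walkthrough, split into its five slots.
shot = ShotPrompt(
    subject="Professional man in a navy blazer",
    action="He adjusts his sleeve and looks confidently toward the camera",
    environment="Modern glass office overlooking a city skyline at golden hour, "
                "warm golden light, soft reflections on the glass",
    camera="A slow camera push-in",
    style="Cinematic shallow depth of field, subtle film grain, smooth motion, "
          "stable framing, natural gestures, consistent identity",
)
print(shot.render())
```

Keeping the slots as separate fields makes a skipped element fail loudly instead of silently producing a weaker prompt.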
Pre-shot decisions (before writing the prompt)
- Pick the avatar.
- Pick output ratio — vertical or horizontal.
- Pick resolution — recommended 1080p.
- Decide duration AFTER writing the prompt — re-read your prompt and pick the duration that matches its pacing. This is a non-obvious tip; don’t lock duration first.
Multi-shot sequences (“think in beats”)
- Each beat = a discrete shot.
- Give each beat its own:
  - Time window (timestamps inside the prompt)
  - One camera move
  - Purpose (why this beat exists in the story)
- You can change eras, outfits, worlds in one generation. That’s “multi-clip prompting” — one prompt, multiple scenes, fully directed.
Example beat structure (paraphrased):
Location: 1800s British royal study. Subject: avatar wears powdered wig, formal aristocrat outfit, holding a teacup. Camera: medium shot with slow push-in. Script: …
Then repeat with a new beat block for each scene in the sequence.
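A multi-beat prompt can be assembled from a list of beat records. The sketch below is a minimal illustration under stated assumptions: the `[0s-5s]` timestamp format, the `Beat` fields, and `render_beats` are invented here, since the walkthrough does not specify an exact syntax. The `purpose` field is kept as a planning note and is not written into the prompt, and the script line is left out because the source elides it.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    start_s: int    # start of the time window, in seconds
    end_s: int      # end of the time window, in seconds
    location: str
    subject: str
    camera: str     # exactly one camera move
    purpose: str    # planning note only; not rendered into the prompt


def render_beats(beats: list[Beat]) -> str:
    """Render one block per beat, checking that time windows do not overlap."""
    blocks = []
    previous_end = 0
    for i, beat in enumerate(beats, start=1):
        if beat.start_s < previous_end or beat.end_s <= beat.start_s:
            raise ValueError(f"beat {i} has an invalid or overlapping time window")
        blocks.append(
            f"Beat {i} [{beat.start_s}s-{beat.end_s}s]. "
            f"Location: {beat.location}. "
            f"Subject: {beat.subject}. "
            f"Camera: {beat.camera}."
        )
        previous_end = beat.end_s
    return "\n".join(blocks)


sequence = [
    # First beat follows the paraphrased example from the walkthrough.
    Beat(0, 5, "1800s British royal study",
         "avatar wears powdered wig, formal aristocrat outfit, holding a teacup",
         "medium shot with slow push-in", "establish the historical era"),
    # Second beat is invented for illustration only.
    Beat(5, 10, "present-day modern office",
         "avatar wears a navy blazer",
         "slow dolly", "contrast with the present"),
]
print(render_beats(sequence))
```

The durations used here are placeholders; pick the real windows after re-reading the prompt for pacing.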
Multi-avatar choreography (up to 3 avatars per scene)
- Becomes blocking + choreography, not just dialogue.
- Define in the prompt:
  - Each avatar’s position in the frame
  - Each avatar’s movement relative to others
  - Camera support for the movement (how camera tracks the action)
- Skipping these turns the scene “random.” Specifying them makes it feel like a production.
For multi-avatar prompts, write the sections in this order:
- First, describe the desired shot, feel, and look at the top.
- Then enumerate each character: position + wardrobe.
- Then the camera motion.
- Then a beat-by-beat scene description with character moves + camera moves.
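The section order described above (look, characters, camera, beats) can be enforced with a small assembler. This is a hedged sketch: `Character`, `render_scene`, and the line formats are hypothetical, and only the three-avatar limit comes from the walkthrough.

```python
from dataclasses import dataclass

MAX_AVATARS = 3  # per-scene limit stated in the walkthrough


@dataclass
class Character:
    name: str       # label used to refer to the avatar in the beats
    position: str   # where the avatar stands in the frame
    wardrobe: str


def render_scene(look: str, characters: list[Character],
                 camera: str, beats: list[str]) -> str:
    """Assemble a multi-avatar prompt: look, then cast, then camera, then beats."""
    if not 1 <= len(characters) <= MAX_AVATARS:
        raise ValueError(f"expected 1 to {MAX_AVATARS} avatars, got {len(characters)}")
    lines = [look]
    lines += [f"{c.name}: {c.position}, wearing {c.wardrobe}." for c in characters]
    lines.append(f"Camera: {camera}.")
    lines += [f"Beat {i}: {beat}" for i, beat in enumerate(beats, start=1)]
    return "\n".join(lines)


# Illustrative cast and beats; none of these details come from the source video.
cast = [
    Character("Avatar A", "left of frame", "a navy blazer"),
    Character("Avatar B", "right of frame", "a pastel outfit"),
]
print(render_scene(
    "Warm, cinematic two-shot in a modern office.",
    cast,
    "slow push-in that follows Avatar A",
    ["Avatar A walks toward Avatar B.", "Both turn to face the camera."],
))
```

Writing each beat as a sentence that names who moves relative to whom keeps the blocking explicit, which is the point of the checklist.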
Elements — physical references, locations, products, clothing
Once you start adding assets (locations, backgrounds, clothing, products), prompts get shorter:
- Reference the asset by name in HeyGen.
- Still use the 5-element structure, just less verbose because the asset injects the visual identity.
- This is where continuity across shots becomes reliable.
Example shortened prompt (with elements pre-attached):
The avatar faces the camera in a modern office wearing a pastel outfit while holding a mug, speaking with confident and engaging energy.
Workflow recommendation (creator’s stack)
- Avatar 5 for main message: founder videos, education, product explainers.
- Avatar Shots + Seedance for: storytelling, transitions, visual depth.
- Layer Avatar Shots inside an Avatar 5 A-roll project rather than building entire videos in Seedance.
Open Questions
- Creator + channel identity — first-party HeyGen account or affiliate creator? Not clear from transcript.
- Seedance 2.0 official documentation — referenced as a model name; primary docs URL not in transcript.
- Cost model — generations are HeyGen-plan-gated; tier pricing not in walkthrough.
- Aspect-ratio + resolution combinations — what’s supported per HeyGen plan tier.
- Max sequence duration per multi-clip prompt — not stated; relevant for full-production workflows.
Related
- _index — topic root
- heygen-skills-bundle — HeyGen’s Claude-skill bundle
- heygen-avatar-v — HeyGen Avatar V (if exists)
- heygen-instant-highlights-v2 — companion video by same channel on auto-clipping
- higgsfield-supercomputer — alternative cinematic AI video surface
- video-use — adjacent video-AI tooling
Try It
- Write your first 5-element prompt — pick a professional or personal scenario, fill all five slots explicitly. Don’t skip any.
- One camera move only — if your draft prompt has two camera moves, cut one. Always.
- Set ratio + resolution + avatar BEFORE writing the prompt; decide duration AFTER. Re-read your prompt and let pacing dictate length.
- Try a multi-shot sequence next — write 3 beats, each with timestamps, one camera move each. Generate as a single multi-clip prompt.
- Add multi-avatar choreography last — block positions + movements + camera tracking explicitly. Don’t introduce more than 3 avatars per scene.
- Move from prompts to elements — upload your branded product / outfit / location into HeyGen, then reference by name in prompts. Continuity becomes reliable.
- Stack with Avatar 5 — use Avatar Shots for storytelling beats inside an Avatar 5 A-roll project, not as the whole video.