Question 1

What is Happy Horse 1.0?

Accepted Answer

Happy Horse 1.0 is an open-source AI video model from Alibaba's ATH Innovation Division. It is a 15B-parameter unified transformer that handles text, image, video, and audio in one pass. It produces native 1080p clips of 5-12 seconds in six aspect ratios, with joint audio-video synthesis and 7-language lip-sync.

Question 2

What makes Happy Horse different from Sora, Veo, or Kling?

Accepted Answer

Three things: (1) it natively generates multi-shot sequences, meaning multiple coherent cuts in a single generation with characters and audio persisting across cuts; (2) audio is produced in the same forward pass as the video, with no separate sound model; and (3) lip-sync works in 7 languages (English, Mandarin, Cantonese, Japanese, Korean, German, French).

Question 3

What's the official prompt structure?

Accepted Answer

Subject → Action → Environment → Style/Composition → Camera Motion → Ambiance/Audio. The model uses this exact order to allocate attention, so keep the sections in sequence. Reordering them or skipping the camera section produces flat, undirected output.
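
If you assemble prompts in a script, the section order above can be enforced in code. The following is a minimal sketch, not part of Happy Horse or the generator: the `ShotPrompt` class, its field names, and the comma separator are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Order taken from the answer above; the names themselves are hypothetical.
SECTION_ORDER = ("subject", "action", "environment", "style", "camera", "audio")


@dataclass
class ShotPrompt:
    subject: str
    action: str
    environment: str
    style: str
    camera: str
    audio: str

    def render(self) -> str:
        # Normalize each section and drop trailing punctuation before joining.
        values = [getattr(self, name).strip().rstrip(".,") for name in SECTION_ORDER]
        # Refuse to build a prompt with an empty section (for example, no camera motion).
        missing = [name for name, value in zip(SECTION_ORDER, values) if not value]
        if missing:
            raise ValueError("empty sections: " + ", ".join(missing))
        # Joining with commas is an assumption; the source specifies only the order.
        return ", ".join(values) + "."


prompt = ShotPrompt(
    subject="A red fox",
    action="leaps over a log",
    environment="snowy forest",
    style="cinematic 35mm",
    camera="slow tracking shot",
    audio="crunching snow",
).render()
print(prompt)
```

Because the fields are named, callers can pass them in any order and the rendered text still follows the documented sequence.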

Question 4

Why do shorter prompts work better?

Accepted Answer

Happy Horse uses a unified attention transformer where every token competes for rendering capacity. Long prompts dilute attention across many tokens and cause subject drift. The official guidance is roughly 20 words per single shot.
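
A quick length check can catch overlong prompts before you spend a generation on them. This sketch counts whitespace-separated words as a rough proxy; the model's real tokenizer is not described here, and the `slack` allowance and function names are assumptions.

```python
WORD_BUDGET = 20  # "roughly 20 words per single shot", per the answer above


def word_count(prompt: str) -> int:
    # Whitespace-separated words; only an approximation of model tokens.
    return len(prompt.split())


def over_budget(prompt: str, budget: int = WORD_BUDGET, slack: int = 5) -> bool:
    # The slack value is an assumption that loosely models "roughly".
    return word_count(prompt) > budget + slack


draft = "A red fox leaps over a log in a snowy forest, slow tracking shot, crunching snow"
print(word_count(draft), over_budget(draft))
```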

Question 5

How do I use multi-shot mode?

Accepted Answer

Select Multi-Shot Sequence and structure the action as timestamped blocks: "[0-3s] establishing wide of …", "[3-7s] cut to medium close-up of …", "[7-12s] pull-back reveals …". Keep the character description consistent across blocks so identity persists across cuts.
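
The timestamped blocks can also be generated from a single character description, which keeps the wording identical in every shot. This is a sketch under stated assumptions: the function name, the `{character}` placeholder convention, and the one-block-per-line layout are hypothetical, and the 12-second cap comes from the 5-12 second clip length given in the first answer.

```python
def multi_shot_prompt(character, shots, max_seconds=12):
    """Build timestamped blocks from (start_s, end_s, template) tuples.

    Each template may contain a {character} placeholder, so the same
    description is repeated verbatim in every block.
    """
    lines = []
    expected_start = 0
    for start, end, template in shots:
        # Blocks must be contiguous, non-empty, and fit inside the clip length.
        if start != expected_start or end <= start or end > max_seconds:
            raise ValueError(f"bad time range [{start}-{end}s]")
        lines.append(f"[{start}-{end}s] " + template.format(character=character))
        expected_start = end
    return "\n".join(lines)


sequence = multi_shot_prompt(
    "a woman in a yellow raincoat",
    [
        (0, 3, "establishing wide of {character} on a pier"),
        (3, 7, "cut to medium close-up of {character}"),
        (7, 12, "pull-back reveals {character} alone at sea"),
    ],
)
print(sequence)
```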

Question 6

How does the lip-sync work?

Accepted Answer

Add dialogue in quotation marks in the audio field and pick a lip-sync language. The generator notes the language inline (e.g., "She says in Japanese: ...") so that Happy Horse's lip-sync head locks to the right phoneme set.
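
If you write the audio field by hand, the same inline pattern can be reproduced with a small helper. This is a sketch only: `dialogue_line` is a hypothetical name, the language list is copied from the answer to Question 2, and the exact wording the generator emits may differ.

```python
# The seven lip-sync languages listed in the answer to Question 2.
LIP_SYNC_LANGUAGES = (
    "English", "Mandarin", "Cantonese", "Japanese", "Korean", "German", "French",
)


def dialogue_line(speaker: str, language: str, text: str) -> str:
    # Reject languages outside the documented list instead of guessing.
    if language not in LIP_SYNC_LANGUAGES:
        raise ValueError(f"unsupported lip-sync language: {language}")
    # Dialogue goes in quotation marks, with the language noted inline.
    return f'{speaker} says in {language}: "{text}"'


line = dialogue_line("She", "Japanese", "Ohayou gozaimasu")
print(line)
```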

Question 7

What should I put in an Image-to-Video prompt?

Accepted Answer

Only what the image cannot show: motion, sound, expression changes, time passing. Describing the appearance or setting again wastes tokens and competes with the image conditioning.

Question 8

Is this tool free?

Accepted Answer

Yes. You get 3 free generations per day. For unlimited access, sign up for a Promptslove membership.

Happy Horse 1.0 Prompt Generator

Happy Horse 1.0 Tips

Frequently Asked Questions