Wan - AI Video Generator

Three specialized video generation models by Alibaba: character reference videos, image animation with synced sound, and realistic character animation.

Wan 2.6

Create Video with Character Reference
Generates character videos driven by visual references
  • UPLOAD REFERENCE (VIDEO 1)* — video upload, required
  • UPLOAD REFERENCE (VIDEO 2) — optional second video reference
  • PROMPT — up to 1000 characters
  • ASPECT RATIO — Portrait or Landscape only

WAN 2.5

Animate image with synced sound
Animates a static image matched to audio
  • REFERENCE IMAGE* — image upload, required
  • PROMPT — up to 3000 characters
  • DURATION — 5 sec or 10 sec dropdown

Wan 2.2 Animate

Create realistic character videos
Generates character animation from a character image, with an optional reference video
  • UPLOAD CHARACTER IMAGE* — image upload, required
  • UPLOAD REFERENCE VIDEO — optional video upload
  • No prompt field — visual input only

Model Details & Workflows

Wan 2.6 — Two-Video Reference Workflow

Wan 2.6 accepts up to two reference videos (the first is required, the second is optional) and generates a new video based on visual patterns extracted from them. Supplying both references allows for more nuanced character motion synthesis. You may also provide a text prompt (up to 1000 characters) to guide the generation. Aspect ratio is limited to Portrait or Landscape; Square output is not supported.

WAN 2.5 — Image-to-Video with Audio Sync

WAN 2.5 takes a static reference image and generates an animated video. You can provide a prompt (up to 3000 characters) and select output duration (5 or 10 seconds). This model is optimized for creating smooth animations from single-frame inputs with audio-synchronized motion where applicable.

Wan 2.2 Animate — Visual-Input-Only Generation

Wan 2.2 Animate works purely from visual inputs. Upload a character image (required) and optionally a reference video. This model does not use text prompts — all generation parameters come from image analysis and optional motion reference. This approach can produce highly consistent character animations without prompt interpretation overhead.
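
The input rules described for the three models can be written down as a small lookup table with a pre-submission check. This is a minimal sketch in Python, not an official client: the field names (`reference_video_1`, `character_image`, and so on), the `ModelSpec` class, and the `validate` function are all hypothetical and only mirror the limits stated on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    required: tuple                # input names that must be present
    prompt_limit: Optional[int]    # max prompt characters; None means no prompt field
    aspect_ratios: tuple = ()      # allowed values; empty means not selectable
    durations: tuple = ()          # allowed seconds; empty means not selectable


# Hypothetical field names; the limits are the ones stated on this page.
SPECS = {
    "Wan 2.6": ModelSpec(("reference_video_1",), 1000,
                         aspect_ratios=("Portrait", "Landscape")),
    "WAN 2.5": ModelSpec(("reference_image",), 3000, durations=(5, 10)),
    "Wan 2.2 Animate": ModelSpec(("character_image",), None),
}


def validate(model, inputs):
    """Return a list of problems; an empty list means the inputs fit the stated limits."""
    spec = SPECS[model]
    problems = []
    for name in spec.required:
        if not inputs.get(name):
            problems.append(f"missing required input: {name}")
    prompt = inputs.get("prompt")
    if prompt:
        if spec.prompt_limit is None:
            problems.append("this model has no prompt field")
        elif len(prompt) > spec.prompt_limit:
            problems.append(f"prompt exceeds {spec.prompt_limit} characters")
    ratio = inputs.get("aspect_ratio")
    if ratio is not None and spec.aspect_ratios and ratio not in spec.aspect_ratios:
        problems.append(f"unsupported aspect ratio: {ratio}")
    duration = inputs.get("duration")
    if duration is not None and spec.durations and duration not in spec.durations:
        problems.append(f"unsupported duration: {duration}")
    return problems


print(validate("Wan 2.6", {"reference_video_1": "walk.mp4", "aspect_ratio": "Square"}))
```

Keeping the limits in one table makes it easy to see at a glance which model has a prompt field, which offers a duration choice, and which restricts aspect ratio.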

Input Guidelines

Reference Videos for Wan 2.6

Upload clear, well-lit video clips with a single subject against a simple background. Avoid fast cuts, scene changes, or multiple people. Keep reference videos under 10 seconds for best motion extraction quality. The reference person's proportions do not need to match your target character exactly, but extreme differences may produce unnatural results. Gross body motion (walking, gesturing, turning) transfers well; subtle expressions and micro-gestures often get lost.
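
The reference clip guidelines above can be turned into a quick pre-upload checklist. The sketch below is illustrative only: `check_reference_clip` is a hypothetical helper, and it assumes you already know the clip's duration, subject count, and whether it contains cuts (it does not inspect the video file itself).

```python
def check_reference_clip(duration_seconds, subject_count=1, has_scene_cuts=False):
    """Return warnings for a reference clip, based on the guidelines on this page.

    All arguments are supplied by the caller; nothing is read from the video.
    """
    warnings = []
    if duration_seconds >= 10:
        warnings.append("trim the clip to under 10 seconds")
    if subject_count != 1:
        warnings.append("use a clip with a single subject")
    if has_scene_cuts:
        warnings.append("avoid fast cuts and scene changes")
    return warnings


# A 12-second clip with two people and a cut triggers all three warnings.
for warning in check_reference_clip(12, subject_count=2, has_scene_cuts=True):
    print(warning)
```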

Character Images for Wan 2.2 Animate and WAN 2.5

Start with clear, well-lit still images of your character. Transparent or solid-color backgrounds reduce generation artifacts. Wan analyzes pose, proportions, and visual features, so consistency in these elements produces the most reliable results. Full-body and upper-body poses both work well.

Aspect Ratio and Duration

Wan 2.6 supports Portrait (vertical) and Landscape (horizontal) aspect ratios only — Square is not available. WAN 2.5 can generate videos in either 5-second or 10-second durations. Choose based on your animation complexity and output requirements.
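
If you plan output around target dimensions, a small helper can map them to the Wan 2.6 options and flag square targets early. This is a sketch under assumptions: the function names are hypothetical, and the exact pixel ratios behind Portrait and Landscape are not stated on this page, so the code only compares width with height.

```python
def aspect_label(width, height):
    """Classify dimensions as Portrait, Landscape, or Square by comparing sides."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width == height:
        return "Square"
    return "Landscape" if width > height else "Portrait"


def wan26_aspect_ratio(width, height):
    """Return the Wan 2.6 option for the target size, rejecting square targets."""
    label = aspect_label(width, height)
    if label == "Square":
        raise ValueError("Wan 2.6 does not offer Square output")
    return label


print(wan26_aspect_ratio(1920, 1080))
print(wan26_aspect_ratio(1080, 1920))
```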

Prompt Guide (Wan 2.6 & WAN 2.5)

"The character walks forward confidently, arms swinging naturally, looking straight ahead"

Clear directional movement with natural secondary motion. Both models handle walking animations well when direction and gait are described simply.

"The character raises their right hand and waves slowly, slight smile, head tilts to the left"

Specific body part instructions with emotional cues. Including which hand and direction improves accuracy for upper-body animations.

"The character dances with energetic hip-hop movements, arms and legs in full motion"

High-energy motion prompt. Both models can generate dance movements, but the results tend to be generic interpretations. For precise choreography, use a reference video instead.

Honest Limitations

Frequently Asked Questions

What is Wan and who made it?
Wan is an AI video generation model developed by Alibaba Cloud (the Wan Team), focused on character animation and video synthesis. Pixazo provides access to Wan through its platform layer. Pixazo does not own, train, or modify the underlying model — it serves as an interface and compute layer.
What is the difference between Wan 2.6, WAN 2.5, and Wan 2.2 Animate?
Wan 2.6 generates videos from one required and one optional reference video, plus an optional text prompt. WAN 2.5 animates a static image with text guidance and lets you choose the duration (5 or 10 seconds). Wan 2.2 Animate uses only a character image and an optional reference video, with no text prompt. Each model has different strengths: use Wan 2.6 for multi-reference video-driven motion, WAN 2.5 for image animation with detailed prompts, and Wan 2.2 Animate for pure visual consistency without prompt interpretation.
Can I use Square aspect ratio with Wan 2.6?
No. Wan 2.6 supports only Portrait and Landscape aspect ratios. If you need square video output, consider WAN 2.5 or Wan 2.2 Animate as alternatives, though these models may have different output ratio options.
Does Wan 2.2 Animate accept text prompts?
No. Wan 2.2 Animate is a visual-input-only model. It generates animations based on the character image and optional reference video alone. There is no prompt field, so text cannot be used to steer the output.
What are the main limitations across all Wan models?
Key limitations include: character proportions can shift during animation, hands and fingers are frequently malformed, reference video extraction works best with clean single-subject clips, environmental interaction is not realistic, and multi-character scenes often have artifacts. Additionally, Wan 2.6 is limited to Portrait/Landscape aspect ratios, and Wan 2.2 Animate does not accept text prompts.
Which Wan model should I use?
Choose based on your input and control preferences. Use Wan 2.6 if you want to guide animation with reference videos and optional text. Use WAN 2.5 if you prefer to animate a single image with detailed text descriptions and a 5- or 10-second duration. Use Wan 2.2 Animate if you want pure visual consistency without text interpretation, relying only on a character image and an optional reference video.
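
The selection advice above can be condensed into a short decision function. This is a simplified sketch, not a rule from Alibaba or Pixazo: `choose_model` and its three yes/no arguments are hypothetical, and real choices may also depend on output quality, duration, or aspect ratio. When both a character image and a reference video are available, the sketch lets the presence of a text prompt decide.

```python
def choose_model(has_reference_video, has_character_image, wants_text_prompt):
    """Suggest a Wan model from the inputs you have; return None if none fits."""
    if has_reference_video and not has_character_image:
        # Video-driven generation, with or without a prompt.
        return "Wan 2.6"
    if has_character_image and wants_text_prompt:
        # Image animation guided by a detailed prompt.
        return "WAN 2.5"
    if has_character_image:
        # Visual-only generation; a reference video is optional.
        return "Wan 2.2 Animate"
    # Every model on this page requires at least one visual upload.
    return None


print(choose_model(has_reference_video=True, has_character_image=False, wants_text_prompt=True))
```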