What is Wan 3.0?
Wan 3.0 is the anticipated next release in Alibaba Tongyi Lab's open-source Wan video series, expected in 2026 under Apache 2.0. It is not yet officially released. Based on the Wan 2.x trajectory, it is expected to generate up to 30-second 4K clips with in-pass synchronized audio from text, image, or video input.
Is Wan 3.0 free?
The Wan series weights are free and open-source on Hugging Face under Apache 2.0, so you can self-host if you have the GPU capacity. wan3pro.video is a paid hosted service that runs the latest open Wan model for you, and will run Wan 3.0 once it is released, so you don't need a local GPU. Paid plans start at $15 one-time (Mini Pack).
Is Wan 3.0 open source?
The Wan video series ships under Apache 2.0 with public weights on Hugging Face (Wan-AI), and Wan 3.0 is expected to follow the same open license. You can download, modify, fine-tune, and redistribute these models. wan3pro.video is an independent third-party service operating on top of the open models — we're not affiliated with Alibaba.
How is Wan 3.0 different from Wan 2.7?
Wan 3.0 is expected to move from 1080p to native 4K, from 24 fps to 60 fps, and from 16-second to 30-second clips. It is also expected to add in-pass audio generation, 6-shot Identity Lock for character consistency across cuts, and video and audio as input modes alongside text and image.
How does Wan 3.0 compare to Sora 2, Veo 3, or Kling 3.0?
Wan 3.0 is the only one of these models expected to ship with open Apache 2.0 weights. On expected specs it leads on max resolution (4K native vs 1080p), max duration (30s), and multi-shot identity tracking. See the comparison table above for the full breakdown.
Can I use Wan 3.0 videos commercially?
Yes. Apache 2.0 permits commercial use of the model itself, and wan3pro.video includes a commercial license for all generated output on every paid plan, with no watermark.
Does Wan 3.0 generate audio?
Yes, based on the expected specs. The model is expected to generate dialogue, sound effects, and ambient audio conditioned on the visual scene in the same forward pass as the video, so what you hear matches what you see with no separate sync step.
What input types does Wan 3.0 accept?
Wan 3.0 is expected to accept text prompts, reference images, reference video clips, and reference audio. You can combine up to 12 reference assets in one generation and tag each one by type and number in your prompt.
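As a rough illustration of the 12-asset limit and the type-and-number tagging described above, the sketch below validates a list of reference assets and numbers each one per type. The `@image1`-style tag format, the `tag_assets` function, and the file names are hypothetical placeholders; the actual tag syntax is not specified in this FAQ.

```python
from collections import Counter

MAX_REFERENCE_ASSETS = 12  # limit stated in this FAQ
ALLOWED_TYPES = ("image", "video", "audio")


def tag_assets(assets):
    """Return (tag, path) pairs such as ('@image1', 'hero.png').

    `assets` is a list of (asset_type, path) tuples. The '@<type><n>'
    tag format is an assumption for illustration, not documented syntax.
    """
    if len(assets) > MAX_REFERENCE_ASSETS:
        raise ValueError(
            f"at most {MAX_REFERENCE_ASSETS} reference assets allowed, got {len(assets)}"
        )
    counts = Counter()
    tagged = []
    for asset_type, path in assets:
        if asset_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported asset type: {asset_type!r}")
        counts[asset_type] += 1  # number assets separately within each type
        tagged.append((f"@{asset_type}{counts[asset_type]}", path))
    return tagged


if __name__ == "__main__":
    refs = [("image", "hero.png"), ("video", "walk.mp4"), ("image", "street.png")]
    for tag, path in tag_assets(refs):
        print(tag, path)
```

Numbering per type rather than globally keeps a tag stable when an asset of a different type is added or removed.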
What resolutions and aspect ratios are supported?
Resolutions: 720p, 1080p, and 4K. Aspect ratios: 16:9, 9:16, 1:1, and 4:3. Higher resolutions consume more credits per second.
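To make the resolution and aspect-ratio combinations concrete, here is a small sketch that converts a resolution label and an aspect ratio into pixel dimensions. It assumes the label names the shorter side of the frame (720, 1080, or 2160 pixels), which is the usual convention for 16:9 video; the exact output sizes for other ratios are not stated in this FAQ, and `frame_size` is a hypothetical helper.

```python
# Assumption: the resolution label refers to the shorter side in pixels.
SHORT_SIDE = {"720p": 720, "1080p": 1080, "4K": 2160}


def frame_size(resolution, aspect_ratio):
    """Return (width, height) in pixels for a label like '1080p' and a ratio like '16:9'."""
    short = SHORT_SIDE[resolution]
    w, h = (int(part) for part in aspect_ratio.split(":"))
    # Scale the longer side from the shorter one using the ratio.
    long_side = short * max(w, h) // min(w, h)
    # Landscape and square ratios put the long side on the width; portrait on the height.
    return (long_side, short) if w >= h else (short, long_side)


if __name__ == "__main__":
    for ratio in ("16:9", "9:16", "1:1", "4:3"):
        print("1080p", ratio, frame_size("1080p", ratio))
```

Under this assumption a 4K 16:9 frame has nine times the pixels of a 720p one, which is consistent with higher resolutions costing more credits per second.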
How long does generation take on wan3pro.video?
A typical 5–10 second clip takes 30–60 seconds to generate on the Fast or Standard tier. 4K Pro renders take longer, up to a few minutes. Generation time scales with resolution and duration.