NVIDIA unveils SANA-WM: a model for 60-second 720p videos on a single GPU

NVIDIA has introduced SANA-WM, an open-source world model for video generation with camera control. The model generates 60-second videos at 720p with precise six-degrees-of-freedom (6-DoF) camera control and can run on a single RTX 5090.
How SANA-WM Works
SANA-WM is a world model, meaning a learned simulator of the physical world. Rather than simply stitching images into video, the model learns how a scene responds to actions: if you rotate the camera left, objects in the frame shift in a way that is consistent with that motion. The model has 2.6 billion parameters and was trained on 64 H100 GPUs.
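To make "objects move correctly when the camera rotates" concrete, here is a minimal geometric sketch. It is not SANA-WM code and says nothing about how the model works internally; it only shows the pinhole-camera relationship that a world model has to reproduce implicitly. The function name `project_point`, the focal length of 1000 pixels, and the axis convention (x right, y down, z forward) are all assumptions chosen for illustration; only the 1280×720 frame size comes from the article.

```python
import math


def project_point(point, yaw_left_deg, focal_px=1000.0, width=1280, height=720):
    """Project a 3D world point into a pinhole camera placed at the origin.

    Axes (assumed convention): x right, y down, z forward.
    yaw_left_deg turns the camera toward its own left.
    Returns pixel coordinates (u, v), or None if the point is behind the camera.
    """
    theta = math.radians(yaw_left_deg)
    x, y, z = point

    # Camera basis vectors in world coordinates after turning left by theta.
    right = (math.cos(theta), 0.0, math.sin(theta))
    forward = (-math.sin(theta), 0.0, math.cos(theta))

    # Coordinates of the point in the rotated camera frame.
    x_cam = x * right[0] + z * right[2]
    z_cam = x * forward[0] + z * forward[2]
    if z_cam <= 0:
        return None  # the point is behind the camera

    # Perspective division, then shift to the image centre.
    u = focal_px * x_cam / z_cam + width / 2
    v = focal_px * y / z_cam + height / 2
    return (u, v)


# A point 5 units straight ahead sits at the image centre...
centre = project_point((0.0, 0.0, 5.0), yaw_left_deg=0)
# ...and drifts to the right of the frame once the camera turns left.
after_turn = project_point((0.0, 0.0, 5.0), yaw_left_deg=10)
```

With the camera facing forward the point lands at the centre of the frame; after a 10-degree left turn its horizontal pixel coordinate increases, so it appears further to the right. A video model with camera control has to produce frames that respect this kind of consistency for every visible object at once.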
Key capabilities of SANA-WM:
- Generating 60-second videos in 720p (1280×720)
- Precise 6-DoF camera control (position and orientation)
- Running on a single RTX 5090 without cloud services
- Open-source code for experimentation and adaptation
- Scaling from research experiments to production use
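As a rough illustration of what a 6-DoF camera input looks like, the sketch below stores one pose as three position values plus three orientation angles and builds a per-frame camera path for a 60-second clip. Everything here is hypothetical: `CameraPose`, `interpolate_poses`, the angle sign convention, and the 16 fps frame rate are placeholders, since the article does not describe SANA-WM's actual input format or frame rate. Linear interpolation of Euler angles is also a simplification that is only reasonable for small rotations.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class CameraPose:
    """One 6-DoF pose: position (x, y, z) plus orientation (yaw, pitch, roll) in degrees."""
    x: float
    y: float
    z: float
    yaw: float
    pitch: float
    roll: float


def interpolate_poses(start, end, duration_s=60, fps=16):
    """Return one pose per frame, moving linearly from start to end.

    fps=16 is an assumed placeholder, not a documented SANA-WM value.
    """
    n_frames = int(duration_s * fps)
    if n_frames < 2:
        raise ValueError("need at least two frames to interpolate")

    poses = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        values = [a + t * (b - a) for a, b in zip(astuple(start), astuple(end))]
        poses.append(CameraPose(*values))
    return poses


# Example path: dolly 2 units forward while panning 30 degrees
# (negative yaw is taken to mean "left" in this sketch).
start_pose = CameraPose(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
end_pose = CameraPose(0.0, 0.0, 2.0, -30.0, 0.0, 0.0)
path = interpolate_poses(start_pose, end_pose)
```

Under these assumptions the path holds 960 poses, one per frame. A real pipeline would more likely use rotation matrices or quaternions and whatever trajectory format the released code expects.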
Why This Matters for Video Creators
Before SANA-WM, video generators either ran as expensive cloud services or required specialized hardware. SANA-WM changes this: it runs locally, quickly, and without a subscription. A studio can generate video scenarios, frame-by-frame visualizations, and drafts in minutes, with no dependence on the cloud.
For directors and animators, this means being able to test visual storyboard ideas quickly. For 3D artists, it is a way to automate camera transitions in complex scenes. For marketers, it is a quick route to a promotional video with the desired motion dynamics and viewing angles.
What This Means
SANA-WM marks a shift from cloud-based video generators to local tools. Just as GPUs once made 3D rendering accessible on every computer, world models are starting to make video generation accessible. For the industry, this accelerates AI adoption in creative workflows, not because models suddenly became smarter, but because they can now run anywhere.