Google Introduces Gemini Omni Flash — Model for Creating Video from Text and Images

Q: What is the source?

Originally published on DeepMind Blog. Hamidun News processes and adapts the material with AI.

Q: When was it published?

2026-05-21. Reading time: 3 min.


Hamidun News Editorial


Google introduced Gemini Omni Flash, a new artificial intelligence model that creates high-quality video from any combination of text, images, audio recordings, and other videos. It is a first major step toward end-to-end video generation, in which AI acts as director, screenwriter, and editor at once.

Physics and Logic in One Window

Gemini Omni Flash processes multiple types of input data in parallel and converts them into video content. The model stands out for its physics simulation: it models gravity, kinetic energy, fluid dynamics, and object interaction in three-dimensional space. As a result, movement looks natural: objects fall correctly, liquids flow plausibly, fabric folds realistically, and hair sways in the air.

Previously, such details required manual work from 3D artists and simulation specialists. Now AI handles it on the fly, processing your idea in real time. For video production, this means that directors can experiment with ideas much faster.

The main innovation is that the model reasons about what should happen next. It doesn't simply generate a mechanical sequence of frames, as early video generators did. Instead, Omni interprets the context and draws on Google's built-in world knowledge: who is where, what will logically happen in a given scene, and how characters should move relative to each other and the environment.

Editing Through Conversation

The second major feature is editing video through natural language. You don't open an editing project, search the library for the right clip, or apply effects manually. You simply write: "add more people in the background," "change the character's dress color from blue to red," or "make the scene more sunny and cheerful."

The model understands such requests and edits the video without intermediate export, re-encoding, or re-import. Motion, character consistency, and lighting are preserved, and only the requested elements change. This saves hours of routine work.

All videos are automatically marked with a hidden SynthID watermark, invisible to the human eye but readable by machines. This matters for verification: a detection tool can show that a video was created by AI rather than shot on camera, which helps counter misinformation and expose deepfakes.

Where It Launched and What's Next

Gemini Omni Flash is available in:

Gemini app: web and mobile application
Google Flow: AI filmmaking tool
YouTube Shorts: free for all users
Google AI Plus, Pro, and Ultra subscriptions

Developers and enterprises will get access via API later. Google notes that it is still working on the safe deployment of audio editing and voice synthesis features. Extra caution is warranted here, since voice is a particularly sensitive attribute of personal identity.

What This Means

Video becomes as quick to create as a text document or an email. Previously, professional video required specialized editing skills, expensive software such as Adobe Premiere, and hours of routine work in editing tools. Now a creative idea becomes a text prompt, and a polished video is ready in minutes.

This will dramatically accelerate content creation for marketing, education, entertainment, and internal company communications. Small businesses will be able to compete with large ones on the quality of their video materials. Standards for mandatory labeling of AI-generated video are likely to emerge soon, and those who adapt to video generation first will gain a competitive advantage.
