Google is rolling out an update to its Gemini app that introduces a new way to control its AI video generation model. In the latest version, users can upload multiple reference images for a single video prompt. The system then combines these images with the text input to generate both video and audio, giving users more precise control over the look and sound of the final output.
Google previously tested this feature in Flow, its more extensive AI video platform. Flow also lets users extend existing clips and stitch together multiple scenes, and it offers slightly higher video generation quotas than the Gemini app. According to Google, Veo 3.1, which began rolling out in mid-October, delivers more realistic textures, improved input fidelity, and better audio quality than its predecessor, Veo 3.0.