Runway Launches Its First World Model and Adds Native Audio to Its Latest Video Model

2025-12-12

As the market for AI-powered image and video generation grows increasingly crowded, Runway has entered the fray alongside both startups and tech giants by unveiling its first world model. Dubbed GWM-1, the model predicts frames sequentially, which, according to the company, allows it to simulate physical laws and how the world evolves over time.

A world model is an AI system that learns an internal simulation of how the world operates, enabling it to reason, plan, and act without needing real-world training for every possible scenario.
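To make that definition concrete, here is a toy Python sketch of the core idea: an agent plans against a learned transition model by rolling out imagined futures instead of acting in the real world. This is purely illustrative and is not Runway's architecture; every name in it is invented.

```python
# Conceptual sketch of the world-model idea: learn a transition
# function, then plan by "imagining" rollouts rather than acting in
# the real environment. All names are illustrative, not a real API.

import random

class ToyWorldModel:
    """Predicts the next state from (state, action) in a trivial
    1-D world where the agent steps left or right toward a goal."""

    def predict(self, state: float, action: int) -> float:
        # A learned model would approximate real dynamics; this toy
        # version just applies the action deterministically.
        return state + action  # action is -1 or +1

    def rollout(self, state: float, actions: list[int]) -> float:
        # Simulate an entire action sequence without touching the real world.
        for a in actions:
            state = self.predict(state, a)
        return state

def plan(model: ToyWorldModel, state: float, goal: float, horizon: int = 5) -> list[int]:
    """Pick the imagined action sequence that ends closest to the goal."""
    best, best_dist = None, float("inf")
    for _ in range(200):  # random-shooting planner over imagined futures
        candidate = [random.choice((-1, 1)) for _ in range(horizon)]
        dist = abs(model.rollout(state, candidate) - goal)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best

print(plan(ToyWorldModel(), state=0.0, goal=3.0))
```

The payoff of this pattern is that the expensive or dangerous part, real-world trial and error, is replaced by cheap simulated rollouts, which is exactly the use case Runway pitches for training agents.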

Earlier this month, Runway launched its Gen 4.5 video model, which surpassed offerings from Google and OpenAI on the Video Arena leaderboard. The company claims its GWM-1 world model is more “general-purpose” than Google’s Genie 3 and other rivals, and positions it as a tool for generating simulations that train agents across diverse fields such as robotics and life sciences.

“To build a world model, we first needed to create an exceptional video model,” said Anastasis Germanidis, CTO of Runway, during a live presentation. “We believe the right path to building world models is teaching them to predict pixels directly—that’s the best way to achieve general-purpose simulation. With sufficient scale and the right data, you can develop a model that truly understands how the world works.”

Runway has released specialized variants of its new world model under three distinct categories: GWM-Worlds, GWM-Robotics, and GWM-Avatars.

GWM-Worlds is an application for creating interactive environments. Users define a scene with a text prompt or image reference, and as they navigate the space, the model generates the world on the fly with an understanding of its geometry, physics, and lighting. The simulation runs at 24 frames per second in 720p resolution. While useful for gaming, Runway emphasizes that Worlds is particularly well-suited for training agents to navigate and behave in physically realistic settings.
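Runway has not published a GWM-Worlds API, but a client loop for such a system could plausibly resemble the hypothetical Python sketch below. The WorldSession class, its methods, and the pacing logic are all invented for illustration; only the 24 fps and 720p figures come from the announcement.

```python
# Hypothetical client loop for an interactive generated world.
# `WorldSession` and `step` are invented names, NOT a Runway API.

import time
from dataclasses import dataclass

FPS = 24                  # simulation rate stated by Runway
FRAME_BUDGET = 1.0 / FPS  # seconds available per frame

@dataclass
class Frame:
    width: int = 1280   # 720p frame
    height: int = 720
    pixels: bytes = b""

class WorldSession:
    """Stand-in for a streaming world-model session."""

    def __init__(self, prompt: str):
        # A text prompt (or image reference) seeds the scene.
        self.prompt = prompt

    def step(self, camera_move: str) -> Frame:
        # A real system would return the next generated frame,
        # consistent with the scene's geometry, physics, and lighting.
        return Frame()

session = WorldSession("a rainy neon alley at night")
for move in ["forward", "forward", "turn_left"]:
    start = time.monotonic()
    frame = session.step(move)                    # generate the next view
    elapsed = time.monotonic() - start
    time.sleep(max(0.0, FRAME_BUDGET - elapsed))  # hold 24 fps pacing
```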

With GWM-Robotics, the company aims to let robotics teams generate synthetic training data enriched with new parameters, such as varying weather conditions or dynamic obstacles. Runway notes this approach can also help identify when and how robots might deviate from their policies or instructions across different scenarios.
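The underlying idea resembles what the robotics community calls domain randomization: sweep scenario parameters and observe policy behavior across the full grid. The sketch below shows what such a parameter sweep could look like; the field names and values are assumptions for illustration, not part of any Runway SDK.

```python
# Hypothetical parameter sweep (domain randomization) for generating
# synthetic robotics scenarios. Field names are invented, not Runway's.

import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioConfig:
    weather: str    # environmental condition to render
    obstacles: int  # number of dynamic obstacles in the scene
    lighting: str   # lighting variation

WEATHER = ["clear", "rain", "fog"]
OBSTACLES = [0, 3, 10]
LIGHTING = ["day", "dusk"]

# Enumerate every combination so a policy can be evaluated (and its
# deviations logged) across the whole grid of scenarios.
scenarios = [
    ScenarioConfig(w, o, l)
    for w, o, l in itertools.product(WEATHER, OBSTACLES, LIGHTING)
]
print(f"{len(scenarios)} synthetic scenarios")  # 3 * 3 * 2 = 18
```

Running a policy against every point in such a grid is one plausible way to surface the deviation cases Runway describes, since failures tend to cluster around specific parameter combinations.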

Under GWM-Avatars, Runway is developing highly realistic digital avatars designed to simulate human behavior. This places it in competition with companies like D-ID, Synthesia, Soul Machines, and even Google, all of which are advancing lifelike avatars for applications in communication, training, and beyond.

Technically, Worlds, Robotics, and Avatars operate as separate models today, but Runway plans to eventually unify them into a single, cohesive system.

In addition to launching its new world model, Runway has also updated its recently released Gen 4.5 foundation model. The latest enhancements introduce native audio support and the ability to generate long-form, multi-shot videos. According to the company, users can now produce one-minute videos featuring consistent characters, natural dialogue, ambient sound, and complex cinematography captured from multiple angles. The update also enables editing of existing audio tracks—including adding new dialogue—and supports multi-shot video editing of any duration.

These Gen 4.5 upgrades bring Runway closer to rivals like Kling, whose comprehensive video suite—launched earlier this month—also emphasizes native audio and multi-shot storytelling. Collectively, these developments signal that video generation models are maturing from experimental prototypes into production-ready creative tools. The updated Gen 4.5 model is now available to all users on paid subscription plans.

Runway announced that GWM-Robotics will be accessible via an SDK. The company added that it is already in active discussions with multiple robotics firms and enterprise clients to deploy both GWM-Robotics and GWM-Avatars.