At today's Consumer Electronics Show, NVIDIA CEO Jensen Huang officially unveiled the company's new Rubin computing architecture, which he described as the latest breakthrough in AI hardware. Already in production, the architecture is expected to power a new wave of AI systems in the second half of this year.
"Vera Rubin is designed to address a fundamental challenge we're facing: the computational demands of AI are increasing dramatically," Huang told the audience. "Today, I can confirm that Vera Rubin has entered full-scale production."
First announced in 2024, the Rubin architecture represents the newest milestone in NVIDIA's relentless hardware innovation cycle, one that has transformed the company into the world's most valuable firm. Rubin will succeed the Blackwell architecture, which itself replaced Hopper and Ada Lovelace in NVIDIA's rapid progression.
Rubin chips are already slated for deployment across nearly all major cloud service providers, including high-profile collaborations with Anthropic, OpenAI, and Amazon Web Services. The Rubin-powered systems will also drive HPE’s Blue Lion supercomputer and the upcoming Doudna supercomputer at Lawrence Berkeley National Laboratory.
Named after astronomer Vera Florence Cooper Rubin, the architecture consists of six discrete chips engineered to operate in concert. At its core lies the Rubin GPU, while enhancements in BlueField and NVLink technologies tackle growing bottlenecks in storage and interconnectivity. The platform also introduces a new Vera CPU, purpose-built for agent-based inference workloads.
Dion Harris, Senior Director of NVIDIA’s AI Infrastructure Solutions, highlighted the advantages of the new storage architecture, emphasizing the rising memory requirements in modern AI systems.
"As you enable new types of workflows like agent AI or long-running tasks, there's significant pressure on your KV cache—the memory system AI models use to compress inputs," Harris explained in a phone interview. "To address this, we've introduced a new external storage layer that connects directly to compute devices, enabling more efficient scaling of memory pools."
As anticipated, the new architecture delivers substantial improvements in speed and energy efficiency. According to NVIDIA’s internal benchmarks, Rubin achieves up to 3.5 times faster performance than Blackwell in model training tasks and five times faster in inference, reaching up to 50 petaflops. The platform also supports eight times more inference operations per watt compared to previous generations.
The rollout of Rubin arrives amid an intense global race to build out AI infrastructure, with AI labs and cloud providers competing aggressively for access to NVIDIA’s chips and the data centers needed to power them. During NVIDIA’s October 2025 earnings call, Huang estimated that between $3 trillion and $4 trillion will be invested in AI infrastructure over the next five years.