NVIDIA is gearing up for a major advancement in AI hardware. On September 8, the company announced that its Vera Rubin microarchitecture is undergoing tape-out, with a planned release in 2026. According to Dave Salvator, NVIDIA’s Director of Accelerated Computing Product, a new variant called Rubin CPX will focus on AI workloads requiring massive context windows.
“The Vera Rubin platform marks another leap forward in AI computing — introducing the next-generation Rubin GPU and a new category of processor called CPX,” said NVIDIA founder and CEO Jensen Huang in a press release. “Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once.”
This announcement came just before NVIDIA released its latest MLPerf inference results on September 9.
NVIDIA Unveils New Hardware and Architecture
Certain AI applications involve context windows exceeding one million tokens, such as software development across codebases of more than 100,000 lines, or high-resolution video generation. For these use cases, NVIDIA plans to launch the Vera Rubin NVL144 CPX rack-scale system by the end of 2026.
A variant of the Vera Rubin NVL144 is specifically optimized for applications requiring extended context windows. The CPX-equipped rack delivers 8 exaflops of NVFP4 AI performance, with each Rubin CPX GPU contributing 30 petaflops of NVFP4 compute for context processing — 3x the attention acceleration of NVIDIA’s GB300 NVL72 system. Each CPX GPU also carries 128GB of GDDR7 memory and four NVENC encoders plus four NVDEC decoders for video generation, while the rack as a whole includes 100TB of fast memory.
“It unlocks new high-end applications such as intelligent coding and video generation,” said Shar Narasimhan, Director of AI and Data Center GPU Product Marketing at NVIDIA, during a pre-briefing.
Gigascale Reference Design for Data Centers to Guide AI Factory Construction
The Vera Rubin NVL144 CPX can be viewed as part of a broader AI factory. On September 9, NVIDIA also announced plans to provide gigascale reference designs for large-scale data centers.
“This requires innovation and co-design with a broad set of infrastructure partners,” Narasimhan noted.
Narasimhan added that NVIDIA is entering a new era of data center design from a computing perspective, working closely with infrastructure companies. The company will offer reference designs covering architecture, engineering, and construction; design, simulation, and operations; power generation and storage; as well as mechanical, electrical, and plumbing systems.
Blackwell GPU Sets Records in MLPerf Benchmarks
MLPerf is a benchmarking initiative organized by the MLCommons consortium, used by companies to evaluate the performance of hardware and software on generative AI workloads.
NVIDIA’s Blackwell GPU achieved a new performance record on Llama 3.1 405B Interactive using a technique called disaggregated serving, which separates the compute-heavy prefill (context-processing) phase of inference from the memory-bound decode (token-generation) phase and runs them on different GPUs. The result surpassed NVIDIA’s earlier Blackwell baseline submission, showing that the approach can extract more performance from the same hardware platform.
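The idea behind disaggregated serving can be illustrated with a toy sketch. This is not NVIDIA’s implementation — all class and function names below are illustrative — but it shows the core pattern: prefill and decode become separate workers that can be scaled and batched independently, with a key/value cache handed between them.

```python
# Conceptual sketch of disaggregated serving (illustrative only, not
# NVIDIA's implementation): the prefill (context) phase and the decode
# (generation) phase of LLM inference run on separate worker pools, so
# each pool can be sized, batched, and placed on hardware independently.

from dataclasses import dataclass


@dataclass
class KVCache:
    """Stand-in for the key/value cache handed from prefill to decode."""
    prompt_tokens: list


class PrefillWorker:
    """Compute-bound phase: processes the full prompt once."""

    def run(self, prompt: str) -> KVCache:
        tokens = prompt.split()  # toy "tokenization"
        return KVCache(prompt_tokens=tokens)


class DecodeWorker:
    """Memory-bandwidth-bound phase: generates tokens one at a time."""

    def run(self, cache: KVCache, max_new_tokens: int) -> list:
        out = []
        for i in range(max_new_tokens):
            # A real decoder would attend over the KV cache at each step;
            # here we just echo the context length to keep the sketch runnable.
            out.append(f"tok{i}_ctx{len(cache.prompt_tokens)}")
        return out


# Routing: prefill on one pool, decode on another.
prefill, decode = PrefillWorker(), DecodeWorker()
cache = prefill.run("summarize this very long document please")
generated = decode.run(cache, max_new_tokens=3)
print(generated)
```

In a real deployment, the payoff is that the two phases have different bottlenecks — prefill is compute-bound while decode is memory-bandwidth-bound — so dedicating different GPU pools to each avoids one phase idling resources sized for the other.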
“You can extract more performance from the same platform,” Salvator explained during the pre-briefing. “This level of performance can generate additional revenue for organizations that have already deployed solutions.”
Meanwhile, Microsoft showcased experimental results from accelerating AI workloads with an analog optical computer.