Nvidia has unveiled Nemotron 3, an open family of models and datasets designed to power next-generation AI agent operations across industries.
The new model lineup includes three variants: Nano, Super, and Ultra. Built with a breakthrough architecture leveraging a Mixture-of-Latent-Experts approach, these models reduce memory requirements, optimize computation, and deliver exceptional intelligence across different scales.
According to the company, the smallest model, Nemotron 3 Nano, achieves four times the throughput of its predecessor while maintaining high performance.
"Open innovation is the foundation of progress in AI," said Jensen Huang, founder and CEO of Nvidia. "With Nemotron, we’re transforming advanced AI into an open platform, giving developers the transparency and efficiency they need to build agent systems at scale."
The release addresses growing industry demand for AI agents capable of reasoning and tool orchestration. The era of single-model chatbots is giving way to orchestrated AI applications that coordinate multiple models to enable proactive, intelligent agents.
Nemotron 3 positions Nvidia more directly within the expanding landscape of open and semi-open inference models, as competition shifts from raw parameter counts toward orchestration capabilities, reliability, and agent-centric performance.
Leveraging its novel architecture, Nvidia aims to lower deployment costs while delivering fast, reliable inference and scalable intelligence.
The Nano variant, now available, is a compact 30B-parameter model with 3B active parameters, optimized for targeted, efficient tasks. The mid-tier Super model features 100B total parameters and 10B active ones, designed to support multi-agent applications requiring moderate intelligence. At the top end, Ultra is a powerful 500B-parameter model with 50B active parameters, built for complex AI workflows and advanced agent orchestration.
Nemotron Super excels in applications where multiple AI agents collaborate with low latency to complete intricate tasks. In contrast, Nemotron Ultra serves as a central "brain" for demanding AI workflows that require deep analysis and long-term strategic planning.
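For readers wondering how a mixture-of-experts design keeps the active parameter count so far below the total, the sketch below shows top-k expert routing in its simplest form. The layer width, expert count, and k value are illustrative placeholders rather than Nemotron's actual configuration, and the code ignores batching, load balancing, and the latent-expert refinements Nvidia describes.

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through only the top-k of many experts.

    x: (d,) token activation; experts: list of (d, d) weight matrices;
    gate_w: (n_experts, d) router weights. All shapes are illustrative.
    """
    logits = gate_w @ x                      # router score for every expert
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only k expert matrices are multiplied; the rest stay idle for this token,
    # which is why "active" parameters are a small fraction of the total.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 64, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)  # (64,) -- computed using 2 of the 16 experts
```

In a model like the ones described above, the same principle means each token touches only a small slice of the total weights, which is where the 30B/3B, 100B/10B, and 500B/50B total-versus-active splits come from.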
Nano is currently available, Super is set for release in Q1 2026, and Ultra is expected in the first half of 2026.
Using Nvidia’s highly efficient 4-bit NVFP4 training format, developers can deploy these models on fewer GPUs, significantly reducing memory footprint. The format also enables model distillation without substantial loss in accuracy or reasoning capability.
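As a rough, back-of-the-envelope illustration of why a 4-bit weight format matters, the sketch below estimates raw weight storage for the three variants at 16-bit versus 4-bit precision, using only the total parameter counts quoted above. It deliberately ignores activations, KV cache, optimizer state, and per-block scaling metadata, so real deployment footprints will differ.

```python
# Rough weight-memory estimate: parameters * bits / 8, reported in gigabytes.
# Parameter counts are the totals quoted above; everything else is ignored.
GB = 1e9

def weight_gb(params, bits):
    return params * bits / 8 / GB

for name, params in [("Nano", 30e9), ("Super", 100e9), ("Ultra", 500e9)]:
    fp16 = weight_gb(params, 16)
    fp4 = weight_gb(params, 4)
    print(f"{name}: ~{fp16:.0f} GB at 16-bit vs ~{fp4:.0f} GB at 4-bit")
# Nano: ~60 vs ~15 GB; Super: ~200 vs ~50 GB; Ultra: ~1000 vs ~250 GB
```

Even as a crude estimate, the 4x reduction in weight storage is what lets a given model fit on fewer GPUs.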
Early adopters of the Nemotron family include Accenture, CrowdStrike Holdings Inc., Oracle Cloud Infrastructure, Palantir Technologies Inc., Perplexity AI Inc., ServiceNow Inc., Siemens, and Zoom Communications Inc.
"Perplexity is built on the idea that human curiosity will be amplified by accurate AI embedded in excellent tools—like AI assistants," said Aravind Srinivas, CEO of Perplexity. "With our agent router, we can route workloads to the best fine-tuned open models, such as Nemotron 3 Ultra."
With this launch, Nvidia is betting heavily on ecosystem-driven adoption to create mutually beneficial relationships. While the models don't require Nvidia hardware to run, they are heavily optimized for the company's own GPUs and software platforms, alongside which they were designed.
The company also emphasized its commitment to maintaining a predictable and reliable model release roadmap. Amid the current rapid pace of AI releases—often monthly—Nvidia aims to provide developers with clearer expectations around model maturity and long-term support for each open-source family.
New Tools and Data for Customizing AI Agents
In addition to the models, Nvidia introduced a suite of training datasets and state-of-the-art reinforcement learning libraries tailored for building specialized AI agents.
Reinforcement learning trains AI models by exposing them to real-world problems, instructions, and tasks. Unlike supervised learning, which relies on predefined question-answer pairs, reinforcement learning places models in dynamic environments, rewarding successful actions and penalizing failures. The result is a reasoning system that learns rules and boundaries through active feedback, capable of operating under complex and changing conditions.
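To make that contrast concrete, here is a minimal, self-contained reward-loop sketch in Python. It uses a toy environment and a simple value-estimate learner rather than anything from Nvidia's stack; the point is only that the learner never sees a labeled answer and instead receives a scalar reward after each action.

```python
import random

# Toy environment: the "task" is to discover which action the environment rewards.
# This stands in for the dynamic environments described above; it is not
# NeMo Gym's API, just an illustration of learning from reward signals.
class ToyEnv:
    def __init__(self, n_actions=4):
        self.n_actions = n_actions
        self.correct = random.randrange(n_actions)

    def step(self, action):
        return 1.0 if action == self.correct else -0.1  # reward, not a label

env = ToyEnv()
values = [0.0] * env.n_actions   # running estimate of each action's reward
lr, eps = 0.1, 0.2

for _ in range(500):
    if random.random() < eps:                     # explore occasionally
        action = random.randrange(env.n_actions)
    else:                                         # otherwise exploit current estimates
        action = max(range(env.n_actions), key=lambda a: values[a])
    reward = env.step(action)
    values[action] += lr * (reward - values[action])  # nudge estimate toward observed reward

print("learned best action:", max(range(env.n_actions), key=lambda a: values[a]),
      "| rewarded action:", env.correct)
```

Post-training an agent with reinforcement learning follows the same loop at vastly larger scale, with tool calls and multi-step tasks in place of the toy actions.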
The new dataset comprises 3 trillion newly generated Nemotron tokens from the pre-training, post-training, and reinforcement learning phases, offering rich examples of reasoning, coding, and multi-step workflows. These serve as a foundation for developing high-performance, domain-specific agents. Additionally, Nvidia released the Nemotron Agent Safety Dataset, a real-world telemetry collection that lets teams evaluate and strengthen the safety of complex, multi-agent systems.
Built on this foundation, Nvidia launched NeMo Gym and NeMo RL as open-source libraries: NeMo Gym provides training environments, and NeMo RL supplies the base framework for post-training Nemotron models. Developers can integrate both to accelerate development of RL training loops while maintaining interoperability with existing training infrastructure.
The open release of Nemotron models and datasets further solidifies Nvidia’s influence within the broader AI ecosystem.