Google is shifting from building AI chips solely for internal use to selling them on the open market, directly challenging NVIDIA's dominance. A new analysis suggests that the mere existence of Google's latest Tensor Processing Unit (TPU) as a purchasable product is already driving down prices for AI computing power.
For years, Google used its TPUs almost exclusively for its own AI models. That strategy changed with the introduction of the seventh-generation TPU, "Ironwood" (TPUv7). According to chip analysts at SemiAnalysis, Google is now actively selling these chips to third parties, positioning itself as a direct competitor to NVIDIA.
Anthropic tops the list of early customers. The analysis indicates the startup’s deal involves roughly one million TPUs, split between direct hardware purchases and cloud rentals via Google Cloud Platform (GCP). The infrastructure required to run this hardware reportedly consumes over one gigawatt of power.
The market is already feeling the ripple effects. SemiAnalysis reports that OpenAI secured approximately a 30% discount on its NVIDIA GPU fleet simply by credibly threatening to switch to TPUs or other alternatives.
Analysts Dylan Patel, Myron Xie, and Daniel Nishball quipped: "The more you buy [TPUs], the more you save [on NVIDIA GPU capex]", a playful twist on NVIDIA CEO Jensen Huang's famous line, "The more you buy, the more you save."
TPUs Prove Capable of Training Top-Tier AI Models
Usage data shows TPUs are no longer second-tier alternatives. Two of the most powerful recently launched AI models, Google's Gemini 3 Pro and Anthropic's Claude Opus 4.5, were trained primarily on non-NVIDIA silicon: Gemini 3 entirely on Google TPUs, and Claude on a mix of TPUs and Amazon's Trainium chips.
Technically, according to SemiAnalysis, the TPUv7 "Ironwood" nearly matches NVIDIA's Blackwell-generation GPUs in theoretical compute performance (FLOPs) and memory bandwidth. But the real game-changer is pricing.
For Google itself, the total cost of ownership (TCO) per chip is roughly 44% lower than that of comparable NVIDIA GB200 systems. Even external customers like Anthropic—who pay a premium—could see per-unit effective compute costs 30% to 50% lower than NVIDIA’s offerings, based on analyst modeling.
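As a rough illustration of how a TCO gap translates into effective compute cost, here is a minimal back-of-envelope sketch. All figures (hourly TCO, per-chip FLOPs, utilization) are hypothetical placeholders chosen only to land in the range the analysts describe; they are not SemiAnalysis data or vendor prices:

```python
def effective_cost_per_exaflop_hour(tco_per_chip_hour, flops_per_chip, utilization):
    """Cost to deliver one EFLOP-hour of *useful* (utilized) compute."""
    useful_exaflops = flops_per_chip * utilization / 1e18
    return tco_per_chip_hour / useful_exaflops

# Hypothetical inputs (assumptions for illustration only):
gpu_cost = effective_cost_per_exaflop_hour(
    tco_per_chip_hour=10.0,      # assumed GPU system TCO per chip-hour
    flops_per_chip=2.5e15,       # ~2.5 PFLOPs peak, assumed
    utilization=0.40)
tpu_cost = effective_cost_per_exaflop_hour(
    tco_per_chip_hour=5.6,       # ~44% lower TCO, per the analysis
    flops_per_chip=2.3e15,       # slightly lower peak FLOPs, assumed
    utilization=0.40)

savings = 1 - tpu_cost / gpu_cost
print(f"Effective-cost advantage: {savings:.0%}")
```

With these placeholder numbers the advantage comes out near 39%, inside the 30-50% band the modeling describes; the point is that a lower per-chip TCO dominates even when peak FLOPs are slightly lower.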
This advantage is particularly compelling for teams skilled in software optimization. Google’s system can interconnect up to 9,216 chips into a single dense network domain—a stark contrast to traditional NVIDIA setups, which typically cluster only 64 to 72 chips tightly together—making large-scale AI training runs easier to distribute.
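The practical effect of domain size can be sketched with simple arithmetic: the fewer tightly-coupled domains a training job must span, the less traffic crosses slower inter-domain links. A minimal sketch using only the domain sizes cited above (the job sizes are arbitrary examples, not real deployments):

```python
import math

# Largest set of chips with fast, dense interconnect, per the figures in the text:
TPU_DOMAIN = 9216   # chips in one TPU network domain
GPU_DOMAIN = 72     # GPUs in one NVL72-style NVIDIA rack

def domains_needed(total_chips, domain_size):
    """How many tightly-coupled domains a job of this size must span."""
    return math.ceil(total_chips / domain_size)

for total in (8_192, 32_768, 131_072):
    print(f"{total:>7} chips -> "
          f"{domains_needed(total, TPU_DOMAIN):>4} TPU domains vs "
          f"{domains_needed(total, GPU_DOMAIN):>5} GPU domains")
```

An 8,192-chip run fits in a single TPU domain but spans over a hundred 72-GPU racks, which is why large training jobs become easier to distribute when the dense domain is bigger.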
Software Upgrades Aim to Break CUDA Lock-In
Software has long been the biggest barrier to TPU adoption, with NVIDIA’s CUDA platform entrenched as the industry standard. Google is now investing heavily to overcome this hurdle. The company is developing native support for the popular PyTorch framework and integrating with inference libraries like vLLM.
The goal is to make TPUs a viable alternative without forcing developers to rebuild their entire toolchain. However, the core of the TPU software stack—the XLA compiler—remains proprietary. SemiAnalysis views this as a missed opportunity, arguing that open-sourcing it could accelerate broader community adoption.
To deploy such massive silicon capacity, Google is leaning on unconventional financing models. It is partnering with "neocloud" providers like Fluidstack and even former crypto miners such as TeraWulf. In these arrangements, Google often acts as a financial backstop: if an operator defaults, Google guarantees the rental payments. This approach enables rapid conversion of existing crypto-mining data centers into AI facilities.
NVIDIA’s Next-Gen Chip Could Erase Google’s Price Edge
Under pressure from Google’s success, NVIDIA is preparing a technical counteroffensive. Its next-generation “Vera Rubin” chip, expected in 2026 or 2027, will feature aggressive design choices like HBM4 memory and ultra-high bandwidth interconnects.
Google’s planned response, the TPUv8, adopts a dual-track strategy, according to SemiAnalysis. The company is developing two variants: one co-designed with longtime partner Broadcom (codenamed “Sunfish”) and another with MediaTek (codenamed “Zebrafish”). Yet the designs appear conservative. Analysts note project delays and an architecture that avoids the cutting-edge approaches seen in rivals—such as TSMC’s 2nm process or HBM4 adoption.
The stakes are high for Google. If NVIDIA successfully leverages the performance gains of Rubin, the TPU's current cost advantage could vanish. SemiAnalysis warns that NVIDIA's Rubin systems, particularly the "Kyber" rack generation, may prove more economical than Google's own TPUv8 for internal workloads.
“Google has thrown down the gauntlet—now NVIDIA must execute to remain the lion at the top of the food chain,” SemiAnalysis concludes. If the market leader flawlessly executes its roadmap, it can maintain dominance. But any misstep in performance or delays in the Rubin timeline could seriously threaten its leadership position.