Google’s seventh-generation TPUs are technically superior to NVIDIA’s current offerings, according to Chris Lattner, CEO of Modular and creator of LLVM. Lattner argues that Google’s hardware advantage is held back only by its closed ecosystem and lack of developer tools.
NVIDIA’s dominance, Lattner contends, rests on CUDA - a 20-year-old programming model that creates artificial lock-in. While Amazon’s Trainium and Inferentia chips gain traction with AI labs like Anthropic, Modular is building software to break hardware silos, enabling developers to switch platforms without doubling engineering teams.
"CUDA is a legacy system. It was brilliant 20 years ago, but it’s not designed for the generative AI era."
- Chris Lattner, This Week in AI
Jensen Huang pushes back. On the Dwarkesh Podcast, he dismissed raw hardware competition, arguing that NVIDIA’s true advantage lies in pre-funding supply chain bottlenecks years ahead of demand. By aligning TSMC and packaging suppliers long before shortages hit, NVIDIA turns physical constraints into a moat that mimics cash flow.
Huang also warns that special-purpose chips like TPUs risk obsolescence as algorithms evolve faster than silicon. A rigid ASIC optimized for today’s Transformers could become useless if models shift to hybrid state-space models (SSMs). NVIDIA’s programmable stack, co-designed from Blackwell to NVLink to CUDA, allows 50x efficiency gains - impossible under Moore’s Law alone.
"If the algorithm changes, your ASIC is a paperweight."
- Jensen Huang, Dwarkesh Podcast
Still, Google’s scale is undeniable. Its seventh-gen TPUs match NVIDIA in training throughput, and its internal workloads run at efficiency levels no external vendor can verify. But Huang notes that Google doesn’t participate in MLPerf or InferenceMAX benchmarks - making real-world TCO comparisons speculative.
The deeper battle is strategic. NVIDIA avoids competing with its customers, instead investing $30 billion to backstop labs like Anthropic and Neo-clouds like CoreWeave. This ensures CUDA remains the default. For now, ubiquity wins - even if the underlying tech is aging.

