Google’s seventh-generation TPUs are technically superior to NVIDIA’s current offerings, according to Chris Lattner, CEO of Modular and creator of LLVM. Lattner argues that Google’s hardware advantage is held back only by its closed ecosystem and lack of developer tools.
NVIDIA’s dominance, Lattner contends, rests on CUDA - a 20-year-old programming model that creates artificial lock-in. While Amazon’s Trainium and Inferentia chips gain traction with AI labs like Anthropic, Modular is building software to break hardware silos, enabling developers to switch platforms without doubling engineering teams.
"CUDA is a legacy system. It was brilliant 20 years ago, but it’s not designed for the generative AI era."
- Chris Lattner, This Week in AI
Jensen Huang pushes back. On the Dwarkesh Podcast, he dismissed raw hardware competition, arguing that NVIDIA’s true advantage lies in pre-funding supply chain bottlenecks years ahead of demand. By aligning TSMC and packaging suppliers long before shortages hit, NVIDIA turns physical constraints into a moat that mimics cash flow.
Huang also warns that special-purpose chips like TPUs risk obsolescence as algorithms evolve faster than silicon. A rigid ASIC optimized for today’s Transformers could become useless if models shift to hybrid state-space models (SSMs). NVIDIA’s programmable stack, co-designed from Blackwell to NVLink to CUDA, allows 50x efficiency gains - impossible under Moore’s Law alone.
"If the algorithm changes, your ASIC is a paperweight."
- Jensen Huang, Dwarkesh Podcast
Still, Google’s scale is undeniable. Its seventh-gen TPUs match NVIDIA in training throughput, and its internal workloads run at efficiency levels no external vendor can verify. But Huang notes that Google doesn’t participate in MLPerf or InferenceMAX benchmarks - making real-world TCO comparisons speculative.
The deeper battle is strategic. NVIDIA avoids competing with its customers, instead investing $30 billion to backstop labs like Anthropic and Neo-clouds like CoreWeave. This ensures CUDA remains the default. For now, ubiquity wins - even if the underlying tech is aging.

