BUSINESS

AI compute crunch forces industry to build chips, grids, and space data centers

Monday, June 8, 2026 · from 4 podcasts, 5 episodes
  • AI's primary cost is shifting from silicon to electricity, with energy projected to exceed 50% of data center spending within four years.
  • OpenAI is negotiating power and chip capacity as far out as 2030, with one-gigawatt data centers costing roughly $50B each, and diversifying beyond Nvidia to AMD and Cerebras to secure future capacity.
  • Executives argue the unsustainable cost of cloud tokens will force AI processing back onto local devices like Nvidia's new 'super chip'.
  • Public hostility and a failure to share AI's economic benefits risk a regulatory backlash that cedes the lead to China.

The bottleneck for artificial intelligence has moved from the chip fab to the power grid. Energy now constitutes nearly 40% of the total cost for the latest Nvidia GPU clusters, a figure Naveen Rao on This Week in AI projects will surpass 50% within three to four years. This flips the economics of scaling, making electricity, not transistors, the binding constraint.
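A back-of-the-envelope sketch of what that shift implies. It assumes, purely for illustration, that total cost splits into two buckets (energy and everything else) and that the non-energy bucket stays flat; only the 40% and 50% figures come from Rao's estimate, and the function names are hypothetical.

```python
def energy_share(energy_cost: float, other_cost: float) -> float:
    """Fraction of total cost of ownership that goes to energy."""
    return energy_cost / (energy_cost + other_cost)


def energy_growth_needed(current_share: float, target_share: float) -> float:
    """Multiplier on energy cost needed to move from current_share to
    target_share, ASSUMING non-energy cost stays constant."""
    # Normalize today's total cost to 1.0, so energy cost equals current_share.
    other = 1.0 - current_share
    # Solve e / (e + other) = target_share for the future energy cost e.
    target_energy = target_share * other / (1.0 - target_share)
    return target_energy / current_share


if __name__ == "__main__":
    multiplier = energy_growth_needed(0.40, 0.50)
    print(f"Energy cost multiplier needed: {multiplier:.2f}x")
```

Under these assumptions, energy spending would need to grow about 1.5x for its share to move from 40% to 50%. Falling hardware prices would get there sooner.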

OpenAI is responding by building physical infrastructure on a multiyear horizon. CFO Sarah Friar, on All-In, said the company is already negotiating for power and chip capacity for 2030, with one-gigawatt data centers costing roughly $50 billion each. In Michigan, OpenAI is paying for its own grid upgrades to avoid spiking local utility rates - a project that won't deliver usable compute until 2028.

This energy wall is forcing an architectural pivot. Cerebras CEO Andrew Feldman argues that beating Nvidia requires abandoning GPU design principles altogether. His company’s wafer-scale chip, the size of a dinner plate, places memory directly next to compute to cut data-movement latency; Feldman claims a 15-18x speed advantage over GPUs. OpenAI is also moving beyond a single supplier, diversifying its chip strategy to include AMD, Cerebras for low-latency work, and a custom chip developed with Broadcom.

"We are hitting an energy wall that manufacturing cannot solve alone. Simply building more chips won't work if the grid can't light them."

- Naveen Rao, This Week in AI

The industry is exploring even more radical solutions. Planet Labs CEO Will Marshall contends that the ultimate fix is to move data centers to space, where sun-synchronous orbits provide 24/7 solar power without terrestrial land-use battles. He says the economic tipping point is a launch cost of $200 per kilogram, a threshold he expects SpaceX’s Starship to hit soon.

At the same time, the high and unpredictable cost of cloud-based AI is triggering a swing back to local processing. As Steven Sinofsky explained on The a16z Show, developers are already running stacks of Mac Minis to avoid $10,000 cloud bills for simple tasks. He sees Nvidia’s new RTX Spark chip - an Arm CPU paired with Nvidia parallel processing - as a bid to become the primary architect of the AI-native PC, where tokens become effectively free once the hardware is purchased.

"Whenever a resource becomes a bottleneck that users must pay for, it moves to the local device and becomes free."

- Steven Sinofsky, The a16z Show

This scramble for compute and energy is unfolding against a backdrop of rising public animosity. Rao and Alex Finn blame “doomer” narratives from companies like Anthropic for painting AI as an existential threat and fueling local, often misinformation-driven protests against data centers over issues like water use. They warn that without tangible local benefits - like AI companies funding public transit for host communities - the regulatory backlash will be swift, potentially ceding technological leadership to a more enthusiastic China.

The industry’s survival now depends on a three-front war: reinventing chip architecture, securing gargantuan energy supplies, and winning the public relations battle it has so far badly lost. The companies that can deliver intelligence per watt, not just raw performance, will define the next era.

Source Intelligence

- Deep dive into what was said in the episodes

The IPO Comeback: Why Tech Giants Are Finally Going Public | All-In Liquidity IPO Panel · Jun 6

  • Andrew Feldman says Cerebras spent nine years navigating a tough IPO path before the final year became easy.
  • Planet Labs CEO Will Marshall says Earth observation combined with AI could unlock a market worth $75 to $100 billion.
  • Andrew Feldman believes AI unlocked computing for problems like image and language processing, which were previously intractable.
  • Andrew Feldman argues Cerebras chose a dinner plate-sized chip architecture to place memory next to compute, solving AI's data movement bottleneck and delivering 15-18x speed over GPUs.
  • Will Marshall contends that current large language models lack real-world data, but integrating planetary sensing will create 'large Earth models' for practical applications.
Also from this episode: (4)

Startups (4)

  • Planet Labs' primary revenue today comes from security applications, accounting for about 60% of its business.
  • Will Marshall says Planet Labs' early investors like Google and Capricorn held shares through its public market 10x move from $5 to $50, capturing most of the upside.
  • Andrew Feldman notes most money is made after IPO rather than before, citing studies that show greater percentage and absolute returns in the public markets.
  • Brad Gerstner observes a pendulum swinging back toward earlier IPOs, with companies now aiming to go public at $1-5 billion valuations rather than staying private indefinitely.

OpenAI CFO Sarah Friar: IPO, AI Rivalries, New Device, and Spending $100B+ on Compute · Jun 2

  • Compute scarcity is the current bottleneck, with insufficient tokens available through 2026 and limited supply in 2027. Friar credits OpenAI’s early compute buying for its current position.
  • OpenAI is diversifying its chip strategy beyond Nvidia. Its fall training run will use Nvidia's Vera Rubin chips, AMD chips are in the pipeline, Cerebras is live for low-latency dev work, and the company is developing its own chip with Broadcom.
  • OpenAI is developing a new consumer device with Jony Ive, described as natural, lovable, and intimate. Friar says it will be unveiled by year-end and available for purchase early next year.
  • Friar sees ChatGPT as a hybrid of Google and Meta, possessing high user intent data plus memory and demographic context. This creates a potent ad platform, though an ad-free tier will remain.
  • OpenAI’s current token allocation prioritizes strategy over pure economics; API tokens are an order of magnitude more valuable than consumer tokens, but they are provisioning for broad global access.
Also from this episode: (3)

Big Tech (2)

  • Friar frames an IPO as a fundraising milestone, not a destination, to create optionality. OpenAI raised $22 billion in March for flexibility.
  • Usage intensity scales sharply with pricing tiers. Free users average about seven turns daily, the first paid tier doubles that to 15, the $20 Plus tier triples it, and Pro users see an 11x increase over free.
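
The multipliers above can be turned into rough daily-turn estimates. This is a sketch only: it assumes every multiplier applies to the free-tier baseline of about seven turns (the summary does not say whether "triples" is relative to free or to the first paid tier), and the tier labels and names are illustrative.

```python
FREE_TURNS_PER_DAY = 7  # approximate baseline quoted in the episode summary

# Multipliers relative to the free tier (an assumed interpretation).
TIER_MULTIPLIERS = {
    "free": 1,
    "first paid tier": 2,
    "Plus ($20)": 3,
    "Pro": 11,
}


def estimated_turns(baseline: int, multipliers: dict[str, int]) -> dict[str, int]:
    """Scale the free-tier baseline by each tier's multiplier."""
    return {tier: baseline * m for tier, m in multipliers.items()}


if __name__ == "__main__":
    for tier, turns in estimated_turns(FREE_TURNS_PER_DAY, TIER_MULTIPLIERS).items():
        print(f"{tier:>16}: ~{turns} turns/day")
```

With a baseline of exactly seven, the first paid tier works out to 14 turns, which the summary rounds to 15; Pro lands near 77 turns a day.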

Enterprise (1)

  • OpenAI’s revenue is now roughly balanced 50-50 between consumer and enterprise. Friar cites heavy enterprise engagement with firms like Thermo Fisher, major banks, Travelers, and tech companies.

What OpenAI and Anthropic Think Happens Next With AI · Jun 5

  • Nathaniel Whittemore says Trump's AI executive order evolved from mandating 90-day pre-release government access to a voluntary 30-day process, with NSA assigned primary testing responsibility.
  • David Sacks intervened to stop the initial executive order, arguing it would hinder US AI competitiveness against China. The final version includes a disclaimer prohibiting mandatory government licensing or pre-clearance regimes.
  • Dean Ball calls the executive order a major win for AI safety advocates within the administration and a significant loss for accelerationists like Sacks, arguing it tees up infrastructure for future mandatory licensing.
  • Anthropic expanded Project Glasswing access to its Mythos model, adding 150 partners across 15 countries including energy, water, communications, healthcare, and hardware sectors vulnerable to catastrophic cyber attacks.
  • Anthropic walked back its timeline for general Mythos access, stating robust safeguards preventing cyber capability misuse don't exist yet. Current testers find the model powerful but prohibitively expensive, with Anthropic subsidizing costs.
  • SK Hynix plans to double memory chip manufacturing capacity by decade's end, viewing AI-driven demand as structural. Chairman Chey Tae-won warns that shortages could persist until 2030 and that sudden price jumps threaten industry sustainability.
  • OpenAI reports Codex now has 5 million weekly active users, with non-technical knowledge workers adopting it three times faster than developers. The platform sees users shifting from sequential to parallel task execution.
  • OpenAI identifies three knowledge work frictions: finding inputs across opaque systems, information coordination costs, and approval delays. A McKinsey study found workers spend over 25% of their week on email and nearly 20% searching for internal information.
  • OpenAI's new Codex features include annotations for precise document interaction, role-specific plugins bundling apps and skills for six functions, and Sites for turning artifacts into shareable web apps.
  • Uber implemented a $1,500 monthly token spending cap per employee, highlighting cost management as a critical vector in enterprise AI adoption amid the broader shift from subsidy to scarcity economics.
  • Microsoft released seven new AI models including MAI Thinking One, a 1-trillion parameter MoE model positioned between Sonnet 4.6 and Opus 4.6. Mustafa Suleyman claims it outperformed GPT-4.5 on quality while being 10x lower cost for McKinsey tasks.
  • Microsoft's strategy focuses on Frontier Tuning for company-specific agents, with CEO Satya Nadella advocating for enterprises to move from consuming frontier models to participating in the frontier ecosystem via cost-optimized proprietary models.

AI Layoffs, Compute Costs & Agents | Naveen Rao & Alex Finn on This Week in AI Episode 16 · Jun 4

  • Naveen Rao estimates that 20-30% of current AI compute token costs are wasted on 'token maxing,' a gaming of usage metrics driven by leaderboards and corporate proxy goals.
  • Current AI models lack the holistic reasoning, architectural foresight, and production-grade reliability of a senior human developer. Alex Finn counters that the intelligence is already revolutionary; the problem is its misapplication by non-technical users.
  • Alex Finn reports his coding velocity has increased by a thousandfold using AI. He attributes this to deeply understanding systems, not just prompt blasting.
  • Alex Finn runs Qwen 3.7 locally on a $4,000 Nvidia DGX Spark, advocating for 'unlimited, dumber intelligence' to power 24/7 agents for tasks like scraping social media for opportunities.
  • Naveen Rao notes that the total cost of ownership for GPU clusters is shifting from capex to opex, with energy now constituting nearly 40% of TCO for current-gen Nvidia chips. He projects this will exceed 50% within the next 3-4 years.
  • Naveen Rao's startup, Unconventional AI, is developing non-von Neumann architectures where memory and compute are unified. He aims for a 2-3 order of magnitude improvement in power efficiency to overcome the coming energy wall.
  • Naveen Rao blames 'doomer' narratives, specifically calling out Anthropic, for painting AI as an existential threat. He argues this damages public perception, fuels protests against data centers, and risks harmful regulation.
  • Alex Finn traces current AI layoff rhetoric to irresponsible hiring during the 2020 zero-interest-rate period. He argues CEOs are using AI as a scapegoat for prior overspending, not as the real cause of cuts.
  • Naveen Rao identifies a core problem as Silicon Valley's failure to let the public share in AI's financial upside, exacerbated by companies staying private too long. He contrasts this with China, where public sentiment views AI as a competitive superpower.
  • Alex Finn posits that seizing private equity for a public trust destroys incentives. He proposes a policy alternative: give every American a funded ChatGPT plan and education on extracting value from AI.
  • Naveen Rao suggests AI companies building data centers should voluntarily invest in local communities, like funding public buses or rec centers, to build tangible public goodwill and counter misinformation-driven protests.
  • Alex Finn and Naveen Rao both express skepticism about buying into imminent hyped IPOs like Anthropic or SpaceX, citing distorted valuations and a preference to let price discovery settle first.

Steven Sinofsky on Apple at 50, Microsoft, and the Future of Computing · Jun 2

  • Steven Sinofsky positions the Nvidia RTX Spark 'super chip' as a pivot toward AI-native personal computing, moving compute from costly cloud tokens to free local device processing.
  • The Computex trade show in Taiwan is an 'inside baseball' supply chain event for computing device components that rarely enters mainstream tech coverage.
  • Nvidia's Spark announcement marks a re-entry into the mainstream PC chip business, recalling Microsoft's 2011 Surface announcement, which also involved Nvidia, Qualcomm, and Texas Instruments.
  • A critical unresolved question is whether Apple will natively support Nvidia's CUDA APIs at WWDC or use a translation layer, which will shape the AI hardware-software ecosystem.
  • Sinofsky dismisses memory and component shortages as transient industry cycles that will self-correct, drawing on historical examples like DRAM and hard drive shortages.
  • He is skeptical of Microsoft's approach to the Nvidia Spark, framing full backward Windows compatibility as a burden consumers don't want, preferring legacy apps be handled via servers or VMs.
Also from this episode: (1)

Big Tech (1)

  • He contends Microsoft's Surface strategy strayed from its original Arm-based, mobile-forward vision to focus on Intel-based 'objection handler' devices, missing a platform discontinuity.