The bottleneck for artificial intelligence has moved from the chip fab to the power grid. Energy now accounts for nearly 40% of the total cost of the latest Nvidia GPU clusters, a share that Naveen Rao, speaking on This Week in AI, projects will surpass 50% within three to four years. That flips the economics of scaling, making electricity, not transistors, the binding constraint.
OpenAI is responding by building physical infrastructure on a decade-long horizon. CFO Sarah Friar said on All-In that the company is already negotiating for power and chip capacity for 2030, with one-gigawatt data centers costing roughly $50 billion each. In Michigan, OpenAI is paying for its own grid upgrades to avoid spiking local utility rates - a project that won't deliver usable compute until 2028.
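The figures Friar cites imply a simple unit cost. A rough sketch of that arithmetic follows, using only the $50 billion and one-gigawatt numbers quoted above; the function name is illustrative, and the result says nothing about how the cost splits between chips, buildings and power.

```python
def capex_per_watt(total_cost_usd: float, capacity_watts: float) -> float:
    """Return the capital cost per watt of data center capacity."""
    if capacity_watts <= 0:
        raise ValueError("capacity_watts must be positive")
    return total_cost_usd / capacity_watts


# Figures quoted in the article: roughly $50 billion per one-gigawatt site.
COST_USD = 50e9
CAPACITY_W = 1e9

print(f"${capex_per_watt(COST_USD, CAPACITY_W):.0f} per watt")  # prints "$50 per watt"
```

At that rate, every additional megawatt of capacity represents on the order of $50 million in capital.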
This energy wall is forcing an architectural pivot. Cerebras CEO Andrew Feldman argues that beating Nvidia requires abandoning GPU design principles altogether. His company’s wafer-scale chip, the size of a dinner plate, places memory directly next to compute to eliminate data movement latency, a design for which Cerebras claims an 18x speed advantage. OpenAI is following suit, diversifying its chip strategy to include AMD, Cerebras for low-latency work, and a custom chip developed with Broadcom.
"We are hitting an energy wall that manufacturing cannot solve alone. Simply building more chips won't work if the grid can't light them."
- Naveen Rao, This Week in AI
The industry is exploring even more radical solutions. Planet Labs CEO Will Marshall contends that the ultimate fix is to move data centers to space, where sun-synchronous orbits provide 24/7 solar power without terrestrial land-use battles. He says the economic tipping point is a launch cost of $200 per kilogram, a threshold he expects SpaceX’s Starship to hit soon.
At the same time, the exorbitant and unpredictable cost of cloud-based AI is triggering a swing back to local processing. As Steven Sinofsky explained on The a16z Show, developers are already running stacks of Mac Minis to avoid $10,000 cloud bills for simple tasks. He sees Nvidia’s new RTX Spark chip - an Arm CPU paired with Nvidia parallel processing - as a bid to become the primary architect of the AI-native PC, where tokens are effectively free once the hardware has been purchased.
"Whenever a resource becomes a bottleneck that users must pay for, it moves to the local device and becomes free."
- Steven Sinofsky, The a16z Show
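Sinofsky's argument comes down to a break-even calculation: local hardware is a one-time cost, while cloud usage recurs. The sketch below illustrates the shape of that trade-off. Only the $10,000 cloud bill comes from the article, and treating it as a monthly figure is an assumption; the hardware and electricity numbers are hypothetical placeholders, not quoted prices.

```python
import math
from typing import Optional


def breakeven_months(hardware_usd: float,
                     monthly_cloud_usd: float,
                     monthly_power_usd: float = 0.0) -> Optional[int]:
    """Whole months until a one-time hardware purchase beats a recurring cloud bill.

    Returns None when running locally saves nothing per month.
    """
    monthly_saving = monthly_cloud_usd - monthly_power_usd
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_usd / monthly_saving)


# Hypothetical inputs for illustration only.
HARDWARE_USD = 6_000        # assumed price of a small stack of local machines
MONTHLY_CLOUD_USD = 10_000  # the article's cloud bill, assumed here to be monthly
MONTHLY_POWER_USD = 50      # assumed electricity cost of running locally

print(breakeven_months(HARDWARE_USD, MONTHLY_CLOUD_USD, MONTHLY_POWER_USD))  # prints 1
```

Under these assumptions the hardware pays for itself within the first month; with a $1,000 monthly bill and no power cost, the same purchase would take six months to break even.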
This scramble for compute and energy is unfolding against a backdrop of rising public animosity. Rao and Alex Finn blame “doomer” narratives from companies like Anthropic for painting AI as an existential threat and fueling local protests against data centers, protests driven by concerns over water use and by misinformation. They warn that without tangible local benefits - like AI companies funding public transit for host communities - the regulatory backlash will be swift, potentially ceding technological leadership to a more enthusiastic China.
The industry’s survival now depends on a three-front war: reinventing chip architecture, securing gargantuan energy supplies, and winning a public relations battle it has so far been losing badly. The companies that can deliver intelligence per watt, not just raw performance, will define the next era.



