AI & TECH

Compute shortage degrades Claude, forcing Anthropic to lease from Musk

Tuesday, May 19, 2026 · from 4 podcasts
  • Podcast hosts allege Anthropic routed users to inferior chips; one AMD project saw input tokens rise 170x and monthly costs jump from $26 to over $42,000.
  • The company now leases SpaceX's Colossus 1 data center to solve an acute compute bottleneck.
  • OpenAI's CFO warns the subsidy era is over as Washington blocks the broad release of Anthropic's Mythos model on national security grounds.

The compute shortage is breaking Claude. Theo and Ben of Nerd Snipe argue Anthropic's infrastructure arbitrage - shunting users from optimized Nvidia stacks to Google TPUs and Amazon Trainium chips - has crippled performance. Their primary conspiracy theory: Anthropic eliminated the price premium for its 1-million-token context window to force traffic onto this inferior hardware, hoarding scarce Nvidia GPUs for internal research.

"The input tokens required for similar tasks increased by 170x, while output tokens grew 64x."

- Theo, Nerd Snipe

The damage is measurable. Ben cites a GitHub issue from an AMD AI head showing Claude Code's input tokens for the same tasks jumped 170-fold between January and March. One AMD engineering project saw its estimated AWS Bedrock costs explode from $26 to over $42,000 monthly. The hosts argue Anthropic’s own engineering, strained by hiding 'thinking' traces to prevent competitor distillation, broke under load. Theo attributes bans of third-party tools like OpenClaw to this cost panic, noting OpenClaw's idle heartbeat function costs him $4.31 daily.

The bottleneck forced an unthinkable alliance. Frank Downing of FYI reports Anthropic, which once blocked Elon Musk's xAI from its models, now leases the 300-megawatt Colossus 1 data center - controlled by Musk’s AI and space interests - to get Claude back online. The deal provides inference capacity ideal for running existing models, not training new ones.

Nathaniel Whittemore of The AI Daily Brief frames the crunch as systemic. OpenAI CFO Sarah Friar describes a 'vertical wall of demand' where compute is the primary bottleneck, ending the era of 'all-you-can-eat' AI subsidies. GitHub and Microsoft are pivoting to usage-based billing. Supply is so tight that even second-tier labs are sold out of tokens.

"This marks the first known instance of the executive branch restricting a model rollout based on policy considerations rather than technical failures."

- Nathaniel Whittemore, The AI Daily Brief

The scarcity reshapes power. Whittemore notes Washington recently blocked the broad release of Anthropic’s Mythos model, citing national security and compute cannibalization concerns - an 'informal, highly improvised licensing regime.' Policy expert Dean Ball says the state is now an active gatekeeper of AI capability, not just a customer.

Garry Tan, on Tetragrammaton, sees a different escape route: decentralization. He argues corporate AI models like ChatGPT and Claude function as surveillance tools, and the real shift will come when users run local agents on their own hardware. Tan aligns with 'Team Pirate,' advocating for personal liberty over a 'unipolar' world controlled by one corporation or government.

The frontier is moving off-planet. Brett Winton of FYI expects SpaceX to begin launching modular AI clusters into low Earth orbit around 2028. If Starship launch costs hit $300/kg, launching a gigawatt of compute into space could cost $7.5 billion, undercutting the roughly $19 billion facility cost of a terrestrial gigawatt data center. For Anthropic and others, getting hardware online today is worth paying a premium.

The compute crisis is no longer a scaling challenge - it's a strategic cage.

Source Intelligence

- Deep dive into what was said in the episodes

Google’s Big AI Test Comes Next Week · May 15

  • A demand crunch for AI compute tokens is driving a business model shift from flat-rate subscriptions to usage-based billing. GitHub Copilot cited unsustainable inference costs in its pricing change.
  • GPU rental prices have increased 40% over the last six months due to real token demand. Oguz Erken notes the top two AI labs generate nearly $60B in aggregate annual revenue, driven by fundamentals.
  • Big Tech cloud earnings showed massive AI-driven growth: AWS up 28% year-over-year, Microsoft Azure up 40%, and Google Cloud up 63%, beating estimates and causing a historic market cap jump.
  • Microsoft and OpenAI restructured their deal, removing the AGI clause and granting Microsoft free access to OpenAI models for a half-decade. OpenAI is now free to sell models through AWS and Google Cloud.
  • The White House considers unwinding Anthropic's supply chain risk designation. However, US government officials oppose Anthropic's planned Mythos model rollout, citing national security and compute constraint concerns.
  • Dean Ball argues the US government restricting model rollouts like Mythos constitutes an informal, improvised licensing regime. He states the 'trial runs are over' for AI policy.
  • Product development is shifting focus from raw models to the 'harnesses' or interfaces that deliver AI capabilities. New tools like Cursor's SDK and OpenAI's updated Codex aim to simplify agent deployment.
  • Anthropic split technical and non-technical work into Claude Code and Claude Cowork, while OpenAI's Codex bets on a single interface for all knowledge workers, promoting technical skill acquisition.
Also from this episode: (5)

AI & Tech (5)

  • Nathaniel Whittemore argues we're entering a phase where AI is critical global economic infrastructure, not a startup-era novelty. This is reflected in Big Tech earnings and massive private valuations.
  • Anthropic is in talks to raise at a valuation above $900B, surpassing OpenAI's last valuation of $825B. Secondary market trades suggest some investors value Anthropic near $1T.
  • OpenAI traced a 'goblin' bug in Codex to a personality reinforcement learning artifact from GPT-5 models. The quirk highlights how model interdependencies can amplify unusual behaviors.
  • The viral 'MS Paint' image prompt instructs AI to redraw an image in a 'clumsy, scribbly, and utterly pathetic way' to mimic low-quality mouse-drawn art. It spread on Threads and Asian social channels.
  • Nathaniel Whittemore is skeptical of Silicon Valley predictions like a 'permanent underclass,' arguing builders often misunderstand real-world technology diffusion and broader economic forces.

Anthropic solved their compute problem by buying it from Elon? · May 14

  • Theo argues Anthropic's Claude Opus 4.7 is not a meaningful improvement over prior models and comes with user experience regressions due to overly strict safety system prompts.
  • Figma's stock dropped 5% after a competitive announcement from Anthropic, contributing to an 85% decline since its IPO. Ben cites this as evidence of Anthropic's negative market impact.
  • Theo reveals Anthropic confirmed a routing error last year where 0.8% to 16% of Sonnet requests were sent to a dumber, 1M-context version, establishing a precedent for performance regressions via model versioning.
  • Theo's primary conspiracy theory is that Anthropic now forces all Claude Code users onto the dumber 1M-context model version to route traffic away from scarce Nvidia GPUs and onto partner hardware such as AWS Trainium and Google TPUs, which would explain the performance drop.
  • The hosts claim Anthropic employees use different, superior internal versions of Claude and its tools, creating a disconnect where employees don't experience the external product's failures and dismiss user complaints.
Also from this episode: (5)

AI & Tech (5)

  • Anthropic is aggressively banning third-party tools like T3 Code and OpenClaw that interface with Claude Code. Theo attributes this to a compute crisis and poor caching implementations that increase costs.
  • Theo states OpenClaw's heartbeat function, which polls for tasks, costs him $4.31 daily without active use, extrapolating to roughly $120 monthly in wasted API spend.
  • Ben cites a GitHub issue from an AMD AI head showing a massive spike in Claude Code usage and cost at their company. Input tokens increased 170x and costs jumped from $26 to over $42,000 monthly after model updates.
  • Theo and Ben argue Anthropic's engineering is incompetent, citing recent changes like a new tokenizer, a 5-minute cache TTL, and hidden reasoning data that complicate their stack across three hardware providers, leading to reliability and intelligence issues.
  • Theo attributes Anthropic's problems to a research-first, safety-obsessed culture that devalues engineering and product reliability, creating opaque policies and a 'holier-than-thou' attitude that frustrates developers.
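
The cost figures quoted in this episode can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the constants are the numbers cited by the hosts, while the function names and the 28-, 30-, and 31-day month lengths are assumptions added here, not anything stated in the episode.

```python
# Figures quoted in the episode notes; names are illustrative assumptions.
BASELINE_MONTHLY_COST_USD = 26.0       # AMD project's earlier monthly estimate
REPORTED_MONTHLY_COST_USD = 42_000.0   # "over $42,000", so this is a floor
INPUT_TOKEN_MULTIPLIER = 170           # reported growth in input tokens
HEARTBEAT_COST_PER_DAY_USD = 4.31      # Theo's idle OpenClaw heartbeat cost


def cost_multiplier(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


def monthly_cost(daily_cost: float, days: int = 30) -> float:
    """Extrapolate a flat daily cost over a month of `days` days."""
    return daily_cost * days


if __name__ == "__main__":
    bill_growth = cost_multiplier(BASELINE_MONTHLY_COST_USD, REPORTED_MONTHLY_COST_USD)
    print(f"Monthly bill grew at least {bill_growth:,.0f}x")
    ratio = bill_growth / INPUT_TOKEN_MULTIPLIER
    print(f"Bill growth relative to input-token growth: {ratio:.1f}x")
    for days in (28, 30, 31):
        total = monthly_cost(HEARTBEAT_COST_PER_DAY_USD, days)
        print(f"Heartbeat over {days} days: ${total:.2f}")
```

On these numbers the bill grew roughly 1,615x, far more than the 170x rise in input tokens, and the episode does not say what accounts for the gap. The heartbeat extrapolation matches the quoted "roughly $120" only for a 28-day month; a 30-day month gives $129.30.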
Tetragrammaton with Rick Rubin

Garry Tan · May 13

  • Tan credits the November 2025 release of Anthropic's Opus model as a watershed moment, enabling 'vibe coding' that lets him produce 100x more software now than in 2013. He sees this as democratizing creation.
  • He champions the open-source 'OpenClaw' movement for personal AI, exemplified by his own projects GStack and Gbrain. Tan argues user control is critical to avoid a future where a single corporate or government entity controls the only superintelligence.
  • Tan argues a major impediment to innovation is big tech's closed ecosystems, citing Apple's Siri and iMessage as examples where locked platforms prevent the best technology from reaching users.
Also from this episode: (11)

Startups (2)

  • Garry Tan describes Y Combinator as an institution where 16 partners review 80,000 annual applications to find 800 founders, focusing on the fundamental question: 'Will this person make something people want?'
  • Tan argues the best investors today are former builders, not bankers, because early-stage startups need a focus on creation over finance. He contrasts this with traditional VC, which he sees as stuck in a 1970s 'banker' mentality.

AI & Tech (6)

  • Tan asserts the current AI revolution mirrors the early personal computer era. He says it is still 1% of the way into changing the world, with tools currently too expensive for most but destined to become democratized.
  • He details using AI agents like 'Granola' to transcribe and distill YC office hours, freeing partners from repetitive advice to focus on novel problems. This creates an 'above the API line' role for creative work.
  • He differentiates leading AI models by personality: Claude Opus is an 'ADHD CEO,' OpenAI's model is a '200 IQ savant,' and DeepSeek is a 'conspiracy theorist.' Tan believes this diversity of 'personalities' is healthy for the ecosystem.
  • Tan views Silicon Valley's essence as earnest builders making things people want. He attributes its origin to post-WWII R&D and defense funding, like DARPA's role in creating TCP/IP, which was later commercialized.
  • He believes AI will enable small, highly efficient companies, contrasting with the inefficient 'adult daycare' of large tech orgs. The goal should be directing human talent toward more meaningful service and creation.
  • He is a 'techno-optimist' who believes technology, from fire to AI, is the unbroken chain lifting humanity from subsistence. His personal mission is to give others the access to technology that changed his life.

Business (3)

  • He contends a founder's character is more critical than their initial idea. The essential traits are earnestness and being 'connected to the source,' not a salesmanship or hustle culture mentality.
  • Tan explains that YC's 13-week program works by creating intense focus and community. The median company now raises about $2.2 million at demo day, up from roughly $1 million when he returned to lead the organization.
  • Tan frames his management philosophy as 'zero-based accounting,' asking what YC would rebuild from scratch today. His core directive was to refocus exclusively on the early-stage founder program that made YC successful.

SpaceX And Anthropic Partnership | The Brainstorm EP 131 · May 13

  • Anthropic has been supply-constrained on compute, restricting user tokens and cutting off research compute, which forced it to lease Colossus 1 - a 300MW, 220,000-GPU data center - from SpaceX to lift capacity limits.
  • Anthropic previously enforced a policy of cutting off competitors like xAI from using its models, but the compute deal has reopened communication between them.
  • SpaceX’s vertical integration from chip fabrication to model inference reduces its risk when leasing out compute, as it can more rapidly rebuild capacity.
  • Building a gigawatt-scale AI data center costs roughly $60B: $19B for the facility, $30B for GPUs, and $11B for other IT equipment like CPUs and networking.
  • A gigawatt of leased compute infrastructure generates about $15B annually; a model provider operating on that scale can generate $30B in revenue.
  • OpenAI’s revenue scaled with compute; they generated $20B on roughly 2 gigawatts early this year, implying about $20B per effective gigawatt for inference.
  • Revenue per watt is increasing as model utility and enterprise willingness to pay rise, creating pricing power for both infrastructure providers and model companies.
  • Space-based AI compute economics hinge on Starship launch costs; at $300/kg, launching a gigawatt costs $7.5B, beating terrestrial facility costs.
  • Manufacturing satellites on a production line offers cost efficiencies over building unique terrestrial data centers, where $5B of the $19B facility cost is labor.
  • Compute scarcity on Earth means AI companies like Anthropic would pay a premium for space-based watts even before launch costs break parity, valuing velocity over price.
  • ARK analysts project SpaceX could begin scaling space-based AI compute in 2028-2029, reaching tens of gigawatts per year in the early 2030s.
  • Small modular nuclear reactors are more likely to scale in the US than gigawatt plants, fitting incremental demand like offsetting retired coal plants.
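
The gigawatt-scale figures in this episode can be cross-checked against each other. The sketch below is a rough illustration using only the numbers quoted above; the function names are hypothetical, and the payback calculation is a simplifying assumption that ignores operating costs, financing, and depreciation.

```python
# Figures from the episode notes, in billions of USD per gigawatt.
FACILITY_COST_B = 19.0         # terrestrial facility
GPU_COST_B = 30.0              # GPUs
OTHER_IT_COST_B = 11.0         # CPUs, networking, other IT equipment
LAUNCH_COST_PER_GW_B = 7.5     # quoted launch cost per gigawatt
LAUNCH_COST_PER_KG_USD = 300.0 # assumed Starship price per kilogram
LEASE_REVENUE_PER_GW_B = 15.0  # annual revenue from leasing one gigawatt


def total_build_cost_b() -> float:
    """Sum the quoted cost components of a terrestrial gigawatt data center."""
    return FACILITY_COST_B + GPU_COST_B + OTHER_IT_COST_B


def implied_launch_mass_kg(launch_cost_b: float, cost_per_kg: float) -> float:
    """Mass that a launch budget (in $B) buys at a given price per kilogram."""
    return launch_cost_b * 1e9 / cost_per_kg


def simple_payback_years(capex_b: float, annual_revenue_b: float) -> float:
    """Years of revenue needed to cover capex, ignoring all operating costs."""
    return capex_b / annual_revenue_b


if __name__ == "__main__":
    capex = total_build_cost_b()
    mass_kg = implied_launch_mass_kg(LAUNCH_COST_PER_GW_B, LAUNCH_COST_PER_KG_USD)
    print(f"Terrestrial build cost: ${capex:.0f}B per GW")
    print(f"Implied launch mass: {mass_kg:,.0f} kg per GW")
    print(f"Implied mass per kW: {mass_kg / 1e6:.0f} kg")
    payback = simple_payback_years(capex, LEASE_REVENUE_PER_GW_B)
    print(f"Simple payback on leasing: {payback:.1f} years")
```

The components sum to the quoted $60B, and $7.5B at $300/kg implies about 25 million kg per gigawatt, or 25 kg per kilowatt. The launch figure is compared with the $19B facility cost only, since the GPUs and other IT equipment would presumably still be needed in orbit.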