
AI & TECH

Big Tech spends $1T to stockpile AI chips

Wednesday, May 6, 2026 · from 7 podcasts
  • Amazon, Microsoft, Google, and Meta will spend a combined $725 billion on infrastructure this year, crushing free cash flow to lock in power and chips.
  • Getting AI compute now requires three-to-five-year contracts with 20-30% paid upfront, and the market has almost no slack capacity.
  • Frontier labs like Anthropic are trading equity for compute, making hardware access more valuable than cash.

The trillion-dollar industrialization of Big Tech is ending the asset-light software era. The largest hyperscalers have announced 2026 capital expenditure guidance totaling $725 billion, the core of a buildout approaching a trillion dollars: Amazon will spend $200 billion, Microsoft and Google $190 billion each, and Meta $145 billion. The spending is cutting into free cash flow: Amazon's has collapsed 97%, Google's and Microsoft's are each down 12%, and Meta's is down 8%.
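As a quick check on the arithmetic behind the headline figure, the four guidance numbers quoted above can be summed directly. This is only a sketch: the dictionary and function names are illustrative, and the dollar values are the ones stated in the text.

```python
# 2026 capex guidance in billions of USD, as quoted in the paragraph above.
CAPEX_GUIDANCE_BN = {"Amazon": 200, "Microsoft": 190, "Google": 190, "Meta": 145}


def total_capex_bn(guidance):
    """Sum the per-company guidance figures (billions of USD)."""
    return sum(guidance.values())


def gap_to_trillion_bn(guidance):
    """How far the combined guidance sits below $1 trillion (billions of USD)."""
    return 1000 - total_capex_bn(guidance)


if __name__ == "__main__":
    print(f"Combined guidance: ${total_capex_bn(CAPEX_GUIDANCE_BN)}B")
    print(f"Gap to $1T: ${gap_to_trillion_bn(CAPEX_GUIDANCE_BN)}B")
```

The four companies together account for $725 billion, so the trillion-dollar framing covers more spending than these four guidance figures alone.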

Chamath Palihapitiya argues these companies are morphing into capital-intensive industrial utilities. Their massive spending isn't just for chips. They are signing long-term power contracts at more than twice the spot rate to guarantee electricity for data centers. Palihapitiya predicts they will become highly leveraged, debt-heavy industrial businesses within five years.

The market has shifted from shortage to a structural transformation. Baseten CEO Tuhin Srivastava reports there is almost no slack compute. His company runs large clusters at mid-90s percent utilization, spread across 18 different clouds to ensure reliability. Securing 1,024 of Nvidia's latest B200 GPUs now demands a three-to-five-year contract with 20-30% of the total contract value paid upfront. This turns AI startups into capital-intensive operations overnight.

"The era of cheap, on-demand AI compute is over."

- Tuhin Srivastava, Baseten CEO on No Priors

Frontier AI labs are now trading ownership for survival. Anthropic signed $73 billion in combined deals with Google and Amazon, exchanging equity for the literal electricity and chips required to stay in the race. On Moonshots, Alex and Dave argued this shows that for Anthropic, compute is now more valuable than cash.

These investments come at a steep discount. The deals value Anthropic at roughly one-thirtieth of its secondary market valuation. The labs have revenue and demand, but they lack the physical capacity to scale. This creates a recursive, circular economy where the hyperscalers own the horses and the racetrack.

The compute crunch is so severe it's becoming a strategic weapon. David Sacks noted that while OpenAI missed some consumer targets, its developer 'mojo' is shifting back to GPT 5.5 because of capacity. Anthropic’s top-tier Opus model is currently 'compute-gated,' forcing users to older versions. Sam Altman’s early commitment to $600 billion in data center spend secured a supply lead that is now a competitive moat.

"We are over-provisioning on memory capacity just to get the necessary bandwidth. We have a surplus of space but a deficit of speed."

- Reiner Pope on Dwarkesh Podcast, explaining the hardware bottleneck

This arms race is about securing the physical stack from silicon to power lines. Google's strategy highlights the endgame. It now controls about 25% of the planet's AI compute and designs its own TPU chips using AI in a recursive loop. The aim is total sovereignty over the stack to maximize economic value per token. The companies providing the literal horsepower are betting the coming surge in 'agentic' AI will make their massive, painful investments pay off.

Source Intelligence

- Deep dive into what was said in the episodes

Verdicts From Vegas | THE BITCOIN BRIEF 80 · May 6

  • Cash App raised its Bitcoin withdrawal limit from $1,000 to $10,000 per day and added features like auto-converting peer-to-peer payments to Bitcoin and a 5% Bitcoin back program with Square merchants.
  • Block reported holding approximately 28,355 Bitcoin, with 19,000 for customers and 9,000 in its corporate treasury, alongside a new NFC tap-to-pay solution for Cash App.
  • Lightspark launched Grid Global Accounts, a payment layer connecting to 175 million Visa merchants across 33 countries, and introduced AI agent delegation for financial tasks.
Also from this episode: (9)

AI & Tech (1)

  • Q and A argues that posturing by the DOJ on not prosecuting open-source developers is meaningless as key figures like himself, William Hill, and Roman Storm remain imprisoned.

Politics (1)

  • The SDNY rejected a defense appendix in the Roman Storm trial, arguing that it wrongly equates privacy with anonymity and claiming that a system can be private even if a central authority can access data.

Protocol (7)

  • Bitcoin 2026 was busier than the prior year despite negative price sentiment. Foundation showcased Passport Prime, and both Zach and Max moderated panels.
  • Max used Foundation's SDK and Claude Opus to build a proof-of-concept offline Nostr signing app and a password manager that works via browser extension with Passport Prime.
  • Block announced BitKey with a screen, priced at $260. It retains a fingerprint reader, which Max criticizes as a physical security risk.
  • AVEN launched a Bitcoin-backed Visa credit card allowing loans up to $1,000,000 against crypto, with interest starting at 7.99% APR and 2% cash back.
  • Strike unveiled a 'volatility-proof' loan structure with Tether, secured a $2.1 billion credit facility, and cut rates to 7.49% APR, while pursuing a merger to create a publicly listed platform.
  • Paul Sztorc proposed a Bitcoin hard fork called eCash at block 964,000 in August, initially planning to reassign Satoshi's coins to fund it, but later backtracked after pushback.
  • BIP 47 DB is a new decentralized directory using Ordinals inscriptions to store payment codes on-chain, solving the centralization vulnerability exposed when Samourai's PayNym server was seized.

Spoils of war: money flows into defence tech · May 4

  • Henry Tricks outlines the rise of the 'neoprimes' - Palantir, SpaceX, and Anduril - as tech-led defense contractors leveraging software, satellites, and drones to win government contracts by offering cheaper, nimbler weapons.
  • Major contracts for neoprimes include Palantir's Project Maven program-of-record status, Anduril's consolidated army contract potentially worth $20 billion over 10 years, and a Pentagon AI strategy launch at SpaceX.
  • Tricks notes the F-35 program led by Lockheed Martin is valued at approximately $1.7 trillion, far exceeding neoprime deals, yet venture capital is pouring into defense tech at record levels on expectations of a changing of the guard.
  • Political ties put the neoprimes' bipartisan support at risk: Donald Trump defended Palantir against short sellers, and his son is a venture partner at 1789 Capital, which invests in Anduril.
  • The Trump administration's Department of War blacklisted Anthropic as a supply chain risk after the AI lab stipulated its models not be used for autonomous weapons or mass surveillance.
Also from this episode: (6)

AI & Tech (1)

  • Neoprimes advocate for military AI use, with Palantir using Anthropic models for classified work, Anduril embedding AI in autonomous weapons, and SpaceX acquiring Elon Musk's xAI lab.

Politics (3)

  • President Woodrow Wilson's 1917 call to enter WWI, framing it as a defense of democracy, was followed by the 1920 ratification of the 19th Amendment granting women's suffrage after suffragists highlighted the hypocrisy of his ideals.
  • Roosevelt's administration interned roughly 120,000 Japanese Americans during WWII, two-thirds of whom were US citizens, while black soldiers served in segregated units.
  • The atomic bombs dropped on Hiroshima and Nagasaki killed an estimated 200,000 people by the end of 1945, leading to Japan's surrender on August 15th.

Business (1)

  • The Great Depression began with the 1929 Wall Street crash, leading to 25% unemployment by 1933 before Franklin D. Roosevelt introduced the New Deal with bank deposit insurance, jobless relief, and public works projects.

Culture (1)

  • Andrew Palmer advises on workplace emoji etiquette, noting a stand-alone heart emoji can imply a proposal, a thumbs-up may seem frosty to Gen Z, and the tilted tears of joy emoji signals genuine laughter.

OpenAI Misses Targets, Codex vs Claude, Elon vs Sam Trial, Big Hyperscaler Beats, Peptide Craze · May 1

  • OpenAI missed its target of reaching one billion weekly active users for ChatGPT by the end of 2025 and is still short of that milestone in 2026. It also missed its 2025 revenue target.
  • OpenAI has $600 billion in spending commitments for compute, a figure roughly equal to its secondary market valuation. CFO Sarah Friar is reportedly concerned about revenue growth keeping pace with these expenses.
  • David Sacks argues OpenAI's recent product release, GPT 5.5, has received strong reviews from developers and is taking coding market share from Anthropic's Opus 4.7, which users complain is compute-constrained and buggy.
  • Polymarket shows a 32% chance OpenAI goes public by the end of 2026, down from 60% in December, reflecting market skepticism about its IPO timeline.
  • Chamath Palihapitiya contends the fundamental constraint for AI companies is power supply, not demand, and that supply chain delays for grid infrastructure will hurt OpenAI and Anthropic while benefiting hyperscalers like Oracle, Amazon, Meta, Microsoft, and Google.
  • David Friedberg cites a BCG 'Rule of Three' theory, predicting mature AI markets will evolve to a 4:2:1 market share ratio between the top three players, with OpenAI and Google currently leading in consumer and Google leading in enterprise.
  • Friedberg highlights an MIT paper on neural network pruning, showing models can be reduced in size by 90% for the same accuracy, potentially cutting inference costs by 10x and dramatically improving energy efficiency.
  • Sacks notes OpenAI released GPT 5.5 Cyber, which matches Anthropic's Mythos in completing multi-step cyber attack simulations, and is likely the first such model commercially available due to OpenAI's superior compute capacity.
  • In the Elon Musk vs. OpenAI trial, excerpts from Greg Brockman's diary suggest the OpenAI team wanted to transition to a for-profit structure and remove Musk as an investor, which Musk claims breaches charitable trust.
  • Polymarket odds give Elon Musk a 42-43% chance of winning his lawsuit against OpenAI, with speculation that a win might only result in him being credited back his initial $40 million investment.
  • Major hyperscalers announced massive 2026 capital expenditure guidance: Amazon at $200 billion, Microsoft and Google at $190 billion each, and Meta at $145 billion, signaling a trillion-dollar infrastructure buildout driven by AI and cloud demand.
Also from this episode: (5)

AI & Tech (1)

  • Sacks argues AI cyber models like Mythos or GPT 5.5 Cyber don't create vulnerabilities but discover existing bugs, and their proliferation will lead to a one-time upgrade cycle as systems are hardened, followed by a new equilibrium between offense and defense.

Business (2)

  • This CAPEX surge is crushing free cash flow: Amazon's is down 97%, while Google, Microsoft, and Meta are down 12%, 12%, and 8% respectively, as companies pivot from shareholder returns to heavy infrastructure investment.
  • Chamath Palihapitiya predicts hyperscalers will become highly leveraged, debt-heavy industrial businesses within five years as they lock in long-term power contracts at rates more than 2x the prevailing spot price.

Science (1)

  • David Friedberg cites phase three trial data for Eli Lilly's Retatrutide, a triple agonist peptide, showing an average weight loss of 37 pounds in 40 weeks, an 80% reduction in liver fat, and significant drops in A1C, non-HDL cholesterol, and triglycerides.

Regulation (1)

  • Friedberg attended the Supreme Court hearing for Monsanto (Bayer) vs. Hardeman, a case testing federal preemption of pesticide labels under FIFRA against state failure-to-warn laws, with Bayer having paid out $10 billion and reserving another $10 billion for 90,000 outstanding cases.

Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud · May 1

  • Elad Gil notes that Baseten has grown 30x in the last year and expects to exceed $1 billion in revenue, reflecting the rapid expansion of the AI inference market.
  • Tuhin Srivastava attributes the growth to mainstream adoption of open-source models, which have crossed a capability chasm, and widespread use of post-training techniques for specialized models.
  • Tuhin Srivastava believes an independent application layer will exist because companies leverage unique user signals encoded in workflows, making it difficult for frontier model companies to replicate.
  • Abridge, an ambient scribe used by physicians, exemplifies an application layer company with deep integration into hospital workflows and access to unique user signal for post-training models.
  • Approximately 99% of the eventual AI market, measured by inference count, is enterprise adoption that has yet to come online, indicating significant future growth potential.
  • Tuhin Srivastava states that over 95% of tokens served on Baseten are from custom models where customers modify open-source models with their own data or for performance.
  • Baseten acquired a research team specializing in post-training to accelerate market support and integrate post-training expertise with inference, recognizing their interconnectedness.
  • Tuhin Srivastava believes Chinese open-source models are fantastic with no real evidence of embedded agendas, but emphasizes the U.S. needs to develop its own competitive models.
  • Running models like DeepSeek can be 20% of the cost of proprietary alternatives, offering comparable latency and reliability, making access to such intelligence crucial for national innovation.
  • The AI compute market faces a severe supply crunch with very little slack compute, forcing Baseten to run large clusters at mid-90s utilization across 18 clouds globally.
  • Tuhin Srivastava explains that securing 1,024 B200 GPUs today demands a three-to-five-year contract with 20-30% of the total contract value prepaid, highlighting the capital required to acquire capacity.
  • Baseten maintains high customer stickiness, achieving 400% annual Net Dollar Retention (NDR) due to its comprehensive software layer, which differentiates it from non-sticky 'GPUs as a service'.
  • Tuhin Srivastava acknowledges NVIDIA's strong supply chain, CUDA ecosystem, and developer support make them difficult to surpass in the short term, despite the desirability of a multi-chip world.
  • Jevons Paradox applies to AI inference: decreasing the cost of intelligence leads developers to embed more intelligence into applications, driving greater consumption and better user experiences.
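To make the capital requirement in the contract terms above concrete, here is a minimal sketch of the prepayment arithmetic. The cluster size, contract lengths, and 20-30% prepay range come from the episode summary; the hourly GPU rate is a purely hypothetical placeholder (no price is given in the source), so the dollar outputs are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365

# Hypothetical USD per GPU-hour. This is an assumption for illustration,
# NOT a figure from the episode.
ASSUMED_USD_PER_GPU_HOUR = 3.00


def contract_prepay(num_gpus, years, usd_per_gpu_hour, prepay_fraction):
    """Return (total contract value, upfront payment) in USD for a reserved cluster."""
    total = num_gpus * years * HOURS_PER_YEAR * usd_per_gpu_hour
    return total, total * prepay_fraction


if __name__ == "__main__":
    # 1,024 GPUs, 3-5 year terms, 20-30% prepay: the ranges quoted above.
    for years in (3, 5):
        for fraction in (0.20, 0.30):
            total, upfront = contract_prepay(
                1024, years, ASSUMED_USD_PER_GPU_HOUR, fraction
            )
            print(
                f"{years}-year term, {fraction:.0%} prepay: "
                f"contract ${total / 1e6:.1f}M, upfront ${upfront / 1e6:.1f}M"
            )
```

Under the assumed rate, even the shortest term with the smallest prepay means an eight-figure payment before any capacity is used. Changing `ASSUMED_USD_PER_GPU_HOUR` scales every result linearly.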
Also from this episode: (1)

AI & Tech (1)

  • Tuhin Srivastava envisions a future where AI provides personalized concierge services for everyone, making everything smarter and leading to the creation of even more software.

5/1/26: New Iran Strikes Imminent?, Platner Beats Mills, AI UnderClass, JPMorgan MeToo · May 1

  • Emily Jashinsky cites a New York Times piece arguing AI companies' core business model relies on disruption, creating a painful transition that will disempower millions into an underclass.
  • Ryan Grim points out the absurdity of the AI doom loop: if AI puts everyone out of work, no one has income to buy the products AI companies sell.
  • An AI agent allegedly deleted a company's entire database and backups in nine seconds, showcasing the risks of implementing AI without proper safeguards.
Also from this episode: (16)

Diplomacy (4)

  • A Barak Ravid report indicates Iran delivered a new response on a draft peace deal, signaling diplomacy is not entirely frozen despite Trump considering new military action.
  • Iranian Foreign Ministry spokesman Esmaeil Baghaei said Pakistan has shown good capability in mediation and will remain the mediator, indicating a continued openness to talks.
  • Iran's stated strategic goal is to reach a point where 'the danger of war does not exist,' a direct response to Trump's threats to annihilate their civilization.
  • Iran currently holds under one thousand pounds of sixty percent enriched uranium, compared to the twenty-five thousand pounds removed to Russia in the 2015 deal.

War (5)

  • Trump claims Iran's drone factories are eighty-two percent destroyed and missile factories almost ninety percent destroyed, framing the conflict as a successful military operation.
  • The U.S. shipped sixty-five hundred tons of munitions and equipment to Israel in twenty-four hours, indicating preparations for potential new strikes on Iran.
  • Republican Senator Ron Johnson reportedly referred to the Iran conflict as a 'two week bombing run,' reflecting initial administration expectations of a quick victory.
  • Ryan Grim argues the U.S. is in a weaker position for renewed conflict, with oil prices over $100 a barrel and key regional bases destroyed, unlike at the war's start.
  • Krystal Ball notes polling shows the Iran war is already as unpopular as the Vietnam War was at its worst, but it took six years for Vietnam to reach that level.

Energy (1)

  • Every U.S. state has higher gas prices compared to a week ago, with Indiana up eighty-four cents, Michigan seventy-two cents, and Ohio sixty cents.

Elections (5)

  • Graham Platner defeated sitting Governor Janet Mills to become the presumptive Democratic Senate nominee in Maine, a victory for a first-time candidate against an established figure.
  • Krystal Ball argues Platner's strength against Susan Collins and his focus on Israel and oligarchs reflects where the normie Democratic base is, not just the activist left.
  • Ryan Grim notes Schumer's camp claimed they couldn't spend heavily against Platner because it would be politically toxic, given his majority support among Maine Democrats.
  • Zohran Mamdani said the DNC establishment never reached out to him despite his polling forty points ahead, highlighting a disconnect between party leadership and insurgent candidates.
  • DNC Chair Ken Martin refused to release the party's 2024 election autopsy, claiming focus should be on future lessons, not 'navel gazing' or placing blame.

Politics (1)

  • Ryan Grim speculates the DNC may hide the report to obscure how a billion dollars was spent by specific consultants and firms during the short 2024 campaign.

Google Invests $40B Into Anthropic, GPT 5.5 Drops, and Google Cloud Dominates | EP #252 · Apr 30

  • Google is investing $40 billion in Anthropic, providing five gigawatts of TPU compute over five years, signifying a major commitment to the AI frontier.
  • Amazon is investing $33 billion in Anthropic, committing to $25 billion in new funds on top of a previous $8 billion, with Anthropic pledging to spend over $100 billion on AWS services.
  • The AI industry faces a significant bottleneck at TSMC, limiting the availability of chips, which Elon Musk highlights as the fundamental constraint to AI development.
  • Google Cloud unveiled its eighth generation of TPUs (8T for training, 8I for inference), offering three times faster training performance and 80% better performance per dollar.
  • OpenAI released GPT 5.5, seven weeks after GPT 5.4, showcasing a 37-point increase in long-context reasoning and a 60% reduction in hallucinations compared to 5.4.
  • Alex reports that the FrontierMath Tier 4 benchmark, a proxy for professional-level math, shows approximately 1% monthly gains from frontier AIs, suggesting all such problems could be solved in four to five years.
  • Moonshot AI launched Kimi K2.6, a trillion-parameter open-source model that costs 30 times less than closed models, trained for $4.6 million, demonstrating significant cost-efficiency.
  • Alex argues that Anthropic's projects, including 'Project Deal' (running a marketplace), are driven by a strategy to maximize the economic value per token generated by their models.
  • The trial between Elon Musk and OpenAI in Oakland Federal Court has begun with jury selection, unveiling details through discovery that are likely to be publicly aired.
  • OpenAI's Chronicle builds memories from periodic screenshots of user activity, raising privacy concerns despite its potential to offer 'telepathy-like' assistance.
  • Salim states that 44% of Gen Z workers are sabotaging AI automation efforts by providing incorrect data, highlighting a significant workplace resistance to AI.
  • World ID verification is being integrated into Zoom to combat deepfake fraud, which caused $130 million in losses between 2019 and 2023, with losses projected to reach $40 billion by 2027.
  • Dr. David Lutske's viral post demonstrates Grok creating a realistic AI Frenchwoman with a reflective ID, suggesting video ID verification may soon be unreliable.
  • Startup CEOs are engaging in 'token maxing,' spending heavily on AI compute, which Dave views as a necessary step for learning and getting into the AI race, despite it appearing as a vanity metric.
  • Sheikh Mohammed announced that the UAE will launch an agentic AI government model within two years, with 50% of all government service operations run by AI.
Also from this episode: (6)

AI & Tech (4)

  • Dave notes that current AI models can manage dozens or hundreds of other models successfully, enabling consumers to ask a coordinator model to install software or build things without needing technical expertise.
  • OpenAI released ChatGPT for clinicians, a free AI co-pilot that outperforms human doctors on health benchmarks, scoring 59 compared to 43.7 for human clinicians.
  • Project Top Heart by NYU and Stanford uses AI to analyze 20 variables, aiming to increase the number of viable donor hearts for transplant by an additional 500 per year.
  • The FDA-approved blood pressure medication Candesartan has been repurposed using AI to stop and inhibit MRSA infections, which affect 2.8 million people and kill 35,000 annually in the U.S.

Science (2)

  • mRNA vaccines for pancreatic cancer show lasting results, with 87.5% of patients who generated a strong immune response still alive after six years, compared to a historical 13% survival rate.
  • A single-shot CAR-T infusion achieved 100% cancer-free status in all 20 melanoma patients within two months of treatment, with no recurrence after a median follow-up of 15.3 months.

Reiner Pope – The math behind how LLMs are trained and served · Apr 29

  • Reiner Pope explains that batch size is the key variable driving the trade-off between inference latency and cost. Batching amortizes the fixed cost of fetching model weights across many user requests.
  • Without batching, serving a large model is uneconomical. Pope states the cost can be a thousand times worse than when batching just two users together.
  • A roofline model for inference time combines compute time and memory fetch time. Compute time scales linearly with batch size, while memory time includes a constant for weights and a term linear in batch size for the KV cache.
  • There is a hard lower bound on inference latency, set by the time needed to read all of the model's parameters from memory into the chips. This bound is independent of batch size.
  • Pope solves for the batch size where compute and memory times are balanced. The formula is batch size >= (Flops / Memory Bandwidth) * (Total Params / Active Params), where the hardware ratio Flops/Bandwidth is ~300.
  • This balance point implies the optimal batch size is approximately 300 times the model's sparsity ratio, taken as total over active parameters. For DeepSeek, with 32 of 256 experts active, the ratio is 8, which yields a batch size of about 2,400 tokens, within a 2000-3000 range.
  • In a scheduled system, a new inference 'train' departs every 20 milliseconds. Worst-case latency for a user is 40ms if they just miss a departure and must wait for the next train to complete.
  • The 20ms schedule is derived from the time to read the entire HBM capacity. For a Rubin-generation system with 288GB HBM and 20 TB/s bandwidth, this is about 15ms.
  • Pope argues increasing sparsity is a pure win for inference cost, as it reduces the active parameters and thus compute time. However, it demands larger batch sizes to amortize weight fetches and consumes more memory capacity.
  • Empirical research on mixture-of-experts shows model quality can increase with sparsity. An older paper found a 64-expert model with 270M active parameters matched the quality of a dense 1.3B parameter model.
  • Mixture-of-experts layers use expert parallelism, where different experts are placed on different GPUs. This creates an all-to-all communication pattern that is optimal within a single rack's high-bandwidth scale-up network.
  • Leaving the rack uses a scale-out network about eight times slower than the internal NVLink. This makes crossing rack boundaries for expert parallelism a severe bottleneck.
  • Pope states the primary constraint on increasing rack size is physical: cable density, bend radius, weight, and cooling, not a fundamental technical barrier.
  • Pipeline parallelism, which places different model layers on different racks, is viable for inference because the communication pattern is point-to-point rather than all-to-all, making scale-out latency manageable.
  • Pipelining reduces the memory capacity needed per rack for model weights but does not reduce the memory needed for the KV cache, which becomes the dominant memory consumer.
  • Pope argues the value of large scale-up domains like Google's or NVIDIA's Rubin is not primarily memory capacity, but memory bandwidth, which directly lowers inference latency and enables longer context lengths.
  • He presents a heuristic cost model for model development: total cost = pre-training cost + RL cost + inference cost. He conjectures labs roughly equalize these three costs.
  • Applying this model, Pope estimates frontier models are overtrained by a factor of about 100 relative to the compute-optimal Chinchilla scaling law, due to the need to amortize training compute over vast inference usage.
  • Pope reverse-engineers API pricing to deduce system bottlenecks. Gemini charging more for contexts over 200K tokens suggests a memory-to-compute crossover point near that length.
  • Output tokens being ~5x more expensive than input tokens indicates decode is memory-bandwidth bound, while pre-fill is compute-bound, as pre-fill amortizes memory costs over many tokens.
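The roofline reasoning in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Pope's own code. The hardware ratio of 300, the 32-of-256 expert ratio, the 288GB / 20 TB/s figures, and the 20ms interval come from the summary. The 2 FLOPs per active parameter per token, the 2-byte weights, and the 100B-total / 12.5B-active model are assumptions made for the example. The sparsity ratio is taken as total over active parameters, the orientation that reproduces the 2000-3000 token figure.

```python
def balanced_batch(flops_per_byte, active_fraction):
    """Batch size where compute time equals weight-fetch time (KV cache ignored)."""
    return flops_per_byte / active_fraction


def step_time_s(batch, active_params, total_params, kv_bytes_per_seq,
                flops_per_s, bytes_per_s, bytes_per_param=2):
    """Roofline estimate of one decode step: the slower of compute and memory."""
    # Assumption: ~2 FLOPs (multiply + add) per active parameter per token.
    compute = 2 * batch * active_params / flops_per_s
    # Weights are fetched once per step; KV cache traffic grows with batch size.
    memory = (total_params * bytes_per_param + batch * kv_bytes_per_seq) / bytes_per_s
    return max(compute, memory)


def hbm_sweep_ms(hbm_bytes, bytes_per_s):
    """Time to read the full HBM capacity once, in milliseconds."""
    return 1e3 * hbm_bytes / bytes_per_s


def worst_case_latency_ms(interval_ms):
    """A request that just misses a departure waits one interval, then rides one."""
    return 2 * interval_ms


if __name__ == "__main__":
    bandwidth = 20e12          # 20 TB/s, from the summary
    flops = 300 * bandwidth    # implied by the ~300 hardware ratio
    batch = balanced_batch(300, 32 / 256)
    print("balanced batch:", batch)
    print("HBM sweep (ms):", hbm_sweep_ms(288e9, bandwidth))
    print("worst-case latency (ms):", worst_case_latency_ms(20))
    # Hypothetical model: 100B total, 12.5B active parameters, KV cache ignored.
    print("step time (s):", step_time_s(batch, 12.5e9, 100e9, 0, flops, bandwidth))
```

With these inputs the balance point is 2,400 tokens and the HBM sweep is 14.4 ms, in line with the figures quoted above. Passing a nonzero `kv_bytes_per_seq` shows how long contexts push the memory term up and make the step memory-bound at a given batch size.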