AI & TECH

AI agents replace junior QA roles at startups

Friday, July 3, 2026 · from 13 podcasts, 20 episodes

AI coding agents now handle 80% of routine QA tasks at early-stage startups, slashing hiring needs.
Tools like Codex and Claude Code cut bug detection time from days to minutes.
Human engineers shift to curating test logic and managing agent workflows.

AI agents are quietly replacing junior QA engineers at startups. At firms using Codex, Claude Code, or Cursor, automated agents now run regression tests, validate edge cases, and flag anomalies - tasks once reserved for entry-level developers. According to Andrew Ambrosino on Lenny's Podcast, OpenAI’s internal teams use Codex weekly for code reviews and bug detection, with non-engineers in legal and finance now running their own QA workflows.

The shift is structural. Nathaniel Whittemore on The AI Daily Brief notes that stalled frontier releases such as GPT-5.6 have forced companies to master the tools they already have. Startups now treat AI agents as full team members - Anthropic reports that 65% of its own code now originates from Slack conversations via Claude tag. These agents don’t just assist; they initiate background work, reducing human oversight.

"We’re not waiting for the state-of-the-art; we’re building the good enough that we actually own."
- Will Brown, Prime Intellect

Startups like Base44 fine-tune models for specific QA tasks. CEO Maor Shlomo argues general models waste compute on irrelevant reasoning. His team’s Base 1 model runs continuous integration checks 60% faster than GPT-4o, catching memory leaks and race conditions in real time. This isn’t augmentation - it’s replacement.

The bottleneck has shifted from execution to curation. Ambrosino notes that at OpenAI, 90 uncoordinated teams might build 90 versions of the same feature because implementation is trivial. The hard part now is deciding which version works. Taste, not typing, defines value.

"Models lag at design because grading good design is more tedious than grading functional code."
- Andrew Ambrosino, Lenny's Podcast

Human engineers aren’t gone - they’re promoted. They now design test strategies, refine agent prompts, and manage feedback loops. The junior QA role, once a training ground, is vanishing. As Dwarkesh Patel notes, AI learns from deployment, not just training. Every test run makes the agent smarter, closing the loop without human input.

Source Intelligence

- Deep dive into what was said in the episodes

The AI Daily Brief: Artificial Intelligence News and Analysis

Nathaniel Whittemore

Fable is Back: Here's What You Should Try First • Jul 1

AWS announced a $1 billion investment to create a new unit of forward-deployed engineers, aimed at helping customers set up and use AI tools, expanding its Generative AI Center established in 2023.
The Information reports Anthropic plans to integrate Claude Tag, an organization-centric AI agent with persistent memory and tool access, into Microsoft Teams, building on its existing Slack integration.
SpaceX is offering half-price Starlink subscriptions and free hardware in Memphis, in addition to recommitting to a wastewater treatment plant, to mitigate local opposition to its Colossus data centers.
Anthropic clarified that other models, including Claude Opus 4.8 and GPT 5.5, could identify and exploit the same code vulnerabilities as Fable 5, indicating the reported 'jailbreak' did not expose unique Mythos-level cyber capabilities.
Anthropic implemented a new classifier for Fable 5, achieving a claimed 99% success rate in blocking the specific behavior from the Amazon report, though it may increase false positives for benign coding tasks.
Dean Ball noted the opacity surrounding Fable 5's return, questioning what changes Anthropic made and what commitments were agreed upon, arguing it creates an unstable environment for the AI industry.
Anthropic launched Claude Sonnet 5, their 'most agentic' Sonnet model, which can plan and use tools autonomously at a level previously requiring larger, more expensive models, performing near Opus 4.8 benchmarks for a lower introductory price.
Ben Davis suggests Sonnet 5 requires a distinct usage approach, describing it as an 'automatic Ralph loop' that spawns sub-agents and performs self-review, implying it is not meant for the same direct prompting as older models.
Any Panuani recommends using Fable 5 for high-level planning and project improvements, delegating concrete implementation tasks to other models like GPT 5.5, and using GPT Pro for reviewing Fable's output.

Also from this episode: (6)

AI Infrastructure (1)

Nathaniel Whittemore reports OpenAI found a method to halve inference costs for existing models used by ChatGPT users who aren't signed in, serving that segment on only 100 GPUs.

Open Source (1)

DeepSeek open-sourced DSpark, a speculative decoding system that achieved an 85% inference speed increase during testing on small models, highlighting ongoing optimization efforts beyond OpenAI.

Models (4)

Anthropic announced Fable 5's return for all global paid subscribers, starting July 1st, after the Department of Commerce lifted export controls that had kept the model offline.
Nathaniel Whittemore points out that external benchmarks, such as Cursor Bench and Max Effort, indicate Sonnet 5 can be more expensive than Opus 4.8 or even Fable per task due to generating significantly more output tokens.
Nathaniel Whittemore found Fable 5 significantly better than GPT 5.5 and Opus 4.8 for strategic thinking, noting its unique ability to accept partial pushback while maintaining its core arguments, making it more valuable for iterative discussions.
Nathaniel Whittemore's real-world use of Fable 5 for writing revealed it was superior in instruction following and avoided common AI writing clichés, particularly for tasks with clear rubrics.

#Regulation #Enterprise #AI Infrastructure #Agents

The Capability Overhang Playbook • Jun 28

Nathaniel Whittemore defines the 'capability overhang' as the gap between the latent power of existing models and the real value most individuals and organizations extract from them.
Whittemore asserts a forced AI pause is underway due to stalled frontier model releases: GPT-5.6, Claude Sonnet 5, and Gemini 3.5 Pro have been delayed, while Fable 5 remains blocked.
Leo from SynthWave reported GPT-5.6's new target release is mid-July and DeepMind delayed Gemini 3.5 Pro due to dissatisfaction with its current state.
AI Battle data shows the current wait for GPT-5.6 is 61 days, exceeding previous update gaps of 29, 56, and 49 days within the GPT-5 era.
Prediction market odds for a GPT-5.6 release this week collapsed from nearly 90% to below 30% on Tuesday, indicating a sharp change in expectations.
Policy advisor Dean Ball argues the entire US AI industry is frozen from new public releases until the government resolves the Fable situation.
Whittemore's Capability Overhang Playbook first advises individuals to create a personal learning agenda by honestly assessing their weaknesses in AI tools and workflows.
He recommends building a personal benchmark or eval portfolio: reusable task sets with prompts and success criteria to quickly gauge new model performance.
A study from Glean's Work AI Institute found that knowledge workers spend about 2.4 hours a week organizing context for AI agents, a drain on productivity.
To reduce context overhead, Whittemore suggests building portable context assets, either broad-based personal portfolios or per-project context packs.
He cites two resources for this: his own project ContextPortfolio.ai and Jim Sanguine's 'The Librarian,' an agentic OS curator.
Whittemore advises users to experiment deeply with current AI harnesses by building the same project in both Claude Code/Cowork and Codex to compare interfaces and tool interactions.
He recommends exploring specific plugins within tools like Claude Code to discover new capabilities relevant to your role, as experimentation often falls off daily to-do lists.
For holdouts, Whittemore urges building a full end-to-end agent architecture, using resources like the free AgentOS program and employing a 'two window' method with a build window and a tutor chat.
Whittemore argues individuals should explore model independence using routers like OpenRouter and open models from Hugging Face, and question their own priorities around cost, privacy, and control.
For organizations, he suggests reviewing learning resources and incentive structures for AI adoption, ensuring they reward effective use and sharing of reusable systems.
Whittemore warns organizations about an 'overly strong known ROI bias' from token efficiency, which could prioritize efficiency AI over opportunity AI for new products and capabilities.
He proposes organizations develop a measurement philosophy linking AI usage to both individual and business outcomes, differentiating between adoption, usage, and outcome metrics.
An advanced pattern involves shifting from actively managing AI prompts to architecting loops where AI iterates towards a set goal, utilizing the '/goal' feature as a new primitive.
Whittemore recommends turning context portfolios into MCP servers to increase portability and efficiency, gaining familiarity with a key part of the agentic ecosystem.
He advises packaging recurring capabilities as reusable 'skills' to make agent work transportable across projects, referencing a past show with Nufar Gaspar on agent skills.
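The personal benchmark idea above (reusable task sets with prompts and success criteria) could be sketched roughly as follows. This is a minimal illustration, not Whittemore's method: the `EvalTask` structure, the two sample tasks, and `stub_model` are all hypothetical, and the stub stands in for whatever real model call you would plug in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalTask:
    """One reusable task: a prompt plus a success criterion."""
    name: str
    prompt: str
    passes: Callable[[str], bool]  # True when the output meets the criterion


def run_portfolio(tasks: List[EvalTask], model: Callable[[str], str]) -> Dict[str, bool]:
    """Run every task against a model callable and record pass/fail per task."""
    return {task.name: task.passes(model(task.prompt)) for task in tasks}


def pass_rate(results: Dict[str, bool]) -> float:
    """Fraction of tasks passed; 0.0 for an empty portfolio."""
    return sum(results.values()) / len(results) if results else 0.0


# Hypothetical tasks; replace with prompts drawn from your own work.
PORTFOLIO = [
    EvalTask(
        name="summarize_in_one_sentence",
        prompt="Summarize in one sentence: The build failed because a test timed out.",
        passes=lambda out: out.count(".") <= 1 and len(out.split()) <= 30,
    ),
    EvalTask(
        name="return_json_only",
        prompt='Reply with only the JSON object {"status": "ok"}.',
        passes=lambda out: out.strip() == '{"status": "ok"}',
    ),
]


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption), so the sketch runs offline."""
    if "JSON" in prompt:
        return '{"status": "ok"}'
    return "The build failed due to a test timeout."


if __name__ == "__main__":
    results = run_portfolio(PORTFOLIO, stub_model)
    print(results, pass_rate(results))
```

Keeping the success criterion as a plain function next to each prompt means the same portfolio can be rerun unchanged whenever a new model ships, which is the point of the exercise.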

#Agents #Enterprise #Models #Regulation #AI Infrastructure

The Ad Hoc AI Licensing Regime • Jun 27

Nathaniel Whittemore reports that Senator Mark Warner conveyed an NSA finding that Mythos demonstrated significant capabilities during a red teaming exercise, which some initially misinterpreted as the AI breaking into classified systems.
Nathaniel Whittemore highlights a new ad hoc, informal, and unaccountable licensing regime forming as the US government delays GPT-5.6, requesting a limited partner preview with government-approved customer access.
Zvi Mowshowitz argues the new AI policy empowers the White House to arbitrarily control access to frontier intelligence, which Nathaniel Whittemore characterizes as a maximally terrible approach.
Andrew Curran states that model delays only slow public releases, not training speed, which widens the gap between public and lab-internal AI capabilities, contradicting claims of a safety pause.
Smaller organizations and startups are increasingly experimenting with z.ai's GLM 5.2 model, while Google's Gemma 4 has accumulated 200 million downloads, indicating demand for lower-cost, alternative AI architectures.
Claude tag, a native Slack integration, enables users to tag a full instance of Claude Code to initiate background work, dramatically lowering the technical barrier for team members to leverage AI.
Anthropic reports 65% of their code now originates from Slack conversations due to Claude tag, reflecting a significant behavioral shift towards integrating AI directly into contextual workflows.
Sam Altman confirmed GPT-5.6's new models, Soul and Terra, are launching in limited preview today, not open access, at the US government's request, despite it not being OpenAI's preferred long-term model.
OpenAI's Rune argues the unofficial AI licensing regime is an inevitable and positive development, indicating government understanding of AI's gravity, and short delays are not detrimental in the long run.
Rune expresses concern that non-Americans might be permanently excluded from frontier AI access, advocating for maintaining the “Pax Technologica of the free world” to prevent such an outcome.

Also from this episode: (3)

Enterprise (2)

Will Brown from Prime Intellect notes a recent shift, with large enterprises increasingly securing compute and post-training their own in-house models, often based on GLM 5.2, as open-source strategies gain traction.
KPMG's Global AI Pulse Survey for Q2 found that AI initiatives led by a CEO were three times more likely to yield a positive return on investment compared to efforts with less CEO involvement.

AI Infrastructure (1)

Following recording, the US lifted its block on Mythos for approximately 100 selected partners, including major US companies and government agencies, generating a “nightmarish vibe shift,” according to Andrew Curran.

#Models #Safety #Big Tech #Open Source

Moonshots with Peter Diamandis

Sonnet 5 Drops, Fable 5 Will Return & Fusion's First Plant Gets Licensed w/ Philip Johnston | #268 • Jul 1

The Orion plant is intended to supply Microsoft with 50 megawatts of power starting in 2028, making it potentially the first commercial fusion plant to come online.
There are 140 humanoid robot companies developing hardware in China, and Andreessen Horowitz reported $16 billion in hardware investments in Q1 2026 for the U.S. robotics sector.
Unitree's R1 humanoid robot sells for $4,900, making advanced robotics hardware accessible and driving an explosion of software applications on top of these platforms, states Peter Diamandis.
Dave Blundin believes that AI demand will shift energy from an environmental issue to a national capacity issue, driving the need for abundant energy sources.
A $1.8 million Vesuvius Challenge prize was won by using AI and CT scans to read carbonized ancient Greek scrolls buried by the 79 AD Mount Vesuvius eruption.
Elon Musk announced that xAI will release a new Grok model every month for the rest of the year, involving fresh pre-training runs, and is dedicating top talent from SpaceX and Tesla to xAI.
Anthropic released Sonnet 5 as a stopgap while their flagship model remains restricted, which Dave Blundin describes as a 'mediocre capability at a high price point' but still in demand due to AI chip supply constraints.
StarCloud, founded in January 2022, has raised over $200 million and launched StarCloud 1 in November 2023 with the first Nvidia H100 GPU in orbit, training an LLM in space.
Philip Johnston of StarCloud plans three launches next year, including StarCloud 2 in January with 100 times StarCloud 1's power generation, and the 200-kilowatt, 3-ton StarCloud 3 on Starship.
Philip Johnston believes 10 terawatts of compute could fit in the dawn-dusk sun-synchronous orbit, and anticipates that within 10 years, most new compute capacity will be deployed in space.
Elon Musk stated that phones using Starlink's acquired spectrum, capable of high-bandwidth video, will start shipping in approximately two years, disrupting the traditional telecom industry.

Also from this episode: (8)

Models (1)

Anthropic's flagship AI model, which Peter Diamandis refers to as Fable 5, was offline for 15 days after the U.S. government temporarily pulled it due to national security concerns.

Energy (2)

Helion, a fusion energy company backed by Sam Altman, received Washington State regulatory approval on June 16th for its Orion Fusion power plant.
Switzerland voted to reverse its 2017 ban on nuclear power, upgrading its four aging reactors that supply 40% of its power, and exploring new plant designs, according to Peter Diamandis.

Macro (1)

Experts consistently underestimate exponential growth in areas like solar, electric vehicles, and battery sales by projecting linear or sublinear growth, according to Salim Ismail and Peter Diamandis.

Robotics (4)

Morgan Stanley predicts 500,000 robots by 2030, while Elon Musk projects tens of millions to 50 million robots by 2030, and billions into the early 2030s.
Alex suggests that as the unit cost of general-purpose robots approaches zero, robots will assemble other robots, driving physical labor costs down and transforming the service economy.
U.S. law enforcement is deploying drones as first responders, with Orlando Police Department using them for 911 calls, and Sacramento police using a magnet-equipped drone to disarm a suspect.
The global drone market is approximately $100 billion, with U.S. manufacturers like Skydio replacing Chinese DJI drones due to import controls and security concerns.

#Robotics #Energy #Autonomous Vehicles #AI Infrastructure

The $10B Satellite Empire Putting AI in Orbit, Why Chips Beat Rockets & China's #1 Open Model | EP #266 • Jun 26

Will Marshall coined the term 'Large Earth Models,' which combines planetary sensing data with large language models to enable AI to understand the physical world, moving beyond theoretical text-based knowledge.
Will Marshall states Planet's revenue is approximately 60% from defense and intelligence, 25% from civil government, and 15% from commercial clients. AI is lowering barriers to entry, making space data accessible to more entities.
Will Marshall argues that while launch costs are important, the long-term competitive advantage in orbital compute will depend more on compute efficiency (flops per watt) than on raw launch capacity.
The 'AI brain drain' saw key researchers Noam Shazeer (Transformer paper lead) move from Google to OpenAI and John Jumper (Nobel laureate, AlphaFold co-creator) from Google DeepMind to Anthropic.
Alex argues that Google DeepMind has fallen behind OpenAI and Anthropic in frontier AI models, influencing top researchers to seek raw access to cutting-edge models in smaller, more agile organizations.

Also from this episode: (7)

Space (2)

Planet is a public company operating the world's largest Earth-observing satellite fleet, with its stock increasing 450% over the last year. The company generates 25 terabytes of imagery daily from its 200 satellites.
Planet's satellite fleets offer varied resolutions: a scanning fleet (Owl) upgrading from 3-meter to 1-meter, a high-resolution system aiming for 30-centimeter daily imagery, and a Tanager hyperspectral imager with 400 spectral bands.

AI Infrastructure (3)

Planet has indexed the Earth for searchability over the last decade, accumulating a 150-petabyte archive of 3,000 images for every landmass point. This historical data is crucial for comparing current conditions to past norms.
Planet is placing Nvidia chips on its satellites for edge processing and satellite-to-satellite communication, significantly reducing data analysis time from hours to seconds for time-critical applications like disaster response.
Project Suncatcher involves putting TPUs for Google into orbit, an early step towards orbital AI compute. A Google study suggests compute in orbit will be cheaper than terrestrial when launch costs reach $200-$300 per kilogram.

Climate (1)

Will Marshall maintains Earth is by far the best planet, emphasizing the need to protect its unique biosphere. Space technology's primary role should be to help manage Earth intelligently, not solely for off-world colonization.

Robotics (1)

Will Marshall believes that AI needs to interact with the physical world through sensors and actuators to achieve true intelligence and consciousness, moving beyond current 'brains in a vat' large language models.

#AI Infrastructure #Models #Chips #Big Tech #Reasoning

Simon Dixon Hard Talk

Dismantling the US Empire: Capital Flows, Multipolarity, and Pakistan’s Role in West Asia • Jun 30

Also from this episode: (31)

Other (31)

The host argues that Donald Trump's actions, including dismantling the US empire, are driven by financial lobbies like the Military-Industrial Complex (MIC), Financial-Industrial Complex (FIC), and Technical-Industrial Complex (TIC), rather than personal ideology.
Simon Dixon analyzes global events through capital flows and financial statements, viewing media narratives as influenced by their financing sources, including governmental lobbies, foreign interests, and sponsorships.
Dixon identifies a global transition towards a technocratic state led by two power factions: the Technical-Industrial Complex (large tech companies like Magnificent 7/11, originating from Pentagon/CIA/DARPA funding) and the Military-Industrial Complex (defense contractors like Lockheed Martin).
The Financial-Industrial Complex (FIC) funds both the MIC and TIC, profiting from war and rebuild cycles, often involving resource extraction, regime change, and the installation of dictators.
America's global dominance shifted after leaving the gold standard in 1971, leveraging debt and currency warfare through institutions like the IMF to make countries like Pakistan debt-dependent and manipulate local currencies.
Saudi Arabia integrated into the petrodollar system in 1971 by pricing oil in dollars and reinvesting dollar proceeds into US government bonds and companies, solidifying its link to the Federal Reserve and the FIC.
US energy independence due to fracking and partnerships with Canada and Mexico reduced its need for Middle Eastern oil, diminishing the petrodollar's original purpose and shifting Gulf countries' focus to China for wealth generation.
Dixon argues America functions as a 'securitized collateralized debt obligation,' not a nationalistic country, hosting global FIC, TIC, and MIC interests, with politicians serving private power rather than the populace.
Transnational capital, embodying the FIC, operates globally, superseding national interests and collaborating with sovereign wealth funds, with China's being the largest, to control political processes and leverage military assets.
The world is transitioning to multipolarity with regional blocs, marking an unwinding of the American nation-state and asset stripping, as West Asia realigns eastward with new regional pacts emerging.
Pakistan's role is evolving from dollar debt dependency; after running out of dollars at COVID, it restructured debt with Gulf countries, joined the Belt and Road Initiative, and leverages its military and nuclear power in a new multipolar order.
The host suggests the British Empire, too, was controlled by transnational capital, extracting wealth from Europe, transferring it to the US, and enabling the 1947 post-WWII world order through the UN, IMF, and World Bank.
Historically, Pakistan served as a US satellite, experiencing military regimes and conflict aligned with US geopolitical interests in the region, such as countering the Soviets in Afghanistan and supporting the War on Terror.
Simon Dixon traces transnational capital's origins to the Dutch Empire, which created the central bank, Dutch East India Company, and limited liability company, effectively a Ponzi scheme socializing losses and privatizing gains.
The transfer of power from the Dutch to the British and then to the American Empire involved leveraging debt, financing conflicts like World War I, funding the Bolsheviks, and manufacturing economic crises to consolidate assets like gold.
After World War II, the IMF and World Bank were created to issue dollar loans, making countries debt-dependent and forcing them off the gold standard onto a dollar standard, ultimately defaulted on by the Nixon Shock in 1971.
The Safari Club, a CIA operation, orchestrated King Faisal's assassination and groomed Osama bin Laden, facilitating regime change in Saudi Arabia and justifying wars, while America used a 'divide and conquer' strategy in regions like Kashmir.
Dixon asserts Israel was a node for creating regional conflict and deception, weaponizing religious narratives and historical grievances to serve military interests, with Pakistan becoming highly subordinated to this structure.
China's nationalized banking system (PBOC) resisted Western financial penetration, building strength under a communist model, and now challenges the Bank for International Settlements (BIS), leading to a changing world order.
The US FIC seeks to extract wealth from America while managing capital outflows into other countries, leveraging ETFs (50% of investments managed by BlackRock, State Street, Vanguard) and BlackRock's Aladdin AI technology to coordinate global capital flows.
A 'global technocratic surveillance state' is forming through the TIC and China, with Western tech and finance executives meeting Xi Jinping to negotiate continued access to cheap labor, components, and integration with China's financial systems.
China possesses significant leverage over the US by potentially rug-pulling the AI bubble, crashing the bond market by selling US debt (from $1.4 trillion to $650 billion), and creating a commodity squeeze in the derivatives market.
Transnational Capital dictates regional outcomes, like allowing Russia to control Ukraine or China to have Taiwan, while renting out the US military for resource acquisition and maintaining a narrative of US strength during its retreat.
The Middle East is seeing an end to the petrodollar, with UAE leaving OPEC and establishing FX swap lines, while the destruction of Europe through the Russia-Ukraine war turns it into a vassal state of the FIC.
A new regional order in West Asia involves GCC countries negotiating with Iran, settling internal issues, and potentially partnering with Pakistan, Turkey, and Egypt, backed by China, to rebuild infrastructure and expel US influence.
Israel, once a US military node, is being strategically weakened and asset-stripped by Gulf countries, India, and UAE, with a prediction of a Palestinian state and GCC-funded rebuilding in Lebanon and Gaza within five years.
The US is retreating to its Western Hemisphere (Monroe Doctrine) to exert influence and ensure energy security during its AI and robotics transition, using conflicts like the Iran war to justify military base closures and facilitate regional power shifts.
Regional conflicts are expected to decline as state-sponsored proxy wars end, with militia groups reverting to state power or integrating into national structures, driven by settlements between FIC and China.
The tactics of colonization and manufactured civil unrest are 'coming home' to the West, leading to domestic terrorism, increased police surveillance, and a privatized prison sector, as the state views its internal populations as the enemy.
The current US stock market is propped up by Ponzi schemes, debt rollovers, and fiscal dominance, where a centralized decision by the Fed, BIS, and PBOC determines whether a crash occurs or wealth concentration continues.
Simon Dixon differentiates Bitcoin from other cryptocurrencies and stablecoins, asserting that self-custodied Bitcoin, enforced by code and a decentralized network of miners and nodes, remains sovereign money immune to state control.

#Fed #Banking #Macro #China #Markets

The a16z Show

Building AI for Creators | Luma & Phota Labs • Jun 30

Matt Tancik highlights a divergence between research and product development, where researchers optimize for technology gaps (e.g., generating dense text) while users prioritize simple solutions to specific problems (e.g., removing backgrounds or fixing lighting).
Zach Scha emphasizes the challenge of balancing technological advancement with user needs, stating that products must both solve current user problems and slightly precede user expectations to reimagine creative workflows effectively.
Both guests note that artists rarely have a fixed end goal; instead, iteration is a crucial part of the creative process, requiring AI tools to support flexible and adaptive workflows rather than one-shot generation.
Users have surprised Phota Labs by repurposing its identity preservation APIs to generate personalized AI avatars or video frames, finding creative uses for identity consistency in new media that go beyond the original design intent.
Matt Tancik posits that future AI creative interfaces will involve agents that adapt to varying levels of user control, abstracting away tool complexity for some while offering fine-grained manipulation for others, similar to diverse film director styles.
Zach Scha argues that models must become more proactive by asking users for clarifying inputs, mirroring the interactive dialogue between human creative professionals and clients to overcome ambiguity and better achieve desired outcomes.
Matt Tancik highlights that owning a model allows for specific customization to a product's user base, offering specialized adjustments that may not be feasible with general open-source models and providing a competitive advantage.

Also from this episode: (5)

Models (4)

Matt Tancik and Zach Scha agree that AI technology fundamentally serves as a tool for artists, enabling them to execute their creative vision by automating manual effort and simplifying complex workflows, rather than replacing human creativity.
Matt Tancik, co-creator of NeRF, views generative AI as an advanced extension of earlier 3D creation tools, providing expanded capabilities across images, video, and 3D to help artists construct their stories more efficiently.
Zach Scha observes that while creativity in traditional photography centered on capturing decisive moments, generative AI has significantly enhanced post-capture creative possibilities, allowing photographers more artistic freedom after the initial shot.
Zach Scha's Phota Labs focuses on personalization technology, empowering users to own and combine their personalized models with any foundation model, aiming to separate personalization from core foundational AI capabilities.

Society (1)

Matt Tancik and Zach Scha conclude that AI will elevate the baseline quality of art creation for average users but will also widen the gap with top artists, who can uniquely leverage powerful tools to manifest complex, holistic visions.

#Agents #Models #Startups #Enterprise

Beyond P(doom): Marc Andreessen - Betting on America • Jun 29

Marc Andreessen argues AI can serve as a 'world's best' tutor, doctor, lawyer, or accountant in your pocket, but current policy prevents it from performing these licensed roles.
Research suggests AI boosts productivity for both top performers and median performers, raising the average skill level across fields like law, screenwriting, and programming.
Alpha School demonstrates a private AI-driven education model where AI handles two hours of academic instruction and teachers focus on six hours of project-based work, but Andreessen believes the public system will resist this change.
He frames the U.S.-China AI race as a choice between two contradictory goals: exporting AI for global supremacy or restricting it for safety, with Europe having 'suicidally' regulated itself out of contention.
Advanced AI models like Mythos present a dual-use dilemma: they are superior tools for both cyber attack and defense, creating a policy tension between restriction and rapid deployment for security.
He notes China's strategic promotion of open-source AI acts as a 'turbo dumping' strategy to flood the market and undermine American commercial viability, creating an ironic dynamic where the 'totalitarian' regime pushes openness.
Given deep civil-military fusion in China, Andreessen acknowledges the risk of dual-use but argues controls are futile because AI models are just files on a hard drive and U.S. companies lack the counterintelligence to prevent leakage.
Andreessen points to a reindustrialization push in defense and energy, with startups in new nuclear, rare earth processing, and U.S.-built transformers, supported by current administration policies and potentially creating a second industrial 'Silicon Valley' around Los Angeles.
He states successful companies in this space organize around larger national goals like American manufacturing, which attracts talent and co-locates R&D, rather than making a direct financial trade-off against outsourcing.

Also from this episode: (5)

Business (2)

Andreessen describes a bifurcated economy: 'blue' sectors (tech, software, TVs) see rapid innovation and price deflation, while 'red' sectors (healthcare, education, housing, law, government) have zero or negative productivity growth and skyrocketing prices.
He argues heavy regulation in red sectors restricts supply and subsidizes demand, causing prices to spiral and allowing those sectors to 'eat the entire economy,' suppressing overall growth despite rapid technological change.

AI Infrastructure (2)

Physical bottlenecks at every layer - energy, data center facilities, turbines, transformers, cooling systems, NVIDIA GPUs, and memory chips - constrain AI development and may halt price declines for intelligence.
Andreessen contends 99% of constraints on AI infrastructure are domestic, like county-level opposition to data centers and false memes about water usage, not external tariffs.

Models (1)

Andreessen advocates for maximum export of American AI, aiming for a world where even China runs on it, and using advanced models to armor systems against cyber attacks, including ransomware.

#AI Infrastructure #Macro #Regulation #China #Open Source

AI Is Crossing the Frontier of Human Knowledge | Kevin Weil • Jun 26

Kevin Weil argues AI's biggest impact won't be productivity but accelerating scientific discovery, aiming to bring the science of 2050 to 2030.
OpenAI models have solved at least 10 open mathematics problems in January 2024, using GPT-5.2 and Gemini, proving AI can now operate beyond the frontier of human knowledge.
Weil describes AI capability progression as a rapid arc from 'models could never do this' to 'they can barely do it' to 'models are great at this' within six to twelve months.
The OpenAI for Science team leverages all OpenAI research and scientific data to train models for frontier problems in math, physics, and theoretical computer science.
Kevin Weil believes high agency, curiosity, and fast learning are the most valuable skills in the current AI moment because you can now create anything you think of.
Weil sees the industrial revolution analogy applying to AI, where mass-produced capabilities displace crafts but unlock new creativity and scale, like bespoke human-made websites.
Kevin Weil recommends startups use ensembles of models, orchestrating cheaper specialized models with a larger planning model, rather than relying on a single giant prompt.
Weil cites OpenClaw built on Codex as a sign of the future: an agent framework assembled in three days pointing to an emergent world of AIs working together.
Weil believes enterprise adoption leads because AI does economically valuable work and has immediate usage costs, creating low-hanging fruit for B2B companies reaching $100M quickly.
OpenAI's apps platform aims to enable startups built entirely on it, without traditional websites or apps, as models get better at using diverse tools within one interface.
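
The ensemble idea Weil recommends, a larger planning model routing steps to cheaper specialized models, can be sketched roughly as follows. This is a minimal illustration only: `planner`, `extract_model`, `summarize_model`, and the `SPECIALISTS` registry are hypothetical stand-ins invented here, not anything described in the episode, and a real system would replace them with actual model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    specialist: str  # key into the specialist registry
    prompt: str      # instruction handed to that specialist


def planner(task: str) -> List[Step]:
    """Hypothetical stand-in for the large planning model: splits a task into routed steps."""
    return [
        Step("extract", f"List the key facts in: {task}"),
        Step("summarize", f"Summarize: {task}"),
    ]


def extract_model(prompt: str) -> str:
    """Hypothetical cheap specialist; echoes its input so the routing is visible."""
    return f"[extract] {prompt}"


def summarize_model(prompt: str) -> str:
    """Hypothetical cheap specialist; echoes its input so the routing is visible."""
    return f"[summarize] {prompt}"


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "extract": extract_model,
    "summarize": summarize_model,
}


def orchestrate(
    task: str,
    plan: Callable[[str], List[Step]] = planner,
    specialists: Dict[str, Callable[[str], str]] = SPECIALISTS,
) -> List[str]:
    """Ask the planner for steps, then dispatch each step to its specialist."""
    results = []
    for step in plan(task):
        model = specialists.get(step.specialist)
        if model is None:
            raise KeyError(f"no specialist registered for {step.specialist!r}")
        results.append(model(step.prompt))
    return results


outputs = orchestrate("quarterly report")
```

The point of the structure is that each step carries its own short prompt, so no single model has to absorb one giant prompt covering the whole task.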

Also from this episode: (4)

Robotics (1)

Weil envisions future science driven by AI models that think for days or months, orchestrating robotic labs in reinforcement learning loops that scale horizontally and run 24/7.

Social Media (1)

At Twitter, Weil championed ranking the feed over real-time chronological order despite controversy, citing metrics showing double-digit positive engagement growth.

Models (1)

For the UX of OpenAI's o1 reasoning model, Weil modeled the interaction on human behavior: giving periodic updates during deep thought rather than immediate babbling or total silence.

Business (1)

Weil advises interpreting product data deeply, not just chasing 'number go up', to avoid novelty effects or user confusion, and treating conflicting user anecdotes as signals of bimodal needs.

#Reasoning #Robotics #Agents #Models #Enterprise

Behind the Bastards

Part One: Elizabeth Dilling: The Original Candace Owens • Jun 30

Also from this episode: (13)

History (6)

Robert presents Elizabeth Dilling, born Elizabeth Eloise Kirkpatrick, as an early female American far-right influencer, comparable to modern figures like Candace Owens or Laura Loomer.
Elizabeth Dilling was born April 19, 1894, in Chicago, a city marked by significant labor history like the Pullman Strike and the 1893 economic panic that fueled the American left. Her affluent family opposed these progressive movements.
Robert notes Elizabeth Dilling's father, Dr. Leete Kirkpatrick, was a famous surgeon from a wealthy Scots-Irish landowning family in Ireland, likely absentee landlords who fled during the late 1800s Irish Land Wars.
Between 1923 and 1939, the Dillings took ten multi-week or multi-month overseas trips, traveling through Europe, including a private audience with the Pope in Vatican City, often amidst global financial hardship.
Robert explains Harry Jung's "Vigilant Intelligence Federation" compiled over a million names of radicals and popularized "The Protocols of the Learned Elders of Zion" through a movie, a foundational anti-Semitic work later deemed "Unamerican" by the Department of Justice. Elizabeth admired and utilized Jung's research.
Elizabeth published her influential 1934 book, "The Red Network: A Who's Who and Handbook of Radicalism for Patriots," which served as a "phone book of leftist radicals" and, Robert notes, essentially a "kill list" later referenced in the McCarthy hearings.

Society (4)

Christine Erickson's 2002 article for the Journal of American Studies described Dilling's family as "financially comfortable, upper middle class"; her parents traveled Europe often, and despite her father's early death, the family remained wealthy.
Elizabeth grew up with strong conservative values but also observed her mother independently managing family affairs after her father's death, contrasting her father's belief that women belonged in the home.
Despite her traditional values, Elizabeth was described as "somewhat lonely, and casting about in search of a career," never interested in solely being a homemaker, a paradox Robert also sees in figures like Phyllis Schlafly or Erika Kirk.
Elizabeth married Albert Walwick Dilling, a lawyer and engineer born in Salt Lake City in 1892, in Laporte, Indiana, in May 1918, when she was 24 years old.

Philosophy (1)

In 1917, Elizabeth met an army officer who introduced her to philosophy, including Kant, Nietzsche, and Hegel, but simultaneously claimed "women did not count as human beings," an interaction Robert deems formative yet bleak for her ideological development.

Religion (1)

Robert suggests Elizabeth, feeling guilt from her Catholic upbringing over purely hedonistic travel, convinced Albert to book trips to communist countries like Soviet Russia in 1931 to study communism's impact.

Culture (1)

Elizabeth wrote about her Soviet Union trip, expressing horror at "atheism, sex, degeneracy, broken homes, and class hatred," though Robert notes her guided tour prevented her from seeing actual atrocities. Her primary distress was observing churches converted into museums and nude bathers.

#History #Society #Media #Corruption

The Tucker Carlson Show

Tucker Carlson

BREAKING: Merchant of Death Warns Russia Is Preparing for a Devastating Attack on Western Europe • Jun 29

Victor Bout claims Ukraine has increased drone attacks on Russian refineries and civilian targets using long-range drones, some equipped with AI vision. He alleges a terror attack killed 21 students in Lugansk and cites attacks on buses and individuals.
Victor Bout suggests the Ukraine conflict could be swiftly ended if the United States removed its support, specifically by disabling Starlink communication for the Ukrainian army and ceasing satellite intelligence sharing.
Victor Bout believes drone technology has drastically altered warfare, leading to tactical changes every two weeks and a technological race for cheaper, more efficient, and jam-resistant drones. He notes units are now 3D printing and assembling their own drones.

Also from this episode: (18)

War (9)

Victor Bout claims the conflict is between Russia and a Western coalition including NATO and the US, not solely Ukraine. He notes Western arms supplies have escalated from anti-tank missiles to F-16s, with production now largely moved to Europe.
Victor Bout states that Ukrainian military production has relocated to Europe, identifying factories in Poland, Germany, France, Italy, Denmark, Turkey, Spain, Belgium, and the Netherlands. He notes British involvement in the 'Flamingo' cruise missile program.
Victor Bout argues that Europe's role in arming Ukraine makes it a direct participant in the conflict, justifying potential Russian strikes on logistics centers in Poland, Romania, and Germany under international law.
Victor Bout suggests the ongoing conflict in Iran is causing significant economic damage, leading to fertilizer shortages impacting global agriculture and contributing to forecasts of food scarcity and hunger, particularly in Africa.
Victor Bout alleges the Ukrainian 'Nazi regime' commits atrocities, including threatening civilians, depriving them of water, and shooting them. He compares their actions to the German army's occupation during WWII, asserting they are 'more ferocious.'
Victor Bout criticizes the Ukrainian government for celebrating Bandera movement leaders, whom he holds responsible for the ethnic cleansing of over 100,000 Poles, Jews, and Russians. He notes German hypocrisy in forbidding Nazi symbols while ignoring their display in Ukraine.
Victor Bout claims Ukraine is experiencing severe demographic problems due to high casualties, estimating Ukrainian losses at 3:1 or 5:1 compared to Russia. He asserts the government forces men to the front lines and fails to retrieve fallen soldiers' bodies to avoid paying compensation.
Victor Bout alleges Ukraine has repeatedly attempted to attack nuclear power plants, including Zaporizhzhia and a planned hijacking of the Kursk nuclear power station in Russia, which he says occurred in 2012.
Victor Bout claims elements of the Ukrainian military have sold Western-supplied weapons, including to Hamas, Hezbollah, and Mexican cartels. He expresses surprise that Israel, despite knowing this, has remained silent and even trained Ukrainian forces.

Politics (6)

Victor Bout claims European leaders, facing unpopularity, use the war to cling to power by declaring emergencies and postponing elections. He attributes this to a 'globalist' agenda, alleging figures like George Soros have cultivated these leaders.
Victor Bout views NATO expansion, particularly Finland and Sweden's recent membership, as a national security concern for Russia. He mentions Russia's creation of a new military district along its 1,500-mile border with Finland.
Victor Bout warns that Russia's military doctrine permits a nuclear response if its existence is threatened, asserting Russia possesses a modern and powerful nuclear arsenal capable of obliterating countries like England or France.
Victor Bout characterizes the Ukraine conflict as a 'religious conflict,' alleging the Ukrainian government's suppression of the Russian Orthodox Church and establishment of a new Ukrainian Orthodox Church. He claims monks were evicted from the Kiev-Pechersk Lavra monastery, and relics pillaged.
Victor Bout questions Ukrainian President Zelensky's legitimacy, noting his term and that of the parliament have expired without new elections. He cites Western polls suggesting over 80% of Ukrainians desire a peace deal with Russia.
Victor Bout believes American and Russian people share common traits due to geography, including generosity and a strong work ethic. He argues that if both nations freed themselves from 'globalist' control and cooperated, they could advance science and space exploration.

Business (1)

Victor Bout asserts that Europe committed 'economic suicide' by severing ties with cheap Russian energy, forcing reliance on expensive LNG. He claims German industry, including automotive manufacturing, is in decline and militarizing, reminiscent of pre-WWII Germany.

Corruption (1)

Victor Bout describes Ukraine as a 'black hole for corruption' and a 'huge laundry machine,' alleging it has influenced US politics since the early 2000s, including recycling one-third of US aid back into US political slush funds, particularly for the Democratic Party.

Diplomacy (1)

Victor Bout shares his perspective on his prisoner exchange for Brittney Griner, feeling it was an insult and asymmetrical, as he had expected Paul Whelan to be exchanged. He attributes the Biden administration's decision to pressure from NBA fans and the LGBT community.

#War #Europe #Diplomacy #Social Media

Trump’s Social Media Advisor Reveals All: Epstein, Iran, and Mark Levin’s Israeli Propaganda • Jun 26

Also from this episode: (5)

War (1)

Alex Bruesewitz claims a coordinated effort exists to pressure the US administration to continue the conflict in Iran, potentially involving foreign influence and millions of dollars flowing to right-wing influencer marketing companies.

Media (1)

Bruesewitz argues that misinformation was spread by both pro-war voices and some anti-war critics, citing false claims about the US giving $300 billion to Iran related to the Memorandum of Understanding (MOU).

Diplomacy (1)

Public polling data indicates that the Trump-Vance agreement, a settlement framework with Iran, is widely popular among the American people, with 67% support.

Regulation (1)

Alex Bruesewitz highlights working with Congresswoman Anna Paulina Luna to strengthen disclosure laws regarding foreign influence and money poured into the internet ecosystem, which he believes harms democracy.

Social Media (1)

Bruesewitz details his work on Marjorie Taylor Greene's social media presence, helping her grow from 2,000 Twitter followers to over 800,000 within a nine-month period in 2020-2021 through viral content and leveraging censorship.

#Diplomacy #Censorship #Middle East #Social Media

Bankless

Ethlabs: The New Org to Make Ethereum Win | Ansgar & Caspar • Jun 29

Ansgar Dietrichs argues the Ethereum Foundation's "subtraction" philosophy, meant to decentralize, has stalled critical L1 upgrades for years.
Ethlabs is a "doing" organization, directly hiring engineers to ship code and accelerate the roadmap, unlike the EF's grant-based approach.
Caspar Schwarz-Schilling warns that increasing Ethereum L1 hardware requirements could centralize node operation, undermining a core value proposition.
Ethlabs prioritizes "The Verge," using statelessness and Verkle trees to enable full node verification on consumer hardware, including mobile devices.
Schwarz-Schilling emphasizes that L1 evolution, through technologies like PeerDAS, is critical to efficiently support powerful L2s and avoid bottlenecks.
EIP-7702, proposed by Vitalik Buterin, allows traditional wallets to temporarily act as smart contracts during transactions.

Also from this episode: (1)

Protocol (1)

Ethlabs will push EIP-7702 to production, enabling gas sponsorship and batching multiple actions, improving retail user experience.

#Protocol #Adoption #BTC Markets

The Daily

Why Everyone Cares About This World Cup • Jun 29

Also from this episode: (14)

Sports (9)

This World Cup broke records for highest attendance and most goals scored in tournament history.
Fans from visiting nations encountered parts of America they'd never seen. Tariq Panja notes these interactions forged a mutual excitement between fans and local communities.
The tournament faced geopolitical complications, including visa difficulties and ICE enforcement concerns. A FIFA referee from Somalia was denied entry and sent back to Turkey.
Team base camps in smaller towns like Chattanooga, Greensboro, and Lawrence created viral moments. The University of Kansas marching band learned Algeria's national anthem for their arrival.
Fan groups like the Norwegian Viking Road and Scotland's Tartan Army created distinctive celebrations. Scottish fans consciously cultivated a fun, non-hooligan culture to contrast England's reputation.
The World Cup showcased the United States as an immigrant nation and a patchwork of people during its 250th anniversary. Farouk Orfred, a Jordanian-American, described his split love for America and Jordan.
Iran's participation created the first instance of a World Cup team being in military conflict with a host nation. This split the diaspora on whether to support the team, which some view as a propaganda tool.
Kevin's soccer obsession began with a VHS tape of World Cup goals. He and his father Farhad bonded over the sport, using it to connect with their Iranian heritage.
Iran's 1998 World Cup win over the USA was politicized, becoming government evidence of national prowess. Kevin and Farhad attended the 2022 USA-Iran match in Qatar after a pilgrimage to Mecca.

Immigration (1)

Farhad moved to the U.S. from Iran in 1979 after a seemingly miraculous visa approval at a crowded embassy. He recounted replaying that moment every day of his life in America.

War (1)

Kevin described the current situation as feeling like standing over a ledge, with unease and danger due to the war. Farhad hoped this was the lowest point on the curve before things improve.

Politics (3)

At Iran's match in Los Angeles, fans booed the national anthem but cheered the players. Kevin understood both reactions, noting the anthem represented the state while the players represented the people.
Iran tied all their World Cup games and was eliminated from the tournament. Farhad felt politics and real-world repercussions made success impossible.
The Iranian team left a note in their dressing room calling for peace, respect, and friendship among nations. Farhad said the message articulated what he felt.

#Society #Immigration #War #History

Robby Hoffman Will Always Feel Poor, No Matter How Rich She Gets • Jun 27

Also from this episode: (13)

Comedy (4)

Robby Hoffman, a comedian known for roles in *Hacks* and *Rooster*, grew up poor in a Hasidic community as the seventh of ten children. Her family supported her when she was outed as a teenager, and her difficult upbringing heavily influences her unfiltered comedy style.
Hoffman's comedy features no off-limits topics, including AIDS, pedophilia, and late-term abortions, with her stressing that jokes require context. She welcomes "clap back" from audiences, believing anyone can joke about anything at their own risk.
Hoffman has received backlash not for her most controversial jokes but from the "pit bull community" and the "celiac community," which she jokingly attributes to "rich white women." She argues that being offended is not the "worst thing," prioritizing issues like poverty and anti-Mexican sentiment over anti-Semitism.
Hoffman places class at the core of her comedy and worldview, aiming to highlight it front and center. She believes America often obscures its underlying traumas and poverty, and she wants to redirect focus to classism as a unifying conversation.

Society (5)

Hoffman finds increased wealth and fame "tremendous" but notes it shifts her perception of societal "weirdness" from the poor to the rich. She argues that rich people exhibit a lack of generosity, like having large fridges but not offering food, in contrast to poor households that share readily.
Hoffman's early memories of Crown Heights include prevalent robberies, physical abuse by her father towards her mother, and sleeping in overcrowded, hot conditions without air conditioning. Her parents were very young, 35 and 30, when raising their ten children.
Hoffman asserts that "comfort" is a concept primarily for the rich, contrasting the thin-walled homes where poor families openly discussed finances with the siloed rooms of the wealthy. Poor households, despite having less, fostered more relaxed and hospitable social environments.
At 17, Hoffman was publicly outed by a girlfriend in a conservative environment, leading to the loss of most of her friends overnight. She was already living independently, working, and studying, viewing another major life change as an overwhelming burden.
Hoffman married Gabby Windey, known from *The Bachelor* and *The Traitors*, and credits their shared background of humble beginnings for their strong relationship. They are healing each other from past traumas through their commitment and trust.

Psychology (1)

Her experience with poverty ingrained a lasting aversion to frivolous spending, even with money available. Hoffman cites examples like refusing to buy $7.99 raspberries and comparing it to her great uncle's reaction when gas prices hit $1.

Religion (1)

Her family's transition away from an insular religious community began when her grandfather helped her mother escape her abusive father, moving them to Canada. Hoffman's mother spearheaded this shift, taking on male religious duties and advocating for her sons to learn English beyond Yiddish and Torah.

Education (2)

While attending a private Jewish school on scholarship, Hoffman felt socio-economically out of place and tried to fit in by adopting a feminine "Jappy" persona and altering her speech. Coming out and pursuing stand-up comedy later empowered her to embrace her true self.
After studying accounting at McGill, Hoffman initially balanced a professional career with stand-up, using her given name Rivka at work and Robby for comedy to conceal her stand-up from employers. She believes comedy "chooses you" and feels most free and at home performing on large stages.

#Society #Philosophy #Media #History

Lenny's Podcast

OpenAI Codex lead on the new shape of product work | Andrew Ambrosino • Jun 28

Andrew Ambrosino states that 90% of OpenAI's entire company uses Codex weekly, not just engineers, indicating broad internal adoption beyond technical roles.
Ambrosino argues implementation is no longer the expensive part of product work; curation, taste, and steering the right ideas are now the core challenge.
He sees the product process inverted: teams now prototype first because implementation is cheap, leading to uncoordinated parallel exploration of the same idea.
Ambrosino contends AI models currently lag at design because human taste is part of the feedback loop and grading good design is more tedious than grading functional code.
He believes design's dependence on cultural context and novelty makes it harder for models to master than software engineering, which prefers known patterns.
Ambrosino describes role collapse at OpenAI, where roles are defined by the average of tasks performed, not strict boundaries between design, engineering, and product.
He warns against eliminating specialized roles like product manager, arguing disciplines have knowable best practices that shouldn't be abandoned.
Ambrosino says Codex usage has grown 6x since January and now has over 5 million weekly active users, with numbers quickly becoming outdated.
He recounts that the Codex app released in February would have failed if launched in November, as outcomes depended entirely on model improvements in that window.
Ambrosino describes a product strategy of building features that don't work yet, then waiting for model capability leaps to make them viable.
He states autonomous software development isn't ready because models increase code complexity and struggle with deleting code, reframing requests, and building proper abstractions.
Ambrosino uses Codex to automate his work, setting up tasks that scan Slack channels, summarize updates, and manage product releases, coaching the app along the way.
He shares that OpenAI's videographer used Codex to edit videos, prompting the app to build its own extension for Premiere Pro to complete tasks it couldn't handle directly.
Ambrosino explains Codex's vision as a home base that orchestrates work across other apps via connectors or computer use, not a super app that replaces all specialty tools.

#Agents #Startups #Enterprise

The Peter McCormack Show

Peter McCormack

#188 - Rizwan Virk - Simulation Theory, Quantum Physics & The Impact Of AI • Jun 26

Dan Golla says his Code of Reality nonprofit is running scientific tests, like an Apple Vision Pro study scanning brains and mapping reported symbols, to determine if people see the same thing.
Peter McCormack explains the simulation hypothesis via technological advancement, citing games like Minecraft and Fortnite, and the integration of AI like Claude with Unreal Engine.
Dan Golla says we are sub-agents of a larger system; passing the 'alignment problem' grants access to a higher-level game, but all layers of the simulation are equally real to conscious observers.

Also from this episode: (8)

Psychology (3)

Dan Golla argues DMT provides more than hallucinations; the experience feels more real than waking life, reveals coherent worlds, and appears consistent across individuals.
Dan Golla recounts a 2016 DMT experience where a being manifested in his room and communicated through guitar chords, which he did not know, convincing him of an external intelligence.
Dan Golla says the body produces DMT naturally, citing Dr. John Dean's 2019 paper that found substantial amounts of a DMT precursor in rodents' intracranial fluid, comparable to dopamine and serotonin.

Science (2)

Dan Golla describes a diffracted laser experiment on DMT that reveals a static field of alien code characters arranged in buckyballs, reported by thousands of people through his Veil Break platform.
Anthony Ness's theory suggests DMT reduces brain activity to the V1 network, letting users glimpse its crystalline structure; Code of Reality plans tests with magnetic intracranial stimulation.

Philosophy (2)

Dan Golla rejects the competition model for human survival; he argues only a collaborative, long-term 'love game' can enable a species to plan for millennia and traverse interstellar distances.
Dan Golla defines God as a real, divine force distinct from a merely advanced civilization, stating he is a secular person who knows God exists.

Culture (1)

Dan Golla points to Japan and Norway as societies with a higher 'awareness of the other', making them models for a more sustainable, orderly collective game.

#Agents #Models #Philosophy #Science

Breaking Points with Krystal and Saagar

6/26/26: Trump Dirt REVEALED: Iran, Zohran, Pardons, Deportations, Ro Calls Out Elon Lawsuit • Jun 26

Also from this episode: (16)

Politics (13)

Maggie Haberman and Jonathan Swan describe Trump's second term as operating on gut instinct, with decisions concentrated among a half dozen aides while excluding most cabinet and agency heads from key information.
Trump's main priority is cementing a personal legacy on Washington and global spheres of influence, not domestic policy, exemplified by his abrupt cancellation of a bipartisan housing bill despite Republican support.
Trump's information bubble is tighter than in his first term; he no longer scrolls Twitter organically and is surrounded by loyalists and flatterers at Mar-a-Lago, reducing external friction.
Haberman and Swan found Trump's health a 'lock box'; aides lack full information despite a statement citing 22 specialists, and they note declining hearing, unusual sleep patterns, and unexplained makeup on his hands.
Jonathan Swan details unchecked executive power under Trump: Congress ceded authority on the Venezuela invasion and the drug boat bombings, but the administration has obeyed Supreme Court rulings and faced recent Senate resistance.
Stephen Miller is arguably the most powerful domestic policy staffer in recent memory, with a broad remit including immigration and drug bombing campaigns, and maintains a close relationship with Elon Musk.
Trump's rapport with Zohran Mamdani stems from Mamdani's political performance and Trump's aversion to direct interpersonal conflict, complicating his attacks on 'crazy communists' like the DSA.
Ro Khanna says New York's DSA victories were driven by candidates acknowledging the Gaza genocide and advocating wealth taxes, Medicare for All, and universal childcare, signaling a moral and economic shift in the Democratic base.
Ro Khanna claims nearly half the Democratic caucus now opposes funding for Israel, a sea change from prior deference to AIPAC, evidenced by internal resistance to the foreign ops appropriations bill.
Khanna endorses against Democratic incumbents based on policy alignment, giving leadership advance notice, and says transparency and strength earn respect even when making colleagues upset.
Ro Khanna accused Elon Musk of sentencing 4.5 million children to death by dismantling USAID, prompting Musk's threats of lawsuits and arrest; Khanna says subpoena power and independent media made his criticism effective.
Khanna says Musk cut 83% of USAID programs without reform proposals, forcing Congress to reauthorize some; Musk's February 2025 tweet about 'feeding USAID into the wood chipper' contradicts his later claims of deliberative review.
Ro Khanna argues tech billionaires are opportunistic, aligning with any administration, but a Democratic wealth tax would lose 30% of them; he criticizes Gavin Newsom for opposing a 5% California billionaire tax to protect donors.

Elections (2)

Jonathan Swan says Trump cares less about midterm elections than his aides, citing presidential immunity, promised pardons, and soaring family wealth as reasons he feels insulated from consequences.
Khanna's tech oligarch-backed primary opponent spent over $1 million and received 6% of the vote, while Khanna won 62% without spending, illustrating the unpopularity of pro-billionaire politics in his Silicon Valley district.

Culture (1)

Trump seeks to leave an imprint through cultural projects like beautifying the Kennedy Center and influencing museum exhibitions, with Vance's aide criticizing Amy Sherald's 'Transforming Liberty' painting at a Smithsonian board meeting.

#Elections #Corruption #Diplomacy #War

Dwarkesh Podcast

The next big breakthrough will be AIs learning on the job • Jun 26

AI labs aim for Artificial General Intelligence (AGI) by training models on millions of verifiable tasks across thousands of diverse Reinforcement Learning (RL) environments, creating problem-solving agents for open-ended tasks.
Progress in AI for computer use is slower than other domains like coding because it lacks 'grindability,' meaning the ability to run many parallel rollouts against deterministic, replayable simulators.
Dario suggested that short-horizon RL training may not necessarily generalize to long-horizon real-world performance, indicating limits to RLVR's ability to create general agents.
Dwarkesh states that 30-50% of AI lab compute is allocated to inference, which currently does not improve models, representing a significant waste as valuable learning opportunities occur during deployment.
Current online learning models, like the Cursor tab model predicting user-accepted edits, require learning the same objective across hundreds of millions of daily requests to be effective.
On-Policy Self-Distillation (OPSD) is a technique that distills session learnings into model weights by encouraging the base model to match a 'veteran teacher model's' predictions, without requiring verifiable rewards.
OPSD provides a denser supervision signal than naive RL by training on per-token probability discrepancies, and is superior to supervised fine-tuning by consolidating relevant insights rather than recalling every observed token.
A speculative concept called 'dreaming' proposes AIs build and train against self-generated simulations of reality, allowing them to experience orders of magnitude more samples, akin to Efficient Zero's internal game simulations.
This 'dreaming' could become a fourth axis of AI scaling, alongside pre-training, RL, and inference compute, enabling models to rehearse production skills in simulated 'video game' environments for specific users.
By 2027-2028, AIs could learn primarily from broad deployment and user interactions, accumulating experience from diverse tasks and becoming smarter with every engagement, rather than solely from pre-release training.
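
The "denser supervision signal" attributed to OPSD above can be illustrated with a small sketch: instead of one scalar reward per rollout, every token position contributes its own loss term measuring how far the student's next-token distribution sits from the teacher's. The episode summary only mentions "per-token probability discrepancies", so the use of KL divergence, its direction (student relative to teacher), and the toy two-token vocabulary below are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def per_token_losses(
    student: List[Sequence[float]],
    teacher: List[Sequence[float]],
) -> List[float]:
    """One loss value per token position (assumed form: KL(student || teacher))."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher must cover the same token positions")
    return [kl_divergence(s, t) for s, t in zip(student, teacher)]


# Toy example: two token positions over a two-word vocabulary (made-up numbers).
teacher_probs = [[0.5, 0.5], [0.9, 0.1]]
student_probs = [[0.5, 0.5], [0.5, 0.5]]

losses = per_token_losses(student_probs, teacher_probs)
mean_loss = sum(losses) / len(losses)
```

Here the first position contributes no loss because the two distributions agree, while the second does, which shows how the signal localizes to the tokens where the student diverges.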

Also from this episode: (3)

Models (3)

Dwarkesh notes AI models are significantly less sample-efficient than humans during training, with some estimates placing them 1 to 1 million times less efficient.
Many complex real-world skills, such as building a business or winning court cases, cannot be simulated within data centers because their verification requires interacting with the real world over extended periods.
Continual learning, critical for AIs to absorb real-world experience, requires distilling knowledge into model weights rather than expanding unscalable in-context memory, mirroring how human brains learn through compression.

#Models #Reasoning #AI Infrastructure #Big Tech

No Priors: Artificial Intelligence | Technology | Startups

Really Big Test-Time Compute in AI Changes Benchmarks, Safety and Research with OpenAI Research Scientist Noam Brown • Jun 26

Brown uses poker bot creation as a personal evaluation. GPT-4.2 required steering but improved his code 10x; GPT-4.5 can now build a full solver with minimal guidance.
Multi-agent systems are only scratching the surface of their capability, Brown believes. Frontier models are needed to unlock their potential, analogous to human civilization's accumulated knowledge scaffolding.
Benchmarks are easy to game by scaffolding multiple model runs or adding judges, Brown warns. Doing so inflates scores without controlling for test-time compute, which makes comparisons misleading.
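
To make the inflation concrete, here is a minimal sketch under two simplifying assumptions that are not from the episode: each run succeeds independently with the same probability, and a perfect judge always picks a correct answer when one exists. The function name is hypothetical.

```python
def best_of_k_success(p_single, k):
    """Chance that at least one of k independent runs succeeds."""
    return 1.0 - (1.0 - p_single) ** k

single = best_of_k_success(0.3, 1)    # one run, no scaffolding
scaffold = best_of_k_success(0.3, 8)  # eight runs plus an ideal judge
```

Under these assumptions a 30% single-run success rate becomes roughly 94% with eight runs, at about eight times the inference cost, which a single-number score would not reveal.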

Also from this episode: (11)

Models (8)

Noam Brown argues that model evaluation benchmarks are broken because they ignore test-time compute. Performance now depends on inference budget, but published grids use single-number scores without a token or cost axis.
Brown says GPT-3 could not effectively scale test-time compute. With a $10 million budget, GPT-3 performed nearly the same as with $1, making cost irrelevant for its capabilities.
Modern models like GPT-4.5 can improve for weeks before performance plateaus on some benchmarks. Brown says you cannot reasonably test until plateau, so evaluations must adopt a fixed budget or performance curve.
Brown cites OpenAI's internal disproof of the Erdős unit distance conjecture as an example of latent capability. He claims GPT-4.5 could have solved it with $100,000 in compute via scaffolding, but nobody explored that budget.
Brown says current models are not at the point where an infinite inference budget yields superintelligence across all tasks. Factual retrieval plateaus quickly while Sudoku improves indefinitely, so tasks fall along a spectrum in how much they benefit from extra inference.
Brown says models lack research taste and cannot replace researchers yet. They accelerate some tasks 100x but bottleneck others, leading to gradual transformation rather than overnight replacement.
Brown says he trusts GPT-4.5 outputs for high-stakes decisions like tax advice and condo paperwork more than expert human advice, indicating a shift in practical reliability.
Brown argues the research community agrees benchmarks should include a cost axis but is stuck in a bad equilibrium. Everyone publishes the grid because everyone expects it, despite knowing it's wrong.
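
One way to picture the "fixed budget or performance curve" idea is to store each model's results as (cost, score) points and read off the score at a shared budget. The sketch below is only an illustration: the function name, the log-cost interpolation, and all numbers are hypothetical and not taken from the episode.

```python
import math

def score_at_budget(curve, budget):
    """Interpolate score at `budget` (linear in log-cost); None if unmeasured.

    `curve` is a list of (cost_usd, score) pairs with cost_usd > 0.
    """
    pts = sorted(curve)
    if not pts[0][0] <= budget <= pts[-1][0]:
        return None  # refuse to extrapolate beyond measured budgets
    for (c0, s0), (c1, s1) in zip(pts, pts[1:]):
        if c0 <= budget <= c1:
            if c0 == c1:
                return max(s0, s1)
            t = (math.log(budget) - math.log(c0)) / (math.log(c1) - math.log(c0))
            return s0 + t * (s1 - s0)
    return pts[0][1]  # single-point curve measured exactly at `budget`

# Hypothetical curves: model_a scales with budget, model_b barely moves.
model_a = [(1, 40.0), (100, 60.0)]
model_b = [(1, 48.0), (100, 50.0)]
at_ten = {name: score_at_budget(c, 10)
          for name, c in [("model_a", model_a), ("model_b", model_b)]}
```

With these made-up numbers model_b leads at a $1 budget while model_a leads at $10 and above, a ranking change that a grid of single scores would hide.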

Safety (3)

Safety frameworks like responsible scaling policies do not account for test-time compute, Brown states. A model's dangerous capability is now a function of its inference budget rather than a fixed property, which creates an unaddressed evaluation gap.
Large-scale test-time compute makes fast takeoff unlikely, Brown argues. Time becomes a bottleneck because models need long runtimes to unlock full capability, preventing an instantaneous intelligence explosion.
Brown states frontier labs understand the stakes and risks of their models. Competitive pressure exists, but researchers share a goal of steering toward positive outcomes.

#Reasoning #Safety #AI Infrastructure #Big Tech #Models