AI & TECH

OpenAI Codex lead warns talent bottleneck defines AI coding era

Monday, June 29, 2026 · from 5 podcasts, 6 episodes
  • Curation, not implementation, is the new bottleneck as AI can prototype any feature instantly.
  • AI eliminates job role silos, leaving firms reliant on builders with ‘taste’ to avoid ‘slop.’
  • Frontier model delays leave firms scrambling to master existing tools instead of waiting for upgrades.

The coding war is won. Implementation, the labor that once defined software, is now trivial. Andrew Ambrosino, OpenAI's Codex lead, describes an inversion of the product process: teams prototype first because building costs nothing. This abundance creates a coordination tax. At OpenAI, 90 uncoordinated teams might build 90 different versions of the same feature. The difficulty has shifted from ‘can we build it’ to ‘which of these 90 attempts is actually good.’

Ambrosino calls this the curation bottleneck. When token budgets are unlimited, the value of ‘taste’ skyrockets. It’s the only thing models can’t replicate because design taste requires a human feedback loop that training data lacks. A model knows if a function compiles, but it doesn’t intuitively understand if a UI feels ‘snappy.’ Humans remain the final arbiter of what feels right.

“When the cost of tokens is unlimited, the value of ‘taste’ - knowing how a system should feel and behave - skyrockets. It is the only thing the models can’t yet replicate.”

- Andrew Ambrosino, Lenny’s Podcast

Cheap implementation also collapses role boundaries and reshuffles job titles. Ambrosino describes the modern OpenAI employee as a ‘member of technical staff’ whose actual role is just the average of where they spend their time. If a designer spends 60% of their day writing code, they are effectively a developer for that period.

He warns that abandoning specialized roles like product manager is a mistake because disciplines have knowable best practices. Stripping the role away often leads to ‘slop’ - unfocused features built simply because they were easy to code. Success now depends on ‘command over the discipline’ paired with technical agency. The ability to steer an agent is more valuable than memorizing syntax.

Nathaniel Whittemore, on The AI Daily Brief, argues firms must use a forced pause in frontier model releases to master the latent power already on their desktops. GPT-5.6, Claude Sonnet 5, and Gemini 3.5 Pro have been delayed. This creates a ‘capability overhang’ - the gap between what current models can do and how poorly we utilize them.

Whittemore’s playbook advises building personal infrastructure: portable context assets and custom benchmark portfolios to turn AI from a toy into a predictable workflow component. He warns companies are stuck in ‘Efficiency AI,’ using tools to perform existing tasks faster, and missing ‘Opportunity AI’ - capabilities and products that were previously impossible.

“Instead of waiting for a new flagship, operators should use this time to extract the latent value already sitting on their desktops. The race to the next model is paused, but the race to use them effectively is wide open.”

- Nathaniel Whittemore, The AI Daily Brief

Designers are uniquely positioned for this new era. On the a16z Show, Paul Backus noted designers consistently get better results from Claude and Codex than engineers because they use a specialized lexicon. Terms like ‘negative space’ and ‘rhythm’ act as precision steering for LLMs. He built Impeccable to encode this design-first language directly into the agent’s harness.

John Maeda argues we are moving from UX to AX - Agentic Experience. Design is no longer just about the screen. It’s about designing for robots, command lines, and ‘dash-dash help’ functions. In an AX world, value shifts from visual polish to speed, clarity, and verbosity for agents.

The bottleneck is now human talent, not model capability.

Source Intelligence

- Deep dive into what was said in the episodes

OpenAI Codex lead on the new shape of product work | Andrew Ambrosino · Jun 28

  • Andrew Ambrosino states that 90% of OpenAI's entire company uses Codex weekly, not just engineers, indicating broad internal adoption beyond technical roles.
  • Ambrosino argues implementation is no longer the expensive part of product work; curation, taste, and steering the right ideas are now the core challenge.
  • He sees the product process inverted: teams now prototype first because implementation is cheap, leading to uncoordinated parallel exploration of the same idea.
  • Ambrosino contends AI models currently lag at design because human taste is part of the feedback loop and grading good design is more tedious than grading functional code.
  • He believes design's dependence on cultural context and novelty makes it harder for models to master than software engineering, which prefers known patterns.
  • Ambrosino describes role collapse at OpenAI, where roles are defined by the average of tasks performed, not strict boundaries between design, engineering, and product.
  • He warns against eliminating specialized roles like product manager, arguing disciplines have knowable best practices that shouldn't be abandoned.
  • Ambrosino says Codex usage has grown 6x since January and now has over 5 million weekly active users, with numbers quickly becoming outdated.
  • He recounts that the Codex app released in February would have failed if launched in November, as outcomes depended entirely on model improvements in that window.
  • Ambrosino describes a product strategy of building features that don't work yet, then waiting for model capability leaps to make them viable.
  • He states autonomous software development isn't ready because models increase code complexity and struggle with deleting code, reframing requests, and building proper abstractions.
  • Ambrosino uses Codex to automate his work, setting up tasks that scan Slack channels, summarize updates, and manage product releases, coaching the app along the way.
  • He shares that OpenAI's videographer used Codex to edit videos, prompting the app to build its own extension for Premiere Pro to complete tasks it couldn't handle directly.
  • Ambrosino explains Codex's vision as a home base that orchestrates work across other apps via connectors or computer use, not a super app that replaces all specialty tools.

The Capability Overhang Playbook · Jun 28

  • Nathaniel Whittemore defines the 'capability overhang' as the gap between the latent power of existing models and the real value most individuals and organizations extract from them.
  • Whittemore asserts a forced AI pause is underway due to stalled frontier model releases: GPT-5.6, Claude Sonnet 5, and Gemini 3.5 Pro have been delayed, while Fable 5 remains blocked.
  • Leo from SynthWave reported GPT-5.6's new target release is mid-July and DeepMind delayed Gemini 3.5 Pro due to dissatisfaction with its current state.
  • AI Battle data shows the current wait for GPT-5.6 is 61 days, exceeding previous update gaps of 29, 56, and 49 days within the GPT-5 era.
  • Policy advisor Dean Ball argues the entire US AI industry is frozen from new public releases until the government resolves the Fable situation.
  • Whittemore's Capability Overhang Playbook first advises individuals to create a personal learning agenda by honestly assessing their weaknesses in AI tools and workflows.
  • He recommends building a personal benchmark or eval portfolio: reusable task sets with prompts and success criteria to quickly gauge new model performance.
  • A WorkAI Institute Glean study found knowledge workers spend about 2.4 hours weekly organizing context for AI agents, a drain on productivity.
  • To reduce context overhead, Whittemore suggests building portable context assets, either broad-based personal portfolios or per-project context packs.
  • He cites two resources for this: his own project ContextPortfolio.ai and Jim Sanguine's 'The Librarian,' an agentic OS curator.
  • Whittemore advises users to experiment deeply with current AI harnesses by building the same project in both Claude Code/Cowork and Codex to compare interfaces and tool interactions.
  • He recommends exploring specific plugins within tools like Claude Code to discover new capabilities relevant to your role, as experimentation often falls off daily to-do lists.
  • For holdouts, Whittemore urges building a full end-to-end agent architecture, using resources like the free AgentOS program and employing a 'two window' method with a build window and a tutor chat.
  • Whittemore argues individuals should explore model independence using routers like OpenRouter and open models from Hugging Face, and question their own priorities around cost, privacy, and control.
  • For organizations, he suggests reviewing learning resources and incentive structures for AI adoption, ensuring they reward effective use and sharing of reusable systems.
  • Whittemore warns organizations about an 'overly strong known ROI bias' from token efficiency, which could prioritize efficiency AI over opportunity AI for new products and capabilities.
  • He proposes organizations develop a measurement philosophy linking AI usage to both individual and business outcomes, differentiating between adoption, usage, and outcome metrics.
  • An advanced pattern involves shifting from actively managing AI prompts to architecting loops where AI iterates towards a set goal, utilizing the '/goal' feature as a new primitive.
  • Whittemore recommends turning context portfolios into MCP servers to increase portability and efficiency, gaining familiarity with a key part of the agentic ecosystem.
  • He advises packaging recurring capabilities as reusable 'skills' to make agent work transportable across projects, referencing a past show with Nufar Gaspar on agent skills.
Also from this episode: (1)

Other (1)

  • Prediction market odds for a GPT-5.6 release this week collapsed from nearly 90% to below 30% on Tuesday, indicating a sharp change in expectations.
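
The personal benchmark portfolio from the playbook above (reusable task sets with prompts and success criteria) can be sketched in a few lines of Python. This is a minimal illustration, not Whittemore's own tooling: the names `EvalTask`, `run_portfolio`, and `pass_rate` are hypothetical, and `echo_model` is a stand-in that would be replaced by a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalTask:
    """One reusable benchmark item: a prompt plus a success criterion."""
    name: str
    prompt: str
    passes: Callable[[str], bool]  # returns True if the model output is acceptable


def run_portfolio(tasks: List[EvalTask], model: Callable[[str], str]) -> Dict[str, bool]:
    """Run every task against a model and record pass/fail per task."""
    return {task.name: task.passes(model(task.prompt)) for task in tasks}


def pass_rate(results: Dict[str, bool]) -> float:
    """Fraction of tasks that passed (0.0 for an empty portfolio)."""
    return sum(results.values()) / len(results) if results else 0.0


# Hypothetical stand-in for a real model call; it just upper-cases the prompt.
def echo_model(prompt: str) -> str:
    return prompt.upper()


# Illustrative tasks only; a real portfolio would hold tasks from your own work.
PORTFOLIO = [
    EvalTask("shout", "hello", lambda out: out == "HELLO"),
    EvalTask("mentions_tests", "write tests", lambda out: "TESTS" in out),
    EvalTask("short_answer", "summarize this long document", lambda out: len(out) < 10),
]

if __name__ == "__main__":
    results = run_portfolio(PORTFOLIO, echo_model)
    for name, ok in results.items():
        print(f"{name}: {'pass' if ok else 'fail'}")
    print(f"pass rate: {pass_rate(results):.2f}")
```

Keeping the success criterion as a plain function means the same portfolio can be rerun unchanged against each new model release, which is the point of the exercise.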

Why AI Users Are Raving About GLM 5.2 · Jun 22

  • The Economist reported that Senator Mark Warner claimed NSA Director General Joshua Rudd told him Mythos broke into almost all classified U.S. systems in hours, not weeks, on June 11, the same day Amazon reported the jailbreak that led to the Fable 5 ban.
  • Reporter Shashank Joshi clarified that the Mythos breach claim should not be taken literally, likely referring to a controlled test with caveats, not a real-world attack. Policy analyst Peter Wildeford suggested more plausible scenarios, such as a red team exercise or Mythos being given prior access.
  • In a Saturday interview, President Trump stated he does not regard Anthropic or Dario Amodei as a current national security threat, does not want to shut the company down, and explicitly ruled out using the Defense Production Act to control AI.
  • Nobel laureate John Jumper left Google DeepMind for Anthropic, following the recent departure of VP Noam Shazeer to OpenAI, amid reports of plummeting morale and frustration over the lab's fall to third or fourth place in the AI race.
  • Leo at Synthwave reported DeepMind staff are demoralized by Z AI's GLM 5.2 overtaking Gemini 3.1 Pro and the lab's four-month gap without a flagship model release, with Gemini 3.5 Pro reportedly slated for June 30 and viewed internally as 'not the step change we need.'
  • Analyst Andrew Curran reported a new, more capable version of Mythos has finished training, speculating it could be called Mythos 5.1 or 6, and noted that banning public models does not slow internal development.
  • Industry observers found evidence of an upcoming Claude Sonnet 5 release on an Anthropic partner provider, while GPT 5.6 appears in Codex and OpenAI's Codex lead hinted at major upcoming front-end capability improvements.
  • GLM 5.2 is being hailed as a 'DeepSeek R1 moment' for open models, with users like Vercel's Guillermo Rauch and Itamar Golan reporting it feels meaningfully close to frontier lab quality for coding and real tasks.
  • Design Arena's benchmark found GLM 5.2 beat Fable 5 at website design due to better starting templates, avoidance of common coding errors, and more intricate outputs, though it lagged in game dev and 3D design and produced 25% more code with double the generation time.
  • Elon Musk debated the timeline for a Chinese model matching Mythos, predicting Q1 2026 for true usefulness, while Z AI's CEO suggested it would be sooner, and Box's Aaron Levie highlighted the strategic importance of open models reaching frontier performance.
Also from this episode: (1)

Models (1)

  • Theo notes GLM 5.2 is not cheap to run, as its high token usage makes it more expensive than Opus 4.8 and GPT-5.5 Medium, while Itamar Golan estimates proper local deployment requires eight H200 GPUs costing around $400k.

The next big breakthrough will be AIs learning on the job · Jun 26

  • AI labs aim for Artificial General Intelligence (AGI) by training models on millions of verifiable tasks across thousands of diverse Reinforcement Learning (RL) environments, creating problem-solving agents for open-ended tasks.
  • Progress in AI for computer use is slower than other domains like coding because it lacks 'grindability,' meaning the ability to run many parallel rollouts against deterministic, replayable simulators.
  • Dario suggested that short-horizon RL training may not necessarily generalize to long-horizon real-world performance, indicating limits to RLVR's ability to create general agents.
  • Dwarkesh states that 30-50% of AI lab compute is allocated to inference, which currently does not improve models, representing a significant waste as valuable learning opportunities occur during deployment.
  • Current online learning models, like the Cursor tab model predicting user-accepted edits, require learning the same objective across hundreds of millions of daily requests to be effective.
  • On-Policy Self-Distillation (OPSD) is a technique that distills session learnings into model weights by encouraging the base model to match a 'veteran teacher model's' predictions, without requiring verifiable rewards.
  • OPSD provides a denser supervision signal than naive RL by training on per-token probability discrepancies, and is superior to supervised fine-tuning by consolidating relevant insights rather than recalling every observed token.
  • A speculative concept called 'dreaming' proposes AIs build and train against self-generated simulations of reality, allowing them to experience orders of magnitude more samples, akin to EfficientZero's internal game simulations.
  • This 'dreaming' could become a fourth axis of AI scaling, alongside pre-training, RL, and inference compute, enabling models to rehearse production skills in simulated 'video game' environments for specific users.
  • By 2027-2028, AIs could learn primarily from broad deployment and user interactions, accumulating experience from diverse tasks and becoming smarter with every engagement, rather than solely from pre-release training.
Also from this episode: (3)

Models (3)

  • Dwarkesh notes AI models are significantly less sample-efficient than humans during training, with some estimates placing them 1 to 1 million times less efficient.
  • Many complex real-world skills, such as building a business or winning court cases, cannot be simulated within data centers because their verification requires interacting with the real world over extended periods.
  • Continual learning, critical for AIs to absorb real-world experience, requires distilling knowledge into model weights rather than expanding unscalable in-context memory, mirroring how human brains learn through compression.
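
The "denser supervision signal" attributed to OPSD above can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the method described in the episode: it assumes the per-token discrepancy is measured as KL(student || teacher), and the function name `per_token_kl` and the toy distributions are hypothetical. The point it shows is that distillation yields one loss value per token position, whereas a sparse reward gives a single scalar for the whole rollout.

```python
import math
from typing import Dict, List

Distribution = Dict[str, float]  # token -> probability at one position


def per_token_kl(student: List[Distribution],
                 teacher: List[Distribution],
                 eps: float = 1e-12) -> List[float]:
    """Return KL(student || teacher) at each token position.

    The KL direction is an assumption made for this sketch. Teacher
    probabilities are floored at eps to avoid division by zero.
    """
    losses = []
    for s_dist, t_dist in zip(student, teacher):
        kl = 0.0
        for token, p_s in s_dist.items():
            if p_s > 0.0:
                p_t = max(t_dist.get(token, 0.0), eps)
                kl += p_s * math.log(p_s / p_t)
        losses.append(kl)
    return losses


if __name__ == "__main__":
    # Toy two-token vocabulary, two positions. The models agree at
    # position 0 and disagree strongly at position 1.
    student = [{"a": 0.5, "b": 0.5}, {"a": 0.9, "b": 0.1}]
    teacher = [{"a": 0.5, "b": 0.5}, {"a": 0.1, "b": 0.9}]
    for position, loss in enumerate(per_token_kl(student, teacher)):
        print(f"position {position}: KL = {loss:.4f}")
```

In this toy case the loss is zero where the two models agree and large where they diverge, so the training signal localizes the disagreement to a specific token instead of spreading one reward across the whole sequence.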

What Happens to Design After AI? · Jun 24

  • Paul Backus developed Impeccable after observing that designers got better results from Claude because of their specialized vocabulary. His open-source agent skill embeds this language, adds a quality layer to remove "slop," and features a visual iteration mode.
  • John Maeda, recalling MIT discussions from the 1990s, expresses excitement about "auto design" finally being highly feasible, fulfilling a decades-long vision.
  • Paul Backus highlights Muriel Cooper's 1980s work at MIT, where she predicted electronic publishing and collaborated with the AI lab on automating design through machine intelligence.
  • John Maeda distinguishes design from engineering, noting design requires restraint - like an architect shaping human experience - whereas engineering often builds everything it can.
  • John Maeda likens Impeccable to Kai's Power Tools, which broadened Photoshop's market through algorithmic effects. He also calls it a "PostScript moment," praising its encoding of visual graphic design primitives.
  • John Maeda is integrating Impeccable into the new GitHub Copilot app, aiming to elevate design craft within the "digital harness age" by setting a high bar for application quality.
  • Paul Backus notes that the tells of "AI slop" have evolved from "purple gradients" to current ones like "beige backgrounds." Impeccable combats this algorithmic homogeneity using "anti-attractors" and randomness to ensure unique design outcomes.
  • John Maeda advocates shifting from UX to AX (Agentic Experience) design, concentrating on non-visual interactions for agents (e.g., CLI, error messages) to force-multiply design capabilities beyond traditional interfaces.
  • John Maeda credits Bill Atkinson's Macintosh QuickDraw API, with its unique region-based routines like Floodfill, as foundational for Photoshop's development, a capability absent in Windows' DirectX at the time.
  • Paul Backus finds Impeccable shapes both design and engineering mindsets, leading to improved communication as users adopt each other's technical and design vocabularies through interaction with the tool.
  • Anisha A. references John Maeda's 2010 Forbes prediction that by 2020, the software industry would rediscover its craft heritage, shifting towards bespoke apps and valuing authorship.
Also from this episode: (8)

Models (4)

  • Paul Backus explains LLMs, trained on humanity's output rather than the input (design rationale), possess millions of taste definitions but only approximate taste, failing to grasp its underlying human viewpoint or origins.
  • Anisha A. highlights a debate on AI's impact, with some predicting design commoditization as AI enables anyone to generate interfaces via prompts.
  • Paul Backus argues AI "raises the floor" by automating mechanical design tasks, allowing humans to focus on higher-level thinking and "raise the ceiling" of creative output.
  • John Maeda explains current vision models lack adequate training data to "see" and think like human designers, especially regarding motion and temporal resolution in design outputs.

History (1)

  • John Maeda posits design became critical around 2014-2015 due to mobile's rise. Frequent mobile usage made poor experiences painful, unlike less-used desktop interfaces, demanding higher design quality.

Society (1)

  • John Maeda asserts taste is cultural and develops from material scarcity. He contrasts evolved design cultures like Denmark and Japan with the U.S. "styrofoam plate," noting royalty's desire for distinctiveness historically drove taste.

Business (1)

  • John Maeda defines conviction in leaders as a rare blend of design, business, and technical sense, enabling bets aimed at "global maximums" rather than local optimization.

Culture (1)

  • Paul Backus recounts how Steve Jobs tested employee conviction by challenging their ideas, often saying "that's a stupid idea," to see if they would defend their vision or simply acquiesce.

The US Government Banned Claude Fable 5... · Jun 24

  • Cursor is launching 'Origin,' a new code review platform built by its acquired company, Graphite, designed to efficiently fork large monorepos. Cursor aims to be a comprehensive enterprise software development suite, including local IDE, cloud agents, and code hosting.
  • Theo and Ben agree that open-weight models like GLM 5.2 are becoming genuinely usable for coding tasks, with GLM 5.2 ranking fourth among top frontier models, comparable to Opus 4.5 in late 2025. However, they still require substantial compute resources.
  • Theo argues Google is significantly falling behind in AI, exemplified by an open-weight Chinese model (GLM 5.2) outperforming them in coding benchmarks and a continuous exodus of top talent. Ben adds that Google's 'thinking' approach and poor RL have hindered their progress.
  • The 2022 Chips and Science Act allocated $52.7 billion to boost US semiconductor manufacturing, driven by fears of China. Later in 2022, the US implemented the first ban on chip exports to China, followed by increased Senate attention to AI in 2023 after ChatGPT's launch.
  • Theo attributes the Fable ban to a significant knowledge gap between AI developers and non-technical officials. He argues Amazon engineers, unfamiliar with modern AI, misinterpreted Mythos's capabilities when compared to outdated tools, triggering exaggerated alarm.
  • Theo describes OpenAI's internal 'Grug speak' optimization, where models use highly compressed, nonsensical language to reduce token count and improve reasoning efficiency, a leaked detail he finds amusing.
  • Theo and Ben discuss how macOS's `syspolicyd` process, designed for security, creates a significant CPU bottleneck for AI agent workloads by extensively monitoring numerous sub-processes. This pushes them to offload tasks to Linux machines.
  • Theo reveals a 'pro tip' for advanced Codex users: a config file setting allows bumping the default parallel agents from three to 20 per thread, potentially burning 'hundreds of dollars of inference an hour.' Ben suggests asking Codex itself to configure this.
Also from this episode: (10)

Startups (1)

  • Theo was an investor in Cursor, which SpaceX AI acquired in a $60 billion all-stock deal. Theo stands to make between $50,000 and $2.5 million from the acquisition.

Big Tech (2)

  • Ben notes SpaceX AI, combining xAI and SpaceX, aims to overcome its lack of data and researchers by acquiring Cursor, despite having vast GPU warehouses. This positions it as a major AI lab.
  • Theo notes that while OpenAI historically led with major leaps (transformer, reasoning), Anthropic now drives 'big swings' in research and development, exemplified by Karpathy joining. Ben agrees OpenAI has shifted to an engineering-first company, focusing on gradual improvements.

Business (1)

  • Ben highlights that SpaceX's IPO strategy involves complex regulations, with only about 6% of its equity currently tradeable, and early investors facing vesting schedules stretching over three years.

Models (5)

  • Cursor's upcoming model, possibly Composer 3, will be pre-trained from scratch, aiming for general intelligence beyond code, with a claimed size of 1.5 trillion parameters. This marks a significant leap from previous Composer models based on fine-tuned open-weight models.
  • Theo notes rumors placed GPT 4.5 at 2-12 trillion parameters and Fable between 5-10 trillion, emphasizing the ongoing uncertainty about large model sizes. Ben argues strong Reinforcement Learning (RL) can enable smaller models like GPT 5.5 to perform comparably to much larger ones.
  • Ben predicts Cursor's new model will be branded as a Grok variant and exclusively available via xAI APIs, potentially with GPT-4-like pricing. Theo expects the model will likely avoid third-party cloud platforms like Bedrock or GCP.
  • In February (implied 2025), Anthropic refused to grant the US government unrestricted use of its models on private compute. Anthropic sought contractual carve-outs against autonomous killing and mass surveillance of US citizens, leading the government to consider them a 'supply chain risk.'
  • While Anthropic clashed with the government over direct model access with restrictions, OpenAI secured deals by offering its models as a service via APIs with built-in safety layers, avoiding direct control over government actions.

AI Infrastructure (1)

  • Anthropic's expenditure of $1.25 billion per month for GPU rental from xAI highlights the immense burn rates of frontier AI labs, driving them to prioritize commercialization over purely experimental research.