
AI & TECH

AI labs weaponize pricing to squeeze third-party agents

Tuesday, May 26, 2026 · from 5 podcasts, 6 episodes
  • Anthropic slashed token subsidies for third-party coding tools, locking users into its own interfaces.
  • Startups respond with ‘punk software’ to swap underlying models and avoid vendor lock-in.
  • OpenClaw's creator spent $1.3 million last month automating development with AI agents.

Anthropic’s recent API pricing shift is a targeted strike on high-volume automation. The company now offers a 40x token subsidy only to users working inside an interface with a Claude logo, such as its official web app or CLI. Any third-party interface, including popular wrappers like T3 Code, is shunted into a full-price “programmatic” bucket.

This move effectively bans the kind of massive automation that OpenClaw creator Pete relies on. In the last 30 days, Pete burned $1.3 million on tokens, making 7.6 million requests across more than 100 agents that manage his GitHub repository: reviewing PRs, deduplicating issues, and verifying fixes. On Nerd Snipe, Theo argued the policy is a strategic land grab disguised as a user benefit, designed to kill applications that threaten Anthropic’s margins.

"The policy target seems to be massive automation plays like OpenClaw that threaten Anthropic's margins."

- Theo, Nerd Snipe

The response from the ecosystem is a move toward what This Week in AI called “punk software.” Kanjun Q of Imbue warned that frontier labs will eventually vertically integrate into every profitable niche. The defense for startups is building orchestration layers that allow users to easily swap out the underlying LLM, preventing a total “rug pull” by their providers.
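
A minimal sketch of what such a swappable orchestration layer could look like. The `Orchestrator` class and the two stand-in backends are hypothetical names invented for illustration, not anything described on the show or taken from a real SDK. The point is only that the application talks to the orchestrator, so changing providers becomes a single call rather than a rewrite.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A backend is any callable that maps a prompt to a completion.
# Real backends would wrap a provider SDK; these are hypothetical stand-ins.
Backend = Callable[[str], str]


@dataclass
class Orchestrator:
    """Routes prompts to whichever backend is currently active."""
    backends: Dict[str, Backend] = field(default_factory=dict)
    active: str = ""

    def register(self, name: str, backend: Backend) -> None:
        self.backends[name] = backend
        if not self.active:
            self.active = name  # first registered backend becomes the default

    def switch(self, name: str) -> None:
        if name not in self.backends:
            raise KeyError(f"unknown backend: {name}")
        self.active = name

    def complete(self, prompt: str) -> str:
        return self.backends[self.active](prompt)


def fake_frontier(prompt: str) -> str:
    return f"[frontier] {prompt}"


def fake_open_weights(prompt: str) -> str:
    return f"[open-weights] {prompt}"


orchestrator = Orchestrator()
orchestrator.register("frontier", fake_frontier)
orchestrator.register("open-weights", fake_open_weights)

first = orchestrator.complete("review this PR")
orchestrator.switch("open-weights")  # provider changes its terms: swap backends
second = orchestrator.complete("review this PR")
```

In practice the hard part is everything this sketch leaves out: prompt formats, tool-calling conventions, and quality differences between models all leak through an interface this thin.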

Jason Calacanis noted the cultural split on the same show: Anthropic positions itself as the transparent, research-first alternative, while using equity as a “golden cage” to lock in talent like Andrej Karpathy and pricing to lock in users. The economic pressure is creating a clear bifurcation in the market.

Open-source models are gaining ground on price and lack of censorship but still trail in raw reasoning. Milan from NanoGPT stated on Ungovernable Misfits that top open-source models like GLM 5.1 lag the frontier by three to six months. For high-stakes coding, users overwhelmingly choose Claude Opus 4.7, which is used five times more than any other model on his platform.

The endgame isn't just cheaper code. It's a fundamental re-architecting of labor. Nufar Gaspar on The AI Daily Brief argued that the leader’s personal AI usage is the single biggest predictor of a team’s adoption. The progression leads to building a digital chief of staff that orchestrates specialized agents for research, strategy, and operations.

As compute becomes a commodity, the battle shifts to who controls the interface - and the data. Anthropic’s pricing gambit is the opening salvo in a war over the orchestration layer, where the spoils are the workflows of every developer.

Source Intelligence

- Deep dive into what was said in the episodes

AI Sub Slave VS NanoGPT with Milan | FREEDOM TECH FRIDAY 41 · May 26

  • Milan explains NanoGPT offers subscription and pay-per-prompt access to hundreds of text, image, video, and audio AI models through a single interface, with strong privacy defaults and crypto payments.
  • Milan says OpenAI's GPT 5.5 and Anthropic's Claude Opus 4.7 are the leading closed-source AI models for general chat and coding. Claude Opus 4.7 is used five times more than any other model on NanoGPT.
  • Milan states Google's Gemini 3.5 Flash is a top closed-source model because it is very fast and cheaper, while Opus 4.7 is expensive and slower.
  • Milan argues open-source text models like GLM 5.1, DeepSeek V4 Pro, and Kimi K2.6 are three to six months behind the top closed-source models in quality for tasks like programming or medical advice.
  • Milan notes open-source competition drives down AI model costs and increases efficiency, while closed-source models have fixed prices due to monopolies from providers like OpenAI or Azure.
  • Milan observes running top AI models locally on consumer hardware like a MacBook or phone is currently unrealistic due to their size, but smaller image models are more feasible for local use.
  • Milan says proprietary image models like Midjourney and GPT Image 2 offer higher quality than open-source options, but they impose stricter content censorship than local fine-tuned models.
  • Milan describes users employing NanoGPT for creative applications like detailed role-playing in constructed worlds and automated Polymarket betting operations where AI agents research and place wagers.
  • Milan states NanoGPT's development is driven 90% by user feedback, leading to additions like video and audio models, APIs, trusted execution environments, and a browser-installed local model for memory.
  • Milan explains NanoGPT's global memory uses a local model to learn user background details across chats, while context memory expands a model's window by feeding it only the most important parts of long conversations.
  • Milan notes users can optimize costs by using expensive models like Claude Opus as orchestrators and cheaper models for simple tasks, or by setting spending limits and choosing between premium, standard, or basic auto-model routing.
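
The cost-control idea in the last point, an expensive model for orchestration, cheaper models for simple tasks, and a hard spending limit, can be sketched roughly as follows. The tier names echo the premium/standard/basic routing Milan mentions, but the prices, task categories, and `BudgetRouter` class are assumptions made up for this example, not NanoGPT's actual rates or API.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens; not NanoGPT's actual rates.
PRICE_PER_MTOK = {"premium": 15.00, "standard": 3.00, "basic": 0.30}


@dataclass
class BudgetRouter:
    spending_limit: float  # hard cap in USD
    spent: float = 0.0

    def pick_tier(self, task_kind: str) -> str:
        # Planning goes to the expensive orchestrator model,
        # coding to a mid-priced one, everything else to the cheapest tier.
        if task_kind == "orchestrate":
            return "premium"
        if task_kind == "code":
            return "standard"
        return "basic"

    def charge(self, task_kind: str, tokens: int) -> str:
        tier = self.pick_tier(task_kind)
        cost = PRICE_PER_MTOK[tier] * tokens / 1_000_000
        if self.spent + cost > self.spending_limit:
            raise RuntimeError("spending limit reached")
        self.spent += cost
        return tier


router = BudgetRouter(spending_limit=1.00)
tiers = [
    router.charge("orchestrate", 20_000),  # plan the work on the premium tier
    router.charge("summarize", 500_000),   # bulk work on the basic tier
    router.charge("code", 100_000),        # coding on the standard tier
]
```

With these made-up prices, half a million tokens of bulk summarizing costs half as much as 20,000 tokens of premium planning, which is the asymmetry the routing is meant to exploit.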
Also from this episode: (7)

AI & Tech (5)

  • Milan founded NanoGPT as a Telegram bot to let people pay small amounts in crypto to try ChatGPT, aiming to make AI accessible without credit cards or personal data after his own privacy experiences at a central bank.
  • Milan says NanoGPT prioritizes privacy by default, allowing use without accounts and storing data locally, but added optional encrypted storage and Google sign-in for users who want syncing.
  • Milan explains NanoGPT supports around twenty cryptocurrencies, with Monero being the most used for ten months, followed by Bitcoin and Nano, to meet users where they are and foster a circular crypto ecosystem.
  • Milan demonstrates NanoGPT's local model feature, which downloads a 500 million parameter model directly in the browser for tasks like conversation memory, aiming for accessibility on basic devices.
  • Milan outlines NanoGPT pricing: a $12 monthly subscription for open-source models with a 30-60 million token weekly allowance, or pay-per-prompt starting at $1, where costs vary widely based on model and task complexity.

Coding (1)

  • Milan says NanoGPT's OpenAI-compatible API allows integration with any coding harness or agent tool like Claude CLI, and its agents page provides endpoints for web search, image generation, and other functions discoverable by AI models.

AI Infrastructure (1)

  • Milan advises starting AI agent setups in a sandboxed environment like a separate computer or VPS and using AI models themselves to configure the workflows, rather than giving them blind access to emails or local files.

The 4 AI Team Members Execs Should Hire Right Now · May 25

  • For AI research, Gaspar recommends using 'wisdom of the crowd' by running the same query across multiple AI models or sessions, aggregating consensus results, and using a separate model to fact-check the aggregated findings, arguing consensus likely indicates factual accuracy.
  • Gaspar states the natural progression after mastering the four digital team members is to build an AI 'chief of staff' that orchestrates across them, providing a cross-functional view of decisions and priorities.
  • Gaspar emphasizes focusing on the methodology and results of AI systems over specific tool features, advising executives to 'sweat what you're building and how you're building it' rather than the tool choice.
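
Gaspar's 'wisdom of the crowd' research method can be sketched as a simple majority vote over claims. This is one possible reading of her description, not her implementation: the models here are hypothetical stand-in functions that return a set of claims, and the separate fact-checking pass she recommends is not implemented.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set

# Hypothetical model interface: takes a query, returns a set of claims.
Model = Callable[[str], Set[str]]


def consensus(query: str, models: Iterable[Model], min_share: float = 0.5) -> List[str]:
    """Keep only claims made by more than `min_share` of the models."""
    answers = [model(query) for model in models]
    counts = Counter(claim for answer in answers for claim in answer)
    threshold = min_share * len(answers)
    return sorted(claim for claim, n in counts.items() if n > threshold)


# Stand-in models with canned answers, for illustration only.
def model_a(query: str) -> Set[str]:
    return {"X raised prices", "Y cut staff"}


def model_b(query: str) -> Set[str]:
    return {"X raised prices", "Z launched a product"}


def model_c(query: str) -> Set[str]:
    return {"X raised prices", "Y cut staff"}


agreed = consensus("What changed this quarter?", [model_a, model_b, model_c])
# agreed == ["X raised prices", "Y cut staff"]
```

Agreement is evidence, not proof: models trained on similar data can share the same mistake, which is presumably why the method ends with an independent fact-check.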
Also from this episode: (10)

AI & Tech (9)

  • Nufar Gaspar identifies three common archetypes among executives lagging in AI adoption: the 'podcast CTO' who knows every release but hasn't built a system, the 'weekend tinkerer' who builds privately but not operationally, and the 'manifesto writer' with a vision who hasn't personalized AI use.
  • Gaspar argues the leader's quality of AI usage is the single biggest predictor of their team's AI adoption, and leaders who are the best users create the most forward-looking AI organizations.
  • Gaspar presents five non-negotiable operating principles for executives using AI: use voice/dictation over typing to capture unstructured thinking, habitually brain dump undocumented context, let AI 'interview' you before complex tasks to surface blind spots, separate planning from execution for critical tasks, and be intentional about where in a workflow your human judgment adds the most value.
  • Gaspar advises building a digital workforce with four AI 'team members': a Research Analyst, a Strategic Thought Partner, a Communication Expert, and an Operational Powerhouse, which provide capabilities beyond human bandwidth.
  • Before acting on AI research, Gaspar suggests running outputs through three questions: is it grounded in real sources or just AI pattern matching, what's missing that I didn't think to ask, and would you feel comfortable putting your name to it.
  • For strategic AI advising, Gaspar recommends building a 'board of advisors' with distinct personas and decision-making styles that debate a decision before presenting it, and calibrating the AI's pushback to match your personal decision-making style.
  • To make an AI communication expert write in your voice, Gaspar advises style profiling by feeding AI your best writing samples for analysis, and creating detailed personas of your target readers to have them review drafts for clarity and impact.
  • When giving AI feedback on writing, Gaspar recommends scoring outputs on specific dimensions like clarity and conciseness instead of giving generic critiques, which allows the model to understand precisely how to improve.
  • For operational AI, Gaspar says leaders should not just automate existing tasks but conceive of dashboards and reports they'd build with unlimited headcount, and they should manually test any new automated brief or process for one to two weeks before committing to full automation.

Enterprise (1)

  • Gaspar's training is based on working with executives across 30 different countries, observing recurring patterns in how leaders engage with AI.

9 Codex Tips From the Codex Team · May 19

  • Cursor's Composer 2.5 coding model matches frontier model performance at a tenth of the cost, scoring 69.3% on Terminal-Bench 2.0 and 79.8% on SWE-bench Multilingual.
  • Cursor prices Composer 2.5 at 50 cents per million input tokens and $2.50 per million output tokens, half the cost of Opus 4.7 or GPT-5.5.
  • Chamath Palihapitiya argues enterprises using OpenAI or Anthropic directly are letting competitors into their data, creating an opening for model-agnostic harness-first companies.
  • Cloudflare's review finds Anthropic's secretive Mythos model can create multi-step exploit chains and generate functional exploit proofs, working like a senior security researcher.
  • A jury dismissed Elon Musk's lawsuit against OpenAI in two hours, ruling his breach of charitable trust claim was barred by a three-year statute of limitations.
  • Codex's tool use - computer, browser, and connectors - transforms it from a chat interface into an evidence-gathering work system that needs full environmental access.
  • Lou uses heartbeats, or scheduled check-ins, to create autonomous feedback loops where Codex monitors tools like Slack and triggers actions without human intervention.
  • The side panel in Codex is where parallel processing happens, allowing users to inspect and annotate artifacts while the agent continues working.
Also from this episode: (5)

AI & Tech (5)

  • Cursor is training a new model from scratch using xAI's Colossus 2 cluster, which has a million H100 GPU equivalents.
  • Jason Lou's first Codex tip advocates using durable, long-running threads for key work streams, relying on OpenAI's improved context compaction to maintain persistent memory.
  • Lou argues voice interaction with Codex unlocks richer context by allowing users to provide messy, uncertain backstory, letting the AI help clarify thoughts.
  • The steer feature in Codex lets users update prompts mid-execution, enabling parallel human-AI work instead of rigid turn-based prompting.
  • Lou built a structured Obsidian file vault for Codex memories, arguing work should leave behind inspectable artifacts, not just trapped chat history.

Grads boo AI, Reese Witherspoon gets dunked + Karpathy joins Anthropic | TWiAI E14 · May 20

  • Andrej Karpathy’s move to Anthropic is more about communication than research, according to Jason Calacanis. He argues Dario Amodei’s grim predictions make him a poor AI spokesperson, while Karpathy’s credibility can alleviate industry pressure.
  • Fundamental builds tabular models for enterprise structured data, a modality poorly handled by LLMs. They have a confidential compute partnership with AWS, allowing models to be deployed and encrypted within a customer’s own VPC.
  • The best use of AI is as a reflective surface to ask better questions, not just a solution generator. Imbue open-sourced ‘Blueprint,’ an agent skill tuned to ask high-quality questions to gather user context.
  • AI industry leaders are forming distinct cultural cults. Jason Calacanis categorizes them: SpaceX for tech libertarian monks, Anthropic for the left-leaning and earnest, and OpenAI for cutthroat capitalists.
  • Kanjun Q warns frontier AI labs will vertically integrate into profitable application layers. The defense for startups is building headless products with orchestration layers that can easily swap underlying models.
  • A New York Times editorial attacked Reese Witherspoon for encouraging AI adoption. Kanjun Q argues this conflates two separate issues: using a helpful tool versus critiquing systemic power concentration.
Also from this episode: (9)

AI & Tech (6)

  • Anthropic’s API pricing penalizes third-party providers. The company offers a 20x token savings plan only to customers using its first-party products, a subtle anti-competitive move aimed at locking users into its ecosystem.
  • Imbue co-founder Kanjun Q bought a 10,000 H100 GPU cluster in 2022 as an investment to fund the company, which now generates substantial rental revenue. She avoided venture capital, taking investment from corporate arms and a non-profit.
  • Linear is shifting from project management to AI execution. The product now includes an agent that can research feedback, write proposals, examine codebases, and delegate tasks, with a native coding agent in development.
  • Karri Saarinen argues AI-generated design is often soulless and can worsen product quality. Founders who delegate design to AI without understanding the problem produce aesthetically pleasing but non-functional outputs.
  • Kanjun Q sees AI enabling bespoke, personalized user interfaces. She built her own agent UI for email and task management, stating that design principles shift when creating for a single user versus a mass audience.
  • Jeremy Frankle defines poor AI etiquette as shifting the burden of reviewing AI-generated slop onto coworkers. He asserts all AI output is the user’s responsibility and must be reviewed before delegation.

AI Infrastructure (1)

  • A cost-effective local AI cluster can be built by daisy-chaining multiple Apple Mac Studios with high RAM. ExoLabs provides software to address multiple units as a single cluster.

Society (1)

  • Graduates are booing AI commencement speeches due to real fear and disempowerment. They perceive a future where entry-level jobs are automated and wealth creation excludes them, reacting against condescending advice.

Education (1)

  • Jeremy Frankle calls graduating students hypocrites for booing AI while using ChatGPT for essays. He argues this is the best time to graduate, as AI is a powerful tool for creative expression and starting companies.

Trump’s Taxpayer-Funded Plan · May 20

Also from this episode: (9)

Politics (9)

  • The Trump administration is establishing a $1.776 billion taxpayer-funded Justice Department account to compensate self-described victims of government 'weaponization' and 'lawfare'.
  • President Trump dropped a $10 billion lawsuit against the IRS after a federal judge questioned its legitimacy due to his control over both the plaintiff and defendant sides of the case.
  • The fund's creation was linked to the leak of Trump's tax returns. IRS contractor Charles Littlejohn pleaded guilty in 2023 and was sentenced to prison in 2024 for leaking the information to the New York Times and ProPublica.
  • The fund's $1.776 billion figure is a symbolic reference to the year of the nation's founding. Its administrators will be five people appointed by Acting Attorney General Todd Blanche.
  • Potential beneficiaries include the nearly 1,600 rioters charged in the January 6th Capitol insurrection, particularly those pardoned by Trump who claim they were improperly investigated for being his supporters.
  • As part of the deal to drop his lawsuit, the IRS will drop any audits of Trump, his family, and his businesses, potentially allowing him to avoid tax bills that could exceed $100 million.
  • Acting Attorney General Todd Blanche claimed the fund is not limited to Republicans or January 6th defendants, but Democrats in Congress accused the administration of creating a 'political slush fund' for rewarding allies.
  • Even some Republicans, including Senate Majority Leader John Thune, expressed skepticism and said the administration needs to answer questions about the fund, indicating Congress may scrutinize its legitimacy.
  • The top lawyer at the Treasury Department resigned hours after the fund's creation, with initial reporting suggesting the move was linked to objections over the arrangement.

How the OpenClaw creator uses $1.3 million of tokens · May 20

  • Anthropic barred all non-interactive Claude Code usage from its subsidized rates, reclassifying tools like T3 Code and the claude -p flag as 'programmatic', which effectively limits subsidized use to its official interfaces.
  • AgentMail provides email inboxes as an API for AI agents, enabling use cases like automated notifications, site sign-ins with 2FA, and customer service, becoming a critical primitive for Theo's agent projects.
Also from this episode: (9)

Coding (4)

  • OpenClaw creator Pete spent $1.3 million on tokens in 30 days, making 7.6 million requests totaling 603 billion tokens, as shown in his Codex Bar utility screenshot.
  • Pete's OpenClaw project leverages unlimited tokens to automate software development, running over 100 Codex instances in the cloud to review PRs, deduplicate issues, run security scans, create patches, and even generate PRs from meeting discussions.
  • Garry Tan claims token maxing with tools like OpenClaw and G-Brain at $10k per month provides a competitive advantage, offering 2028-level AI capabilities years early for those willing to invest heavily now.
  • The Bun runtime was rewritten from Zig to Rust using AI in about a week and a half, merging into main with all tests passing, representing a massive, agent-driven transformation of a critical JavaScript ecosystem project.
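
The totals reported from Pete's screenshot (spend, requests, tokens) imply some per-unit figures worth seeing. The back-of-the-envelope calculation below uses only those three numbers. The results are blended averages across every model and request type, so treat them as rough.

```python
spend_usd = 1_300_000        # reported 30-day token spend
requests = 7_600_000         # reported request count
tokens = 603_000_000_000     # reported total tokens

usd_per_request = spend_usd / requests
tokens_per_request = tokens / requests
usd_per_million_tokens = spend_usd / (tokens / 1_000_000)

print(f"${usd_per_request:.2f} per request")                # $0.17 per request
print(f"{tokens_per_request:,.0f} tokens per request")      # 79,342 tokens per request
print(f"${usd_per_million_tokens:.2f} per million tokens")  # $2.16 per million tokens
```

Roughly 79,000 tokens per request suggests most of the volume is large context being re-read, which is typical of agents working across a codebase.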

AI Infrastructure (3)

  • Mark Cuban proposed a federal token tax of less than 50 cents per million tokens to incentivize efficiency, reduce energy strain, and generate revenue, but Theo argues it would disproportionately impact cheaper models and harm US competitiveness.
  • Theo spent approximately $10k-$12k on networking hardware, including a Synology NAS and drives, to build a secure private network for running AI agents, motivated by the increasing frequency of critical security vulnerabilities.
  • Clerk offers a unified platform for user authentication, organization management, and billing, with components that deeply link subscription and payment management directly into application interfaces.
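
Theo's objection to the token tax, that a flat per-token charge hits cheaper models hardest, is easy to see with a little arithmetic. The sketch below takes 50 cents per million tokens as the upper bound of Cuban's proposal. The three model prices are hypothetical values chosen only to span a realistic range.

```python
TAX_PER_MTOK = 0.50  # upper bound of the proposed tax, USD per million tokens

# Hypothetical list prices (USD per million tokens), for illustration only.
prices = {"budget model": 0.50, "mid-tier model": 5.00, "frontier model": 25.00}

# Relative price increase caused by the flat tax, in percent.
increase_pct = {name: 100 * TAX_PER_MTOK / price for name, price in prices.items()}

for name, pct in increase_pct.items():
    print(f"{name}: +{pct:.0f}%")
# budget model: +100%
# mid-tier model: +10%
# frontier model: +2%
```

Under these assumed prices the tax doubles the cost of the cheapest model while barely registering on the most expensive one.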

Enterprise (2)

  • Hashimoto warns entire companies suffer from AI psychosis, prioritizing rapid bug fixes over system resilience, which risks creating incomprehensible, decaying architectures as seen in infrastructure's shift to cloud automation.
  • A severe two-week wave of security exploits included Copy Fail, 70 patched CVEs in macOS, a Windows BitLocker bypass, the Mini Shai-Hulud supply chain attack, and Google-confirmed AI-powered exploitations of zero-days.