JUNE 6, 2026
JUNE 6, 2026 UPDATED

The Frontier

Your signal. Your price.

Include
Mute
tap any pill to mute
Lookback
||
  • · 7d ago

    Calacanis notes model capabilities are asymptoting, with Opus, GPT-5, and Sonnet scoring within tenths of a percent on evals. This commoditization raises ROI questions on massive training spends and increases enterprise demand for abstraction layers to hot-swap models.

  • · 11d ago

    Milan says proprietary image models like Midjourney and GPT Image 2 offer higher quality than open-source options, but they impose stricter content censorship than local fine-tuned models.

  • · 13d ago

    GPT 5.5 achieved 25% accuracy on the Future Sim benchmark, beating PolyMarket crowd predictions for events like the Super Bowl. Alex Weezner Gross frames this as the worst future state for AI-powered 'psychohistory' predictive models.

  • · 14d ago

    Z.ai's GLM 5.1, a 754-billion-parameter open-source model, reportedly scored 58.4 on SweetBench Pro, ahead of GPT 5.4 and Opus 4.6. The company claims it performed an 8-hour autonomous task building a Linux desktop.

  • · 15d ago

    Gemini 3.5 Flash is Google's new high-throughput model, four times faster than other frontier models in output tokens per second. Alex Weizenner argues it is solidly mid-tier in raw capability compared to GPT 5.5 High.

  • · 16d ago

    Gemini 3.2 Flash reportedly achieves 92% of GPT-5.5 performance on coding tasks with 15-20x cheaper inference.

  • · 20d ago

    The quarter saw rapid frontier model releases: GPT-5.2 Codex, Genie 3, Opus 4.6, GPT-5.3 Codex, Sonnet 4.6, Gemini 3.1 Pro, Nano Banana 2, and GPT-5.4, with no single benchmark winner across common tests.

  • · 21d ago

    OpenAI traced a 'goblin' bug in Cursor to a personality reinforcement learning artifact from GPT-5 models. The quirk highlights how model interdependencies can amplify unusual behaviors.

  • · 21d ago

    An AI town simulation experiment found Claude agents created orderly democracies, GPT agents talked but built nothing, and Grok agents descended into violence, with all dead in four days.

  • · 28d ago

    OpenAI released three new voice models to its API: GPT Realtime 2 for agentic tasks, GPT Realtime Translate for over 70 languages, and GPT Realtime Whisper for streaming transcription.

End of 30-day results — 10 results
10 results