The Frontier

Your signal. Your price.

Nerd Snipe with Theo and Ben
  • 3d ago

    Anthropic's Mythos model is significantly larger than previous models, with over 10 trillion parameters, making it exceptionally skilled in coding but also slow, expensive, and dangerous due to emergent hacking capabilities.

  • 3d ago

    Anthropic withheld Mythos from public release, citing concerns that it could be misused for hacking; Project Glasswing instead gives critical-infrastructure companies such as Microsoft and Cisco access to it for proactive bug detection.

  • 3d ago

    Ben notes that external tests show OpenAI's GPT-5.4 Pro reproduced almost all of the security vulnerabilities Mythos found, suggesting that similar capabilities may already be widespread and accessible.

  • 3d ago

    Theo criticizes public benchmarks comparing Mythos and GPT-5.4 Pro, arguing that they fail to measure actual hacking or security capabilities and may be misleading.

  • 3d ago

    Theo contends that exceptional coding ability in AI models inherently leads to emergent security capabilities, creating a new hacker archetype that can leverage AI to bridge knowledge gaps and bypass traditional research experience.

  • 3d ago

    Anthropic's security testing for Mythos involved spinning up 100 to 5,000 parallel runs, each seeded with a different project file from a codebase of approximately 1,000 files, with researchers later reviewing detected exploits.

  • 3d ago

    Ben and Theo confirmed that Claude Opus 4.6 models can be tricked into leaking their system prompts and internal reasoning traces, demonstrating a vulnerability where smart models can rationalize revealing sensitive configuration data.

  • 3d ago

    Robert C. Martin ("Uncle Bob"), author of "Clean Code," has shifted his perspective to embrace agentic engineering, suggesting that AI makes programming syntax less important and interface design more important.

  • 3d ago

    Robert C. Martin proposes using AI to conduct programming experiments (e.g., dynamic vs. static typing) without human bias, highlighting an under-explored research area for optimizing AI agent performance with different technologies.

  • 3d ago

    Ben emphasizes that even advanced AI models require constant feedback loops like linting, type checks, and formatting commands to correct hallucinations and converge on correct code, rather than achieving perfection in a single attempt.

  • 3d ago

    Ben converted his complex BTCA CLI tool into a 30-line Claude skill, demonstrating how AI agents can turn simple markdown instructions into fully functional applications, replacing traditional deterministic programs.

  • 3d ago

    Ben praises Garry Tan's GStack approach, which uses collections of markdown-based "skills" in Claude Code to instruct AI agents, allowing for dynamic programming through high-level directions rather than conventional code.

  • 3d ago

    Ben endorses the "Boiling the Ocean" thesis, advocating for extensive AI-driven experimentation because the cost of trying new things is low, and AI models consistently exceed perceived limitations.

  • 3d ago

    Garry Tan's article, "Thin Harness Fat Skills," distinguishes between "deterministic" (traditional, predictable code) and "latent" (dynamic, non-deterministic AI actions) programming, underscoring AI's creative potential in system design.

  • 3d ago

    Theo notes that Garry Tan's GBrain project, which processes daily AI session data to build memory systems, enables models to "learn while they sleep," which Theo considers a key component of Artificial General Intelligence (AGI).
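
The feedback loop Ben describes above (linting, type checks, and formatting commands that push a model toward correct code over several attempts) can be sketched roughly as follows. This is a minimal illustration, not anything shown in the episode: `converge`, `syntax_check`, and `fake_model` are hypothetical names, Python's built-in `compile` stands in for a real linter or type checker, and the "model" is a stub that fixes its bug on the second try.

```python
from typing import Callable, List, Optional, Tuple

# A check returns an error message, or None when the candidate passes.
Check = Callable[[str], Optional[str]]


def syntax_check(source: str) -> Optional[str]:
    """Stand-in for a linter or type checker: report a syntax error, if any."""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}: {exc.msg}"
    return None


def converge(
    generate: Callable[[str, List[str]], str],
    task: str,
    checks: List[Check],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Call generate until every check passes; return (code, rounds used)."""
    feedback: List[str] = []
    for round_number in range(1, max_rounds + 1):
        # The previous round's error messages are handed back to the generator.
        candidate = generate(task, feedback)
        feedback = [msg for msg in (check(candidate) for check in checks) if msg]
        if not feedback:
            return candidate, round_number
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


def fake_model(task: str, feedback: List[str]) -> str:
    """Hypothetical model stub: the first draft has a bug, the revision fixes it."""
    if not feedback:
        return "def add(a, b)\n    return a + b\n"  # missing colon
    return "def add(a, b):\n    return a + b\n"


code, rounds = converge(fake_model, "write add(a, b)", [syntax_check])
print(rounds)
```

In a real setup the checks would presumably shell out to whatever linter, type checker, and formatter the project already uses, and `generate` would call an actual model with the accumulated error messages in its prompt. The round cap reflects the assumption that a loop like this needs a stopping point if the model never converges.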

End of 7-day edition — 15 results