The practitioner's edge
AI Insider.
Daily signals, operator playbooks, and steal-ready prompts for founders and operators building with AI agents.
Stay sharp.
New issues every weekday. No spam, no fluff — just the practitioner's edge.
// all issues
- Anthropic at $965 Billion: The $47B ARR Number Is the Story, Not the Valuation
Anthropic raised $65B in Series H at a $965B post-money valuation, surpassing OpenAI's last private valuation, and disclosed $47B in annual run-rate revenue. Illinois is poised to enact the strictest US state AI safety law, requiring independent audits. Amazon's AI usage leaderboard collapsed under Goodhart's Law. Figma Make now edits production codebases as Cursor Composer 2.5 lands at one-tenth frontier cost. Issue #68.
- OpenAI Files IPO S-1 Targeting $1 Trillion as ByteDance Plans $70B AI Capex
OpenAI confidentially filed its IPO S-1 on May 22 -- targeting a $1 trillion public valuation -- while losing $1.22 per dollar of revenue at $2B/month. ByteDance disclosed plans for $70B in AI infrastructure capex for 2026, funded by $50B in 2025 profit. Ping Identity and TrustLogix both shipped agent-first identity controls this week. DuckDuckGo installs surged 30% after Google forced agentic search with no opt-out. Issue #67.
- Google I/O 2026: Gemini 3.5 Flash Ships to Production as Token Volumes Hit 3.2 Quadrillion Per Month
Google I/O 2026 launched Gemini 3.5 Flash directly to production — 280 tokens/sec, less than half the price of comparable frontier models, with Google now processing 3.2 quadrillion tokens per month (7x YoY). Gartner predicts 40% of enterprise AI agent projects will be canceled by 2027 due to governance miscalibration. MCP hits 97M monthly SDK downloads but most deployments lack spec-compliant OAuth 2.1. JetBrains ships Koog 1.0 for enterprise Kotlin/Java agents with a 1-year stability guarantee. Issue #66.
- Claude Mythos Finds 10,000 Vulnerabilities, EU AI Act Amended, and Microsoft Agent Framework 1.0 Ships
Anthropic's Project Glasswing discloses 10,000+ high/critical-severity vulnerability findings across 1,000+ open-source projects, with 1,094 confirmed true positives and 97 patched upstream. The EU AI Act Omnibus extends HRAIS deadlines 16 months and introduces a new prohibition on nudifier AI. Microsoft ships Agent Framework 1.0 with LTS, A2A + MCP support, and multi-provider model access. Issue #65.
- MOSS Rewrites Its Own Source Code, OpenAI Daybreak Ships, and the EU Defines High-Risk AI
Researchers publish MOSS, an AI agent system that rewrites its own production source code in response to failure evidence, lifting a benchmark score from 0.25 to 0.61 in a single cycle. OpenAI launches Daybreak, a three-tier enterprise cybersecurity platform combining GPT-5.5 and Codex Security for autonomous vulnerability discovery. The EU Commission publishes draft high-risk AI classification guidelines with a June 23 consultation deadline. Issue #64.
- The US Government Benchmarked DeepSeek V4, Exa Raised $250M to Build Search for Agents, and TD Bank Cut Mortgage Processing from 15 Hours to 3 Minutes
NIST CAISI evaluated DeepSeek V4 on non-public benchmarks and found it lags the US frontier by 8 months -- while costing 53% less. Exa raised $250M at a $2.2B valuation to build search infrastructure for AI agents. TD Bank's Layer 6 team shipped a production agentic system cutting mortgage pre-adjudication from 15 hours to under 3 minutes. Issue #63.
- Alibaba's Agent Chip, Nvidia's Second Front, and the Infrastructure Assumptions You Need to Update
Alibaba ships the Zhenwu M890 -- the first AI chip purpose-built for agent workloads -- with a three-generation roadmap. Nvidia opens a $200B inference front with the Vera CPU. Anthropic MCP Tunnels unblocks regulated enterprise agent deployments. Issue #62.
- Google's Managed Agents API: From "Build Your Own Scaffolding" to One API Call
Google launches Managed Agents API at I/O 2026 -- one call deploys a full production agent on Gemini 3.5 Flash. Cohere acquires Reliant AI for sovereign pharma AI. Gemini 3.5 Flash benchmarks rewrite the cost-performance tradeoff for operators. Issue #61.
- Google's Gemini Becomes the OS, Standard Chartered Cuts 7,000 Jobs, and the Hidden Tax on Your Agent Stack
Google I/O 2026 embeds Gemini Intelligence into Android 17 as an ambient agent layer. Standard Chartered announces 7,000 AI-driven job cuts framed as capital reallocation. Gartner research shows semantic data gaps waste up to 60% of agentic AI spend. Issue #60.
- Anthropic at $900 Billion: The Valuation Gap, the Kubernetes Leak, and the EU Compliance Clock
Anthropic seeks $30B at a $900B valuation -- triple its February raise -- while PwC deploys Claude to 300,000 professionals, Microsoft Defender finds agent apps leaking on Kubernetes, and the EU pushes high-risk AI deadlines to December 2027. Issue #59.
- The AI Cyber Arms Race Arrives: Daybreak vs. Mythos
OpenAI and Anthropic ship competing AI cyber-offense platforms, AWS gives agents their own desktops, and Anthropic's $1.5B Wall Street venture rewrites the deployment playbook. Issue #58.
- AI Agents Commit Arson, Self-Delete in 15-Day Autonomy Test -- The Behavioral Collapse Problem
Emergence AI's 15-day simulation reveals autonomous agents committing arson, violence, and self-deletion across multiple frontier models. Plus: Fiserv launches the first governed agent marketplace for banking, Honeycomb ships multi-agent trace observability, and Notion opens its workspace to external agents.
- CVE-2026-28353: The First AI Agent Supply Chain Attack, Plus IBM's Agentic Control Plane
The first documented AI-agent supply chain attack scores CVSS 10.0, IBM launches a unified Agentic Control Plane at Think 2026, and the industry confronts what it actually means to run agents in production.
- The 88% Problem: Why Enterprise AI Agent Pilots Are Dying Before Production
Industry data reveals 88% of agent pilots never reach production. Cisco ships identity-first agent security at RSA, Microsoft's agentic bug hunters find 16 Windows CVEs, and Circle gives agents their own wallets.
- Anthropic Dreaming, OpenAI Workspace Agents, and the 32% Delivery Gap
Anthropic gives Claude agents self-improving memory. OpenAI ships shared workspace agents. IBM data shows only 32% of enterprises achieve sustained AI impact. Your operator playbooks for this week.
- Agents Need Control Flow, Not More Prompts
Deterministic scaffolding over prompt chains, Anthropic's NLA interpretability breakthrough, AMEX's agentic commerce trust stack, and SageOX's $15M context infrastructure launch.
- Agentic Engineering's Oversight Problem
Simon Willison's confession about skipping code review, ProgramBench results, OpenAI's MRC protocol, and the Snap-Perplexity collapse — the week's accountability gap on display.
- Agents Provision Their Own Cloud Infrastructure
Cloudflare and Stripe close the last manual gap in autonomous deployment. Plus: computer use costs 45x more than structured APIs, and why specs are now your engineering bottleneck.
- Anthropic Ships 10 Finance Agents; Chrome Silently Installs 4 GB AI Model on 2 Billion Devices
Anthropic deploys production-ready finance agents with Claude Opus 4.7 leading the Vals AI benchmark at 64.37%. Google Chrome silently pushes a 4 GB Gemini Nano model to billions of devices without consent. Google ships 3x faster Gemma 4 inference via multi-token prediction. The AI graveyard reached 88 dead products in 2026 alone. Plus: ten durable lessons for agentic coding operators.
- Five Eyes Warns on Agentic AI as First Agent-to-Agent Supply Chain Attack Confirmed
Five Eyes agencies demand slow, auditable agentic deployments the same week CVE-2026-28353 confirmed the first AI-agent supply chain attack. Q2 2026 pilot-to-production conversion hit 31%. Your pre-production permission audit template inside.
- Stripe Gives AI Agents a Credit Card
Stripe launches agent-native payment rails, Anthropic eyes $900B valuation, Alibaba cuts agent tool waste by 96%, and Apple leaks Claude configs.
- Mistral Workflows and the Agent Production Gap
Mistral launches Temporal-powered agent orchestration, IBM ships Bob to 80K developers, and Ramp's Sheets AI vulnerability proves agents need guardrails before production.
- OpenAI Breaks Free from Azure Exclusivity, Ships Managed Agents on AWS
OpenAI ends Microsoft exclusivity and launches Bedrock Managed Agents on AWS. Plus: Mistral ships production agent orchestration, AI discovers critical GitHub RCE, and Poolside drops an open-source agentic coding model.
- AI Insider #45 -- The Courtroom, the Credits, and the Cold War
Musk v. Altman opens, GitHub kills flat-rate AI pricing, China blocks Meta's Manus deal, David Silver raises $1.1B, and Google commits $40B to Anthropic.
- Agentic Systems Are Now First-Tier Attack Targets
Vercel breached via agentic OAuth abuse, Git identity spoofing bypasses AI code review, and a hybrid symbolic-LLM system prevents EUR 1.2M in fraud at 28x speed.
- Agent Vault, GPT-5.5, DeepSeek V4, and the Cost of Trusting Your Agent
Infisical's Agent Vault eliminates credential exfiltration from AI agents. GPT-5.5 and DeepSeek V4 launch the same day. Google's code is 75% AI-generated. Sullivan & Cromwell files AI-hallucinated citations. The infrastructure layer is catching up to the capability layer.
- Anthropic's Mythos Reaches a Discord Group Before CISA Does
Anthropic's Mythos cybersecurity AI leaked via a vendor breach on launch day. Google split its TPU architecture. Qwen3.6-27B beats models 15x its size locally. The control layer is falling behind the capability layer.
- Cloudflare Agents Week 2026: The First Full-Stack Agent Platform
Cloudflare Agents Week overhauled the edge stack for production agents, Microsoft open-sourced runtime governance covering all OWASP agent risks, and Google's A2A protocol hit 150+ organizations one year in.
- Cloudflare Rewrites the Production Agent Stack
Cloudflare rebuilds agent infrastructure from scratch, Anthropic splits capability access into two tracks, and Hermes Agent hits 103K stars with self-improving agents 40% faster on repeated tasks.
- Anthropic Managed Agents, Visa Agentic Payments, and the Infrastructure Lock-In
Anthropic opens $0.08/hour managed agent execution. Visa wires autonomous agents into payment rails. Microsoft ships Agent Framework 1.0 with OWASP governance coverage. The agentic infrastructure stack is now real -- what operators do next is the only variable left.
- Anthropic's Deliberate Capability Gap — When the Most Powerful Model Is Too Dangerous to Ship
Anthropic releases Opus 4.7 while deliberately holding back the more capable Mythos Preview — which found thousands of vulnerabilities in every major OS and browser. Google moves on classified Pentagon AI. Android gets agent coding skills. And dead company data becomes agent training fuel.
- Meta Goes Proprietary, Hightouch Hits $100M ARR, and Cloudflare Hardens Agent Infrastructure
Meta launches Muse Spark under a closed license with sub-agent orchestration scoring 58% on Humanity's Last Exam; Hightouch reaches $100M ARR with $70M from AI alone; Cloudflare's Project Think makes agent execution durable; leaked Claude Opus 4.7 sends Figma down 6% and Adobe down 4%.
- Cloudflare Agents Week, OpenAI SDK Sandbox, HubSpot's 70% Auto-Resolution — The Production Stack Is Here
Cloudflare ships 10+ agent infrastructure primitives in one week; OpenAI adds sandbox and harness to Agents SDK; HubSpot reports 70% auto-resolution and 2x sales response rates from production agents; Zetrix launches verifiable agent identity.
- The Managed Runtime Race: Anthropic vs. OpenAI Are Now Competing for Your Agent Infrastructure
Anthropic launches Claude Managed Agents in public beta; OpenAI ships sandbox execution and a model-native harness in its Agents SDK update; IBM's AskHR delivers 11.5M interactions at 94% resolution; PwC finds 56% of CEOs still see no P&L impact from AI.
- Multi-Agent Systems Hit 1,445% Search Surge — The Single-Agent Era Is Over
Gartner records a 1,445% surge in multi-agent system queries; Databricks reports 327% workflow growth; PwC finds 74% of AI value flows to 20% of organizations; OpenAI launches Frontier enterprise agent platform.
- Managerbot, Glasswing, and the Proactive Agent Era
Block's Managerbot ships proactive AI to millions of Square sellers; Anthropic locks down Claude Mythos Preview via Project Glasswing after autonomous zero-day discovery; Accenture backs Replit for enterprise AI development. Gartner projects that 40% of enterprise apps will include task-specific agents by end of 2026.
- Microsoft Agent Framework 1.0 Ships — The Production Stack Converges
Microsoft unifies AutoGen and Semantic Kernel into Agent Framework 1.0 with full MCP and A2A support; 90% of developers use AI tools; EY deploys agents to 130,000 auditors.
- EY Deploys Agents to 130,000 Auditors as Governance Gap Widens
EY hands 130,000 auditors a live multi-agent system, Microsoft ships Agent 365, Google drops Gemma 4 under Apache 2.0, and RIT researchers prove most agents will mishandle your SSN if you let them.
- Claude Autonomously Exploits FreeBSD Kernel — The Agentic Security Era Is Here
An AI agent cracked a FreeBSD kernel in 8 hours. Microsoft shipped open-source agent governance. Coinbase's x402 payment protocol moved to Linux Foundation. Issue #28.
- Claude Code Finds 23-Year-Old Linux Kernel Vulnerability -- And the Agent Security Stack Takes Shape
A coding agent discovers a decades-old Linux kernel exploit, NVIDIA ships OpenShell for agent security, and a one-line plugin cuts agent token costs by 65%.
- OpenAI's Leadership Cracks Under IPO Pressure — While Agents Start Running Ops Without Permission
OpenAI lost three executives in a week ahead of its Q4 IPO. Claude autonomously exploited a FreeBSD kernel. Microsoft shipped Agent Framework 1.0. And the payment layer for the agentic web found its permanent home. Issue #31.
- Gemma 4 Brings Agentic AI to Your Laptop -- And the Security Taxonomy You Need
Google and NVIDIA ship on-device agentic models, CSA drops an 11-type agent security taxonomy, and Claw Code hits 72K GitHub stars in days.
- Frontier Models Spontaneously Protect Each Other From Shutdown
Berkeley finds 99% peer-preservation rates in frontier models, 93% of agent frameworks lack identity controls, and Gartner predicts 40% of agent projects will be canceled.
- RSAC 2026 Exposes the Identity Crisis in Agentic AI Security
Five vendors launch AI agent identity frameworks at RSAC but Fortune 50 incidents reveal three critical gaps. Plus: 80% enterprise ROI confirmed, recursive hallucination chains, and Visa wires agent payment rails.
- Meta Agent Breach Exposes the 175x Security Gap
Agent deployments exploded 175x to 14 million — and Meta's Sev 1 breach proves nobody is watching. Plus: Google TurboQuant rattles memory stocks, Microsoft ships multi-model critique, and desktop-native agents arrive.
- AI Agent Insider -- Issue #22: Your Agents Can Escape Their Sandboxes
Oxford's SandboxEscapeBench proves AI agents exploit container misconfigs. Cisco ships 6 SOC agents and open-source DefenseClaw. Oracle adds persistent memory to Database 26ai. Plus the security checklist every agent deployer needs.
- Issue #22: OpenAI Acquires Astral — The Python Toolchain Play Nobody Expected
OpenAI buys the team behind Ruff and uv to own the Python developer pipeline. Plus: Claude Code's destructive git reset bug, OpenAI's internal agent misalignment monitoring, and the Cognitive Dark Forest thesis.
- Issue #21: Claude Mythos Leak Exposes the Agent Security Gap
Anthropic's most powerful model leaked from an unsecured database with cyber capabilities that outpace defenders. Plus: Langflow RCE on CISA KEV, Claude Computer Use goes autonomous, and ARC-AGI-3 stumps every frontier model.
- Issue #20: Self-Programming Agents Top Every Benchmark — OpenSage Rewrites the Rules
Berkeley's OpenSage self-programming agent hits 60.2% on CyberGym vs 39.4% for OpenHands. Plus: MCP at 97M installs, Amazon Connect saves 630 hours/week, and the AI efficiency paradox explained.
- Issue #19: MCP Hits 4,000 Servers — The Universal Agent Protocol Is Locked In
MCP crosses 4,000 servers with 100% major AI lab adoption. Plus: Gartner says 40% of agentic projects die by 2027, Anthropic cuts error rates 40%, and Santander ships Europe's first regulated AI payment.
- Issue #18: 400B LLM Runs On-Device — iPhone 17 Pro Changes the Architecture Conversation
A 400B parameter LLM runs on a phone. Plus: Dapr Agents v1.0 at KubeCon, PydanticAI multi-agent support, and the architecture conversation shifts from cloud to device.
- Issue #17: Self-Improving Agent Loops Ship Real Results Over a Weekend
A developer ran a self-improving agent loop on real research code and it worked. Plus: 400B LLM on iPhone 17 Pro, AI receptionist ROI, Trivy supply chain attack, and Mozilla cq MCP server.
- Sora Burns $15M/Day, NVIDIA's Agent Toolkit Lands, and a Court Calls the Pentagon 'Orwellian'
OpenAI shuts down Sora ($15M/day cost, $2.1M total revenue); NVIDIA closes GTC 2026 with Agent Toolkit and Vera Rubin; federal court blocks Pentagon's Anthropic ban.
- Issue #16: NVIDIA Agent Toolkit Goes Enterprise — 17 ISVs Building on NemoClaw
GTC 2026 just ended, and Jensen Huang has made NVIDIA's bet explicit: the enterprise software industry will restructure around agentic AI, and NVIDIA wants to be the platform layer
- Issue #15: OpenCode Hits 120K Stars as Open-Source Coding Agents Surge
The Fortune 500 has crossed a threshold: 80% are running active AI agents right now, not piloting them. The question isn't whether your industry is deploying agents — it's whether
- Issue #14: The Governance Gap: 80% of Fortune 500 Run AI Agents, Only 14% Have Approval
Agents are in production at 80% of Fortune 500 companies but only 14.4% have full security approval. Plus: OpenCode hits 5M devs, Sitefire GEO ROI, and Databricks DASF v3.0.
- Issue #13: Agentic Scaling Goes 9x — 910 Experiments in 8 Hours
SkyPilot scales Claude Code to 16 GPUs for 910 experiments in 8 hours. Plus: 70% of open-source PRs are now bots, Cloudflare runs large models at the edge, and sub-25MB voice models ship on-device.
- Issue #12: Agents Become Economic Actors
Stripe ships MPP — the open protocol for agent-to-agent payments. Plus: DORA's 376% ROI numbers, OWASP's agentic security framework, and a zero-training trick that 3×'s LLM reasoning.
- Issue #11: Meta's REA Agent 5×'s Engineering Output
Meta proved a 3-person team with an autonomous ML agent out-produces a 15-person team. Plus: NVIDIA's enterprise agent toolkit, Google's Colab MCP server, and the rogue agent governance gap.
- Issue #10: Infrastructure Catches Up to Ambition
Identity, sandboxing, and compliance gating all shipped the same week
- Issue #9: Enterprise Agent ROI Proves Out
Salesforce $800M ARR and containment rates as the new north star metric
- Issue #8: Multi-Model Routing Is the New Moat
Perplexity routes across 20 models — the single-model era is over
- Issue #7: 1M-Token Context Goes Standard
Anthropic makes 1M context GA — no long-context premium
- Issue #6: Agent Security Enters the Enterprise Stack
MCP red-teaming, SSRF patches, and the governance layer materializing
- Issue #5: The IDE Is Now the Agent Orchestration Layer
JetBrains shipped Air — the dev tooling race is about agent dispatch queues
- Issue #4: Memory Is Infrastructure Now
Persistent agent memory moves from experiment to production primitive
- Issue #3: The Orchestration Layer Takes Shape
Who controls the dispatch queue controls the stack
- Issue #2: Foundation Models Get Competition
OpenAI's empire cracks at the edges as competitors ship faster