Issue #8 · AI Agent Insider

Issue #8: Multi-Model Routing Is the New Moat

Saturday, March 14, 2026 · 4 min read

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

The Hook

Perplexity just shipped a 20-model orchestration engine to enterprise and 100 companies demanded access in a weekend. Meanwhile, China is paying founders $1.4M each to build one-person businesses run by AI agents. The “should we use agents?” era is over. The “how fast can we deploy them?” era started this week.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

This Week’s Signal

No single AI model commands more than 25% of Perplexity’s enterprise traffic. A year ago, two models handled 90% of queries. Now their orchestration engine routes across ~20 models — Opus for reasoning, Gemini for research, Grok for speed, GPT-5.2 for recall. The implication for operators: stop betting on one model. Build routing logic that picks the best tool for each subtask. The companies doing this are already outperforming single-model deployments.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

3 Operator Playbooks

1. Run a Multi-Model Agent Stack (Like Perplexity Does)

Perplexity Computer for Enterprise orchestrates Claude Opus for deep reasoning, Gemini for research sweeps, and specialized models for image/video — all in isolated Firecracker VMs per session. The Slack integration means employees see each other’s agent queries in shared channels, creating ambient onboarding that no training deck can match. Your move: Set up a model router (even a simple one — LiteLLM, OpenRouter) that assigns tasks by type. Reasoning → Opus. Speed → Flash. Research → Gemini. You’ll cut costs and improve output quality in the same sprint.
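A minimal sketch of the "simple router" idea, assuming you tag each task with a type before dispatch. The route table, the `Task` type, and the `pick_model` helper are hypothetical names for illustration, and the model labels are placeholders rather than real provider model IDs. This is not how Perplexity, LiteLLM, or OpenRouter implement routing.

```python
from dataclasses import dataclass

# Hypothetical route table: task type -> model label.
# Labels are placeholders; replace them with the model IDs
# your gateway (LiteLLM, OpenRouter, or your own) expects.
ROUTES = {
    "reasoning": "opus",
    "speed": "flash",
    "research": "gemini",
}

# Assumption: unknown task types fall back to the fast, cheap tier.
DEFAULT_MODEL = "flash"


@dataclass
class Task:
    kind: str    # e.g. "reasoning", "speed", "research"
    prompt: str  # the text that will be sent to the chosen model


def pick_model(task: Task) -> str:
    """Return the model label for a task, or the default if the type is unknown."""
    return ROUTES.get(task.kind.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    queue = [
        Task("reasoning", "Compare these two vendor contracts."),
        Task("research", "Summarize this week's agent launches."),
        Task("triage", "Label this support ticket."),
    ]
    for task in queue:
        print(f"{task.kind} -> {pick_model(task)}")
```

The lookup is deliberately dumb: a dictionary you can edit in one place as prices and model quality shift. The actual API call goes wherever you consume the returned label.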

2. Turn Google Workspace Into Your Agent’s Operating System

Google shipped Gemini agents across Docs, Sheets, Slides, and Drive that pull data from multiple apps per prompt. The standout: “Fill with Gemini” in Sheets tested 9× faster than manual entry in a 95-person study. Drive now works as an active knowledge base with cross-file queries. Your move: If your team lives in Google Workspace, enable the Gemini Alpha program today. Create a shared Drive “project” folder that Gemini can query across — it turns your scattered docs into a searchable agent knowledge base without any RAG infrastructure.

3. Build a One-Person Company (China Is Literally Paying You To)

Seven Chinese cities launched AI agent subsidy programs this week. Hefei and Shenzhen each offer up to $1.4M for solo founders running “one-person companies” with AI agent employees. The thesis: a single operator with the right agent stack can output what a 10-person team did two years ago. Your move: You don’t need Chinese subsidies to run this play. Map your top 5 time sinks. Automate 3 of them this month with agent workflows (scheduling, research briefs, first-draft content, data entry, customer triage). Track hours saved. That’s your proof of concept for scaling.
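One way to keep the "track hours saved" step honest is a small log you update weekly. The sketch below assumes you record weekly hours per workflow before and after automation. The `Workflow` type and helper are hypothetical, and the numbers are made-up placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    hours_before: float  # weekly hours spent before automation
    hours_after: float   # weekly hours still spent after automation


def weekly_hours_saved(workflows: list[Workflow]) -> float:
    """Sum the weekly hours recovered across all automated workflows."""
    return sum(w.hours_before - w.hours_after for w in workflows)


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    log = [
        Workflow("scheduling", 5.0, 1.0),
        Workflow("research briefs", 3.0, 0.5),
    ]
    print(f"Hours saved per week: {weekly_hours_saved(log)}")
```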

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Steal This

Tool: Hume AI’s TADA (MIT license) — zero-hallucination text-to-speech

Most TTS systems hallucinate — they skip words, add phantom syllables, or drift off-script. TADA maps exactly one audio frame per text token, hitting zero hallucinations across 1,000+ samples while running 5× faster than comparable systems. It’s small enough for smartphones (1B params for English, 3B for 8 languages). If you’re building voice agents, customer support bots, or audio content pipelines, swap your TTS layer for TADA and eliminate the “did the AI say the right thing?” QA step entirely. GitHub →

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Also on Our Radar

Meta plans layoffs of up to 20% (~16,000 jobs) to offset $600B in AI infrastructure spending through 2028. Zuckerberg says projects that once needed large teams can now be handled by individuals. The pattern is clear: companies are simultaneously spending more on AI and employing fewer humans.
Mirendil, a neo-lab founded by ex-Anthropic researchers, is raising $175M at a $1B valuation to apply AI agents to biology and materials science research. The talent exodus from frontier labs into specialized startups continues to accelerate.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Share AI Agent Insider with one operator who’d actually use this. That’s how we grow → insider.dforge.ca

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

AI Agent Insider is a weekly briefing for founders and operators who build with AI agents. No hype. No sponsored picks. Just signal.

Stay sharp.