§5.6 · Code

opusplan, subagents, and budget discipline

When to reach for Opus, and how to budget a swarm of subagents.

Claude Code lets you mix Opus (the heaviest, smartest, most expensive model) with Sonnet/Haiku and lets a primary agent spawn subagents for parallel work. Both features are leverage; both can quietly 10× your bill.

opusplan — Opus for the plan, Sonnet for the execution

The opusplan mode uses Opus for planning (where reasoning quality matters most), then switches to Sonnet for execution (where the work is mechanical and Sonnet handles it fine). Cost-wise this is the sweet spot for hard tasks: you pay Opus rates for a short plan and Sonnet rates for the long execution.
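To make the "short plan, long execution" arithmetic concrete, here is a rough sketch that compares three ways of running the same task. The prices and token counts are placeholders chosen only to show the shape of the calculation, not real rates; substitute the current pricing and your own usage numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens. Placeholder values, not real pricing."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical prices: the heavy model costs 5x the lighter one.
OPUS = Price(input_per_mtok=15.0, output_per_mtok=75.0)
SONNET = Price(input_per_mtok=3.0, output_per_mtok=15.0)


def cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one phase at the given per-million-token prices."""
    return (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000


# Assumed workload as (input_tokens, output_tokens): short plan, long execution.
PLAN = (20_000, 2_000)
EXECUTION = (400_000, 40_000)

all_opus = cost(OPUS, *PLAN) + cost(OPUS, *EXECUTION)
all_sonnet = cost(SONNET, *PLAN) + cost(SONNET, *EXECUTION)
opusplan = cost(OPUS, *PLAN) + cost(SONNET, *EXECUTION)

print(f"all Opus:   ${all_opus:.2f}")
print(f"all Sonnet: ${all_sonnet:.2f}")
print(f"opusplan:   ${opusplan:.2f}")
```

Under these placeholder numbers opusplan comes to $2.25, against $1.89 for all-Sonnet and $9.45 for all-Opus: the Opus plan adds a small premium over Sonnet alone, and most of the spend stays at the cheaper rate.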

Use it when:

The task involves non-obvious architectural decisions (e.g. "how should this auth flow handle the X edge case?")
A wrong plan would waste hours of execution
The execution itself is mechanical: implementing a designed API, refactoring a known shape

Don't use it when:

The task is mechanical end-to-end ("rename this variable across the file")
Sonnet's first attempts have been fine
You're working under a tight budget cap: the Opus planning charge alone can eat your daily allowance

The opusplan Recommender scores each task before you commit.
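How the Recommender scores a task is not described here, so the following is not its logic. As a rough illustration, the two checklists above can be folded into a yes/no check. Every field name is hypothetical, and treating "Sonnet was already fine" and "the plan costs more than the remaining budget" as hard vetoes is one possible reading of the lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a task; fields mirror the checklists."""
    has_architectural_decisions: bool
    wrong_plan_wastes_hours: bool
    execution_is_mechanical: bool
    sonnet_attempts_were_fine: bool
    estimated_plan_cost_usd: float
    remaining_budget_usd: float


def should_use_opusplan(task: Task) -> bool:
    # Vetoes from the "don't use it when" list.
    if task.sonnet_attempts_were_fine:
        return False
    if task.estimated_plan_cost_usd > task.remaining_budget_usd:
        return False
    # Mechanical end-to-end: nothing for Opus to plan.
    planning_matters = (
        task.has_architectural_decisions or task.wrong_plan_wastes_hours
    )
    if not planning_matters:
        return False
    # The plan is worth paying for only if Sonnet can then execute it.
    return task.execution_is_mechanical


auth_redesign = Task(
    has_architectural_decisions=True,
    wrong_plan_wastes_hours=True,
    execution_is_mechanical=True,
    sonnet_attempts_were_fine=False,
    estimated_plan_cost_usd=0.50,
    remaining_budget_usd=5.00,
)
rename_variable = Task(
    has_architectural_decisions=False,
    wrong_plan_wastes_hours=False,
    execution_is_mechanical=True,
    sonnet_attempts_were_fine=False,
    estimated_plan_cost_usd=0.50,
    remaining_budget_usd=5.00,
)

print(should_use_opusplan(auth_redesign))
print(should_use_opusplan(rename_variable))
```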

Subagent budgets

A subagent spawned with the Agent tool runs its own conversation, pays for its own context, and reports back. Spawn five in parallel and you are paying for five contexts at once, roughly quintupling your token spend within 30 seconds.

Three rules:

Cap each subagent's max_tokens. Don't let any one of them write 10,000 tokens of output.
Give each one a tightly scoped prompt. "Find files matching X and return the paths" is fine. "Investigate the auth system and tell me what's wrong" is unbounded: expect a 4k-token answer per agent, or more.
Run them only when their results are independent. If agent B needs agent A's output, run them sequentially. Parallelism with dependencies is just waste.
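The third rule amounts to grouping subagents into waves: everything in a wave is independent and can run in parallel, and a wave starts only after the ones it depends on have finished. A minimal sketch using Python's standard graphlib module; the agent names and the dependency map are made up for illustration.

```python
from graphlib import TopologicalSorter


def plan_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group agents into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all of these are independent
        waves.append(ready)
        sorter.done(*ready)  # unlocks agents that were waiting on this wave
    return waves


# Hypothetical agents: each key maps to the agents whose output it needs.
deps = {
    "find_auth_files": set(),
    "find_test_files": set(),
    "summarize_auth": {"find_auth_files"},
}

waves = plan_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

Here the two file searches share the first wave, while summarize_auth waits for the second because it consumes find_auth_files's output.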

The Subagent Budgeter projects spend before you spawn so you can adjust before the cost happens.
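The Budgeter's own calculation is not shown here. A back-of-the-envelope version of the same idea, assuming you can estimate how much context each subagent will read and that its output is capped as in the first rule, might look like this. The prices and the SubagentPlan type are placeholders.

```python
from dataclasses import dataclass

# Hypothetical USD per million tokens; replace with current rates.
INPUT_PER_MTOK = 3.0
OUTPUT_PER_MTOK = 15.0


@dataclass(frozen=True)
class SubagentPlan:
    name: str
    est_input_tokens: int   # context the subagent is expected to read
    max_output_tokens: int  # the output cap from the first rule


def projected_cost(plans: list[SubagentPlan]) -> float:
    """Estimated spend: each agent reads its estimate and writes up to its cap."""
    total_in = sum(p.est_input_tokens for p in plans)
    total_out = sum(p.max_output_tokens for p in plans)
    return (
        total_in * INPUT_PER_MTOK + total_out * OUTPUT_PER_MTOK
    ) / 1_000_000


def within_budget(plans: list[SubagentPlan], budget_usd: float) -> bool:
    return projected_cost(plans) <= budget_usd


# Five parallel subagents, each reading ~50k tokens and capped at 1k output.
plans = [SubagentPlan(f"agent-{i}", 50_000, 1_000) for i in range(5)]

print(f"projected: ${projected_cost(plans):.3f}")
print("ok to spawn" if within_budget(plans, budget_usd=1.00) else "trim the swarm")
```

In this example the context each agent reads accounts for most of the projected spend, not the capped output, which is why a tightly scoped prompt matters as much as the output cap.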

When NOT to spawn at all

If you can answer the question by reading three files yourself, do that. A subagent is appropriate when you'd otherwise read fifteen files in a stalled exploratory loop — not when the task is small enough that the spawn-and-summarize overhead exceeds the value.