Claude Code subagents: how and when to use them

By Pavle Lazic · Founder, Scalably · June 2026

The first time a long Claude Code session of mine fell apart, it wasn't a bug. It was context. I'd let one thread read a hundred files, run the test suite three times, and grep half the repo, and by the time it got to the actual change it had lost the plot. Subagents are the fix: they run the messy side work in a separate window and hand back a summary. Here's how they work, how to define one, and the part most guides skip - when not to bother.

In this guide

What Claude Code subagents are
Why the separate context window is the whole point
How to define a subagent
How Claude decides to use one
Scoping tools and the model
When to reach for a subagent
When not to spawn one
How I actually use them in production

What Claude Code subagents are

Claude Code subagents are specialized assistants the main session can delegate a task to. Each one runs in its own context window with its own system prompt, its own tool access, and independent permissions. The main thread hands off a task, the subagent does the work in isolation, and only its summary comes back. That isolation is the feature, not a side effect.

Claude Code ships with a few built in: Explore (a fast, read-only search agent on Haiku), Plan (the research agent that runs during plan mode), and general-purpose (a full-tool agent for multi-step work). You can also write your own. A subagent is just a Markdown file with some YAML at the top, which makes the whole system far less mysterious than it sounds. The model reads a list of available subagents, matches your request against their descriptions, and delegates when one fits.

Why the separate context window is the whole point

The reason subagents matter is the separate context window. When a subagent runs a test suite or reads forty files, all that output stays in the subagent's window. Your main conversation only sees the result. That keeps the main thread focused on the task instead of drowning in logs you'll never read again.

Context is a budget, and verbose side work spends it fast. Anthropic's own subagents documentation puts the use case plainly: reach for one "when a side task would flood your main conversation with search results, logs, or file contents you won't reference again." A subagent starts with a fresh, isolated window - it doesn't inherit your conversation history, the files already read, or the skills already loaded. It gets a delegation message describing the task and works from there.

This cuts both ways, and it's worth being honest about. Because the subagent starts clean, it has to re-gather context you might already have in the main thread. That's latency, and it's why a subagent is the wrong tool for a quick edit. The win is real only when the work is bulky enough that keeping it out of your main window is worth the cold start.

How to define a subagent

A subagent is a Markdown file with YAML frontmatter, saved in .claude/agents/ for one project or ~/.claude/agents/ for all of them. The frontmatter sets the metadata; the Markdown body below it becomes the subagent's system prompt. Only name and description are required.

Here's a complete, working subagent - a code reviewer with no edit tools. The frontmatter names it, tells Claude when to use it, and limits it to three read tools plus Bash (which it needs to run git diff); the body is the instructions it runs under:

---
name: code-reviewer
description: Reviews code for quality and security. Use after writing or changing code.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior code reviewer. When invoked, run git diff to see
recent changes, focus on the modified files, and report issues by
priority: critical (must fix), warnings (should fix), suggestions.
Flag exposed secrets, missing input validation, and weak error
handling. Include a concrete fix for each issue you raise.

Drop that file in .claude/agents/code-reviewer.md, restart the session, and Claude can now delegate code reviews to it. The filename doesn't have to match the name field - identity comes from name, not the path. If editing YAML by hand isn't your thing, the /agents command opens a guided interface that writes the file for you and loads it without a restart.

The four fields you'll touch most:

| Field | Required | What it does |
| --- | --- | --- |
| `name` | Yes | Lowercase-and-hyphens identifier. How you and Claude refer to the subagent. |
| `description` | Yes | When Claude should delegate to it. This is what automatic delegation reads. |
| `tools` | No | The tools it may use. Inherits all of the main conversation's tools if omitted. |
| `model` | No | `sonnet`, `opus`, `haiku`, `fable`, a full model ID, or `inherit`. Defaults to `inherit`. |
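If you keep several of these files around, a small lint script catches the obvious mistakes before Claude ever loads them. What follows is a minimal sketch, not anything Claude Code ships: the function names are mine, the parser only understands flat `key: value` lines like the example above (not full YAML), the name pattern is my reading of "lowercase and hyphens", and treating an empty body as a problem is my own choice.

```python
import re

# My reading of the "lowercase-and-hyphens" rule; an assumption, not an official regex.
NAME_PATTERN = re.compile(r"^[a-z]+(-[a-z]+)*$")
REQUIRED_FIELDS = ("name", "description")


def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a subagent file into (frontmatter fields, Markdown body).

    Only flat `key: value` lines are understood, which is all the
    code-reviewer example uses. This is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a '---' frontmatter fence")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is missing its closing '---'")
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def validate(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    fields, body = parse_frontmatter(text)
    problems = []
    for required in REQUIRED_FIELDS:
        if not fields.get(required):
            problems.append(f"missing required field: {required}")
    name = fields.get("name", "")
    if name and not NAME_PATTERN.match(name):
        problems.append(f"name should be lowercase letters and hyphens: {name!r}")
    if not body:
        problems.append("empty body: the subagent would have no system prompt")
    return problems


EXAMPLE = """---
name: code-reviewer
description: Reviews code for quality and security. Use after writing or changing code.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior code reviewer.
"""

print(validate(EXAMPLE))  # []
```

A file named with spaces or capitals, or one missing its description, comes back with a message per problem instead of an empty list.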

How Claude decides to use one

Claude reads the description field of every available subagent and delegates when your request matches one. The description is the routing logic. A vague description means Claude won't know when the subagent applies; a sharp one with a clear trigger gets it picked at the right moment.

This is why "use proactively" or "use after writing code" inside a description does real work - it tells Claude the trigger condition, not just the capability. You can also force the choice yourself. Naming the subagent in your prompt ("use the code-reviewer subagent") usually gets Claude to delegate, and an @-mention guarantees a specific one runs for that task rather than leaving it to Claude's judgment.

The description is everything. A subagent with great instructions and a lazy description sits unused, because Claude never realizes the task fits it. Write the description as a trigger ("use when X"), not a label.

Scoping tools and the model

Scope a subagent's tools to the minimum it needs, and pick a model that matches the work. A review agent that never edits files should have Write and Edit removed; a cheap search agent should run on Haiku, not Opus. Tool and model scoping is how you make a subagent both safer and cheaper than doing the same work in the main thread.

If you omit tools, the subagent inherits everything the main conversation has, MCP tools included. You usually don't want that. Listing tools is an allowlist: tools: Read, Grep, Glob, Bash gives the subagent exactly those four and nothing else, so it has no Write or Edit tool and can't call any MCP server. Keep in mind that Bash is still a shell, and a shell can modify files; drop it from the list if you need a strictly read-only agent. There's also disallowedTools if you'd rather inherit everything except a few things - disallowedTools: Write, Edit keeps Bash and MCP but removes the file-editing tools.
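The allowlist and denylist rules are easier to keep straight as set arithmetic. This is an illustrative model of the behavior described above, not Claude Code's actual resolution code. The function name is mine, `mcp__db__query` is a made-up MCP tool, and intersecting the allowlist with what the main conversation has is my assumption.

```python
def effective_tools(inherited, tools=None, disallowed=None):
    """Model which tools a subagent ends up with.

    inherited:  tools available in the main conversation
    tools:      optional allowlist (the `tools` field)
    disallowed: optional denylist (the `disallowedTools` field)
    """
    pool = set(inherited)
    if tools is not None:
        pool &= set(tools)       # allowlist: keep only what is listed
    if disallowed is not None:
        pool -= set(disallowed)  # denylist: drop what is listed
    return pool


# Hypothetical main-conversation toolset, including one made-up MCP tool.
MAIN_TOOLS = {"Read", "Grep", "Glob", "Bash", "Write", "Edit", "mcp__db__query"}

print(sorted(effective_tools(MAIN_TOOLS, tools=["Read", "Grep", "Glob", "Bash"])))
# ['Bash', 'Glob', 'Grep', 'Read']
print(sorted(effective_tools(MAIN_TOOLS, disallowed=["Write", "Edit"])))
# ['Bash', 'Glob', 'Grep', 'Read', 'mcp__db__query']
```

Bash survives in both results. A subagent holding Bash can still change files through shell commands, whatever the Write and Edit settings say.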

The model field is a cost lever the docs call out directly: route tasks "to faster, cheaper models like Haiku." A subagent that greps a codebase doesn't need the model running your architecture decisions. Set model: haiku on the search agent, leave the heavy reasoning agent on inherit, and you pay Haiku rates for the work that's fine on Haiku. The built-in Explore agent does exactly this - it runs on Haiku by default.

When to reach for a subagent

Use a subagent when the work produces verbose output you don't need in your main context, when you want to enforce tool restrictions, or when the task is self-contained enough to return a clean summary. The clearest win is isolating high-volume operations: running tests, fetching docs, or processing logs, where the output is large but only the conclusion matters.

The other strong case is parallel, independent research. If you need to understand the auth module, the database layer, and the API layer, and none of those investigations depends on the others, you can dispatch a subagent per area and let them run at once. Each explores its slice, and Claude synthesizes the findings. I lean on this constantly - it's the difference between a ten-minute sequential crawl and three searches happening together.
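The shape of that fan-out is the familiar one: independent jobs run at once, then a single step combines the short results. Inside Claude Code the model does the dispatching itself, so the sketch below is only an analogy in ordinary Python. `investigate` is a hypothetical stand-in for one subagent run, not a real API.

```python
from concurrent.futures import ThreadPoolExecutor


def investigate(area: str) -> str:
    """Hypothetical stand-in for one subagent run; returns a short summary."""
    return f"{area}: summary of findings"


def research_in_parallel(areas: list[str]) -> dict[str, str]:
    """Run one independent investigation per area and collect the summaries."""
    with ThreadPoolExecutor(max_workers=max(1, len(areas))) as pool:
        # map() runs the calls concurrently and yields results in input order.
        summaries = pool.map(investigate, areas)
        return dict(zip(areas, summaries))


findings = research_in_parallel(["auth module", "database layer", "API layer"])
for area, summary in findings.items():
    print(summary)
```

This only works because no investigation needs another's output. Once one area's answer shapes the next question, the work is sequential again and the parallel dispatch buys nothing.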

A subagent earns its keep when:

The side task would dump logs, search hits, or file contents into your main window that you'll never read again.
Two or more investigations are genuinely independent and can run in parallel.
You want a hard tool boundary - a reviewer that physically cannot edit, a reader that cannot write.
The work routes cleanly to a cheaper model without losing quality.

When not to spawn one

Don't spawn a subagent for a quick, targeted change, for work that needs constant back-and-forth, or for anything where multiple phases share heavy context. Subagents start fresh and have to gather their own context, so for a two-second edit the cold start costs more than the task. This is the honest part most tutorials leave out.

Anthropic's documentation is direct about this: use the main conversation when "the task needs frequent back-and-forth or iterative refinement," when "multiple phases share significant context (planning then implementation then testing)," or when "you're making a quick, targeted change." It also flags latency outright - subagents "start fresh and may need time to gather context."

There's a second trap. Every subagent that finishes returns its result into your main window. Spawn five subagents that each hand back a detailed report and you've refilled the context you were trying to protect. The discipline is to have subagents return summaries, not transcripts. If you find yourself spawning a subagent to do something the main thread could finish in one tool call, you've added a process, a cold start, and a round trip to save nothing.

For a quick question about something already in your conversation, the main thread already has the context a subagent would have to rebuild. Save the delegation for work that's actually bulky or actually parallel.

How I actually use them in production

In practice, the pattern I use most is: brief a fresh subagent fully, because the prompt is its only channel, then let it return a result without bloating my main thread. A subagent doesn't see the conversation that led here. Whatever it needs to know has to be in the task I hand it.

That single fact - the prompt is the only channel - changes how you write the delegation. A subagent isn't a colleague who's been in the room; it's a contractor who walked in cold. When I dispatch one to research how a system handles retries, I don't say "look into the retry thing." I give it the files worth reading, the specific question, and the shape of answer I want back, because everything I leave out is context it has to rediscover or guess.
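One way to hold yourself to that is to treat the brief as a small structure with required parts rather than a sentence typed in a hurry. The sketch below is purely illustrative. The `Brief` class, its field names, and the file paths are all hypothetical; nothing here is a Claude Code API, and the output is just the text of a delegation prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Everything a cold-start subagent needs, gathered in one place."""

    question: str
    files: list[str] = field(default_factory=list)
    answer_format: str = "a short summary"
    background: str = ""

    def render(self) -> str:
        """Build the delegation prompt, skipping sections that are empty."""
        parts = []
        if self.background:
            parts.append(f"Background: {self.background}")
        parts.append(f"Question: {self.question}")
        if self.files:
            listing = "\n".join(f"- {path}" for path in self.files)
            parts.append(f"Start with these files:\n{listing}")
        parts.append(f"Return: {self.answer_format}")
        return "\n\n".join(parts)


# Hypothetical example paths; substitute the files that matter in your repo.
brief = Brief(
    question="How does the HTTP client decide when to retry a failed request?",
    files=["src/http/client.py", "src/http/retry.py"],
    answer_format="five bullets at most, each citing a file and function",
    background="We suspect retries are firing on non-idempotent POSTs.",
)
print(brief.render())
```

The `answer_format` field matters as much as the question. Asking for a bounded answer is what keeps the result from refilling the main window.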

The combination that holds up across real work: scope the tools so the subagent can't do damage outside its job, put the model on whatever tier the task actually needs, and write a description sharp enough that Claude reaches for it on its own. Subagents pair naturally with the rest of the Claude Code toolkit - I use Claude Code hooks to enforce guardrails on what a subagent's tools can do, the same way the docs show a PreToolUse hook blocking write queries on a read-only database agent.
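For a sense of what such a guardrail looks like, here is a minimal sketch of a PreToolUse hook script in Python. It relies on the hook contract as I understand it: the event arrives as JSON on stdin with the Bash command under `tool_input.command`, and exit code 2 blocks the call and passes stderr back to Claude. The keyword list, the message, and the choice to allow the call when the input can't be parsed are my own assumptions, and a regex is a coarse filter rather than a SQL parser.

```python
import json
import re
import sys

# Coarse keyword filter for write statements; an assumption, not a SQL parser.
WRITE_SQL = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE)\b", re.IGNORECASE
)


def decide(stdin_text: str) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event."""
    try:
        event = json.loads(stdin_text)
    except json.JSONDecodeError:
        return 0, ""  # nothing to inspect; this sketch allows the call
    if not isinstance(event, dict):
        return 0, ""
    command = str((event.get("tool_input") or {}).get("command", ""))
    if WRITE_SQL.search(command):
        return 2, "Blocked: this agent may only run read-only (SELECT) queries."
    return 0, ""


def main() -> int:
    """Read the event from stdin, report any block reason on stderr."""
    code, message = decide(sys.stdin.read())
    if message:
        print(message, file=sys.stderr)
    return code


# As a real hook script, finish with: raise SystemExit(main())
```

Keeping the decision in a pure function (`decide`) means the rule can be unit-tested without wiring up a session. Whether unparseable input should be allowed or blocked is a policy call, and a stricter setup might return 2 there instead.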

Subagents are also where Claude Code starts to feel less like an autocomplete and more like something that delegates - which is the same line that separates it from editor-first tools in the Claude Code vs Cursor comparison. The mechanism is simple: a Markdown file, a fresh window, a summary back. The skill is knowing the difference between a task that deserves its own context and one that's just a tool call you should have made yourself.

Pavle Lazic is the founder of Scalably, where he builds and runs multi-tenant Claude agent platforms in production for real businesses. He writes about the Claude Agent SDK, MCP servers, and what it actually takes to put AI agents to work.