Published on February 6, 2026 by Dominic Böttger · 46 min read
Yesterday, Anthropic announced Claude Opus 4.6 alongside a feature I’ve been waiting for: Agent Teams in Claude Code. Agent teams let you coordinate multiple Claude Code instances working together — one session acts as the team lead, spawning teammates that work independently, each in its own context window, communicating through a shared task list and direct messaging.
Note: As with my previous article on Ralph Loop, this represents my current experiments with AI-assisted development workflows. Agent Teams are experimental and the landscape is evolving fast. I’m building on the work of Addy Osmani, Kieran Klaassen, Paddo’s swarm writeups, and Anthropic’s own documentation.
This article is a deep dive: the complete source code of the /speckit.team-implement command, the algorithm behind stream detection, the agent prompts, the coordination loop, error recovery, practical usage examples, and lessons learned. If you’re using Spec Kit and want to understand exactly how parallel execution works under the hood, this is the complete picture.
Part 1: Understanding Agent Teams
What Claude Code Agent Teams Are
Agent Teams, announced February 5th and documented here, are an experimental feature in Claude Code. Here’s the architecture:
| Component | Role |
|---|---|
| Team lead | Your main Claude Code session — creates the team, spawns teammates, coordinates work |
| Teammates | Separate Claude Code instances, each with their own context window |
| Task list | Shared work items that teammates claim and complete, stored at ~/.claude/tasks/{team-name}/ |
| Mailbox | Messaging system for inter-agent communication |
Unlike subagents (which run within a single session and can only report back to the main agent), teammates are fully independent. They can message each other directly, self-claim tasks from the shared list, and you can interact with individual teammates using Shift+Up/Down in the terminal.
How They Differ from Subagents
This distinction matters for understanding why /speckit.team-implement uses Agent Teams rather than subagents:
| | Subagents | Agent Teams |
|---|---|---|
| Context | Own context window; results return to caller | Own context window; fully independent |
| Communication | Report results back to main agent only | Teammates message each other directly |
| Coordination | Main agent manages all work | Shared task list with self-coordination |
| Best for | Focused tasks where only the result matters | Complex work requiring discussion and collaboration |
| Token cost | Lower: results summarized back to main context | Higher: each teammate is a separate Claude instance |
For feature implementation, we need teammates that can work autonomously on different file sets for extended periods, coordinate through a shared task list, and communicate when they discover issues. That’s the Agent Teams model.
Enabling Agent Teams
Agent Teams are disabled by default. Enable them by adding the experimental flag to your settings:
// ~/.claude/settings.json
{
"env": {
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
}
}
Or set it as an environment variable:
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
Key Capabilities
From the official documentation:
Self-coordination: After finishing a task, a teammate picks up the next unassigned, unblocked task automatically. Task claiming uses file locking to prevent race conditions.
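The docs don't spell out the locking mechanism beyond "file locking," but the idea is easy to sketch. Below is a minimal, hypothetical Python illustration (task files as JSON, lock files created with `O_CREAT | O_EXCL`) of how an atomic claim could stop two teammates from grabbing the same task; none of these names come from Claude Code itself:

```python
import json
import os

def claim_task(task_path: str, agent: str) -> bool:
    """Atomically claim a task by creating a companion lock file.

    O_CREAT | O_EXCL fails if the lock file already exists, so at
    most one agent can win the race to claim a given task.
    """
    lock_path = task_path + ".lock"
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another teammate claimed it first
    with os.fdopen(fd, "w") as f:
        f.write(agent)
    # Record the owner inside the task file itself
    with open(task_path) as f:
        task = json.load(f)
    task["owner"] = agent
    with open(task_path, "w") as f:
        json.dump(task, f)
    return True
```

Whichever agent creates the `.lock` file first wins; the loser simply moves on to the next unblocked task.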
Task dependencies: When a teammate completes a task that other tasks depend on, blocked tasks unblock without manual intervention. This is critical for phase-based execution.
Delegate mode: Restricts the lead to coordination-only tools — spawning, messaging, shutting down teammates, and managing tasks. Prevents the lead from implementing tasks itself. Press Shift+Tab to cycle into delegate mode.
Plan approval: Teammates can work in read-only plan mode until the lead approves their approach. Useful for complex or risky tasks.
Display modes: In-process mode (all teammates in one terminal, Shift+Up/Down to navigate) or split-pane mode (each teammate gets its own tmux/iTerm2 pane).
Direct interaction: You can message any teammate directly at any time. In in-process mode, press Enter to view a teammate’s session, Escape to interrupt their current turn.
Known Limitations
Agent Teams are experimental. The docs list several limitations:
- No session resumption: /resume and /rewind don’t restore in-process teammates
- Task status can lag: teammates sometimes fail to mark tasks as completed, blocking dependent tasks
- Shutdown can be slow: teammates finish their current request before shutting down
- One team per session: clean up the current team before starting a new one
- No nested teams: teammates cannot spawn their own teams
- Split panes require tmux or iTerm2: not supported in VS Code’s integrated terminal or Ghostty
Part 2: Why Sequential Execution Has Limits
The Throughput Ceiling
In my Ralph Loop article, I described how fresh-context-per-task execution solves context exhaustion. Ralph Loop is sequential by design: one task, one AI instance, one commit, repeat. That works remarkably well for maintaining quality.
But it has a fundamental throughput ceiling. Consider a typical full-stack feature with 40 tasks:
- 18 backend tasks (Rust handlers, migrations, tests)
- 10 frontend tasks (React components, state management)
- 4 E2E test tasks
- 6 cross-cutting tasks (docs, security audit, polish)
- 2 setup tasks
In Ralph Loop, these execute strictly sequentially. Task 20 (a React component) waits for task 19 (a Rust migration test) even though they share zero files. The total wall-clock time is the sum of all tasks.
With parallel execution, it could be closer to the maximum of independent streams:
Sequential (Ralph Loop):
┌──────────────────────────────────────────────────────────────────────┐
│ T001 │ T002 │ ... │ T018 │ T019 │ ... │ T028 │ T029 │ ... │ T040 │
│ backend │ frontend │ e2e + polish │
└──────────────────────────────────────────────────────────────────────┘
Total time: sum of all 40 tasks ≈ 4 hours
Parallel (Agent Teams):
backend-dev:   ├── T001 … T018 ──────────────────┤  ←── longest stream
frontend-dev:  ├── T019 … T028 ──────────┤
e2e + polish:  ├── T029 … T040 ────┤
Total time: longest stream ≈ 2 hours
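The arithmetic behind those two timelines fits in a few lines (the per-stream task counts match the 40-task feature above, but the 6-minutes-per-task figure is illustrative, not measured):

```python
# Hypothetical per-stream task counts, with a flat 6 minutes per task.
streams = {"backend-dev": 18, "frontend-dev": 10, "e2e-polish": 12}
minutes_per_task = 6

# Sequential: one agent works through every task in order.
sequential = sum(streams.values()) * minutes_per_task   # 240 min ≈ 4 h

# Parallel: wall-clock time is bounded by the longest stream
# (ignoring coordination overhead and cross-stream dependencies).
parallel = max(streams.values()) * minutes_per_task     # 108 min ≈ 2 h

print(sequential, parallel)
```

In practice, cross-stream dependencies and QA gates push the parallel number up, which is why the savings are "closer to" the longest stream rather than equal to it.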
Why Multiple Ralph Loops Don’t Work
The naive approach — running two Ralph Loop instances on the same repo — breaks immediately:
File conflicts: Two agents editing the same file simultaneously causes overwrites. Git merge conflicts at best, silent data loss at worst.
No coordination: Agent A doesn’t know agent B exists. They might both try to implement the same task, or implement conflicting approaches to the same problem.
No quality gating: Who verifies that the combined output of both agents actually works together? A backend change might break the frontend, and neither agent knows.
No dependency management: Some frontend tasks need the backend API to exist first. Without coordination, the frontend agent starts work that’s doomed to fail.
You need a system that understands which tasks can safely run in parallel, assigns them to the right agents, manages dependencies between streams, and verifies the combined result.
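That file-conflict grouping is the heart of what the command automates in its Step 3. A simplified sketch (the regex and the two-segment grouping mirror the command's description, but the helper and task data are hypothetical, and a plain group-by stands in for the full connected-components pass):

```python
import re
from collections import defaultdict

def detect_streams(tasks: dict[str, str]) -> tuple[list[set[str]], set[str]]:
    """Group unchecked tasks into parallel work streams.

    Simplified conflict rule: two tasks conflict when they share the
    same two-segment path prefix, so grouping by that prefix yields
    the connected components directly.
    """
    groups: defaultdict[str, set[str]] = defaultdict(set)
    repo_wide: set[str] = set()
    for tid, desc in tasks.items():
        # First two path segments of a path-like token, e.g. "backend/src"
        m = re.search(r"\b([\w.-]+/[\w.-]+)/", desc)
        if m:
            groups[m.group(1)].add(tid)
        else:
            repo_wide.add(tid)  # no file path: repo-wide task
    return list(groups.values()), repo_wide
```

Tasks with no recognizable path (a repo-wide security audit, say) fall outside every stream; in the real command they go to the qa-security agent.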
Part 3: The /speckit.team-implement Command
What Gets Built
One new file in the Spec Kit plugin:
~/Development/claude-plugins/.claude/commands/speckit.team-implement.md
This is a Claude Code slash command (like /speckit.implement or /speckit.tasks). When you type /speckit.team-implement in Claude Code, it executes this command file. The command contains the complete instructions for stream detection, team creation, and coordination.
How It Fits the Workflow
The command plugs into the existing Spec Kit workflow. You still use the same planning commands:
1. /speckit.specify → spec.md (what to build)
2. /speckit.plan → plan.md (how to build it)
3. /speckit.tasks → tasks.md (granular task list)
4. /speckit.team-implement → parallel execution with Agent Teams
Steps 1-3 are unchanged. Step 4 is where the new command replaces /speckit.implement (single-agent sequential) or Ralph Loop (fresh-context sequential).
Prerequisites
Before running the command, you need:
- Agent Teams enabled (see above)
- Spec Kit artifacts in your project:
  - specs/{feature}/tasks.md (required — the task list to execute)
  - specs/{feature}/plan.md (required — architecture context for agents)
  - specs/{feature}/spec.md (required — feature requirements)
  - .specify/memory/constitution.md (optional — quality gates and rules)
- A feature branch following Spec Kit naming: 001-feature-name
Running the Command
# Basic usage -- auto-detect everything
/speckit.team-implement
# Preview what would happen without creating a team
/speckit.team-implement --dry-run
# Limit to 2 parallel streams
/speckit.team-implement --streams 2
# Require plan approval before teammates implement
/speckit.team-implement --require-plan-approval
Part 4: The Complete Command Source
Here is the complete source of speckit.team-implement.md — the command file that Claude Code executes when you type /speckit.team-implement, exactly as it lives in the plugin repository:
---
description: Execute tasks.md using Agent Teams with auto-detected parallel work streams.
handoffs:
- label: Analyze First
agent: speckit.analyze
prompt: Run consistency analysis before team execution
send: true
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
Execute the feature's tasks.md using Claude Code Agent Teams. Auto-detects parallel work streams from file paths, spawns specialist teammates for each stream, and includes a mandatory QA/Security gatekeeper. This is the parallel alternative to sequential `/speckit.implement` (or Ralph Loop).
**When to prefer this over sequential execution:**
- 5+ unchecked tasks remaining (coordination overhead is worth it below this threshold)
- Tasks span multiple directories (file-conflict analysis finds parallel streams)
- Project benefits from a dedicated QA/Security reviewer teammate
**When to suggest Ralph Loop / sequential `/speckit.implement` instead:**
- < 5 unchecked tasks remaining
- All tasks are strictly sequential with no `[P]` markers AND touch a single directory
- User explicitly wants fresh-context-per-task isolation
If conditions suggest sequential execution is better, inform the user and recommend `/speckit.implement` or Ralph Loop instead. Do not proceed with team creation.
---
## Step 1: Setup
Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse `FEATURE_DIR` and `AVAILABLE_DOCS` list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
Extract from the parsed JSON:
- `FEATURE_DIR` — absolute path to the feature's spec directory
- Load and read these files:
- **REQUIRED**: `FEATURE_DIR/tasks.md`
- **REQUIRED**: `FEATURE_DIR/plan.md`
- **REQUIRED**: `FEATURE_DIR/spec.md`
- **IF EXISTS**: `.specify/memory/constitution.md`
- **IF EXISTS**: `FEATURE_DIR/data-model.md`
- **IF EXISTS**: `FEATURE_DIR/research.md`
- **IF EXISTS**: `FEATURE_DIR/contracts/`
Derive the feature name from the `FEATURE_DIR` directory name (e.g., `001-deploy-pipeline` → `deploy-pipeline`).
---
## Step 2: Parse Unchecked Tasks
Find all unchecked task lines in `tasks.md`:
- Match lines with pattern: `- [ ]` (markdown unchecked checkbox)
- Skip already-completed `- [x]` or `- [X]` lines
For each unchecked task, extract:
- **Task ID**: The `T###` identifier (e.g., `T017`)
- **Parallel marker**: Whether `[P]` is present
- **Story label**: The `[US#]` label if present (e.g., `[US1]`, `[US2]`)
- **Description**: The full task description text
- **File path**: Extract file paths from the description using regex for patterns like `word/word/file.ext` or `word/word/` (path-like segments with `/`)
- **Phase**: Which phase section the task belongs to (from the `## Phase N:` heading it falls under)
Count total unchecked tasks. If zero, report "All tasks are already completed. Nothing to do." and stop.
If fewer than 5 unchecked tasks, recommend sequential execution:
```
Only N unchecked tasks remaining. Team coordination overhead is not worth it.
Recommendation: Use /speckit.implement or Ralph Loop for sequential execution.
```
Ask the user if they want to proceed anyway or switch to sequential execution. Stop if they choose sequential.
---
## Step 3: Auto-Detect Work Streams (File-Conflict Analysis)
Stream detection uses **file-conflict analysis** (not keyword matching) to determine what can safely run in parallel. Keywords are used only for agent role assignment and intra-stream task ordering.
### Algorithm
**3a. Extract file paths from each unchecked task:**
- Regex for patterns like `word/word/file.ext` or `word/word/` in the task description
- Use the first TWO path segments for grouping (e.g., `backend/src/`, `frontend/e2e/`, `tests/unit/`)
- Tasks with no file path → mark as "repo-wide" (e.g., security audits, documentation reviews)
**3b. Build file-conflict graph:**
- Task A conflicts with Task B if they share any file path OR share the same two-segment path prefix
- Connected tasks (sharing file paths) MUST be in the same stream (same teammate)
**3c. Find independent components (connected components in the conflict graph):**
- Each connected component = one work stream (one teammate)
- Independent components (no shared files) = can run in parallel (separate teammates)
- "Repo-wide" tasks (no file path) → always assigned to qa-security (runs last)
**3d. Assign agent roles using concern keywords:**
Keywords determine the agent's expertise and prompt additions, NOT stream boundaries:
- Tasks containing "security" / "SR-" / "audit" → security-aware prompt additions
- Tasks containing "test" / "E2E" / "spec.ts" / "spec.py" → testing-focused instructions
- Tasks containing "migration" / "schema" → database-aware ordering (run these first within stream)
- Tasks containing "doc" / "README" / "docs/" → documentation focus
Name each stream based on its dominant file path prefix and role:
- `backend/` paths → `backend-dev`
- `frontend/src/` paths → `frontend-dev`
- `frontend/e2e/` or `tests/` paths → `test-writer` (if separate from implementation dirs)
- `docs/` paths → `docs-writer`
- Multiple unrelated prefixes in one component → name by dominant concern
- Single directory project → `implementer`
**3e. Order tasks within each stream by concern priority:**
1. Migrations / schema changes (run first)
2. Models / entities
3. Services / business logic
4. Endpoints / handlers / UI components
5. Tests
6. Documentation
7. Polish / cleanup
**3f. Detect cross-stream dependencies:**
- **Story dependencies**: If tasks from different streams share the same `[US#]` label, the downstream stream's tasks for that story are blocked by the upstream stream's tasks for the same story (e.g., frontend US1 tasks depend on backend US1 tasks)
- **Phase dependencies**: Tasks in Phase N+1 are blocked by Phase N completion
- **E2E test dependencies**: E2E test tasks depend on the API/feature they test, even if in different directories — create a dependency edge
- **Setup/Foundation dependencies**: Phase 1 and Phase 2 tasks block all user story tasks
### Present Results for User Confirmation
Display the stream detection results clearly:
```
Stream Detection Results (file-conflict analysis):
Streams found:
{stream-name} ({N} tasks) — files: {dominant path prefixes}
{stream-name} ({N} tasks) — files: {dominant path prefixes}
qa-security ({N} repo-wide tasks + final verification)
Parallel groups:
{stream-a} + {stream-b} (no file conflicts)
Dependencies:
{stream-b} US1 tasks wait for {stream-a} US1 tasks
{stream-c} E2E tests wait for relevant implementations
qa-security runs final verification after all streams
Total: {N} unchecked tasks + qa verification
Proceed with this team structure? [Y/adjust/cancel]
```
Wait for user confirmation. If they say "adjust", ask what they want to change and rebuild streams. If "cancel", stop execution.
---
## Step 4: Build Dependency Graph
Map phase dependencies, `[P]` markers, and cross-stream relationships into task dependency relationships (for use with `addBlockedBy` in Step 5).
### Phase dependencies:
- Phase 1 (Setup) → no dependencies, starts immediately
- Phase 2 (Foundation) → blocked by ALL Phase 1 tasks completing
- Phase 3+ (User Stories) → blocked by ALL Phase 2 tasks completing
- Final Phase (Polish) → blocked by all user story phase tasks completing
### Within a phase:
- Tasks marked `[P]` → can run in parallel (no `addBlockedBy` within the phase, other than phase-level dependency)
- Tasks NOT marked `[P]` → sequential within their phase (each `addBlockedBy` on the previous non-`[P]` task in that phase within the same stream)
### Cross-stream dependencies:
- If a backend task `T017` and frontend task `T022` are in the same user story `[US1]`, and `T022` is in the downstream stream (determined by path analysis — frontend depends on backend API), then `T022` has `addBlockedBy: [T017]`
- E2E test tasks are blocked by the implementation tasks they test
- Detection heuristic: frontend tasks in a story that reference API-related paths are blocked by the backend handler tasks in the same story
Record these dependency relationships as a map: `{taskId: [blockedByTaskIds]}` for use in Step 5.
---
## Step 5: Hydrate into Claude Tasks
For each unchecked task from Step 2, call `TaskCreate`:
```
TaskCreate(
subject: "{TaskID} [{Story}] {Description (truncated to ~80 chars)}",
description: "From specs/{feature}/tasks.md.
File: {extracted file path}.
Phase: {phase number and name}.
Stream: {assigned stream name}.
Context: Read specs/{feature}/spec.md {story section} and plan.md for architecture.
Quality gates: {auto-detected from project — see Step 6 for detection}.",
activeForm: "{Present continuous form of task action}",
metadata: {
"stream": "{stream-name}",
"phase": "{phase-number}",
"story": "{US# or none}",
"speckit_id": "{TaskID}"
}
)
```
After creating all tasks, set up dependencies with `TaskUpdate(addBlockedBy: [...])` using the dependency map from Step 4.
### Present hydration results:
```
Task Hydration Complete:
Created: {N} tasks
Dependencies: {N} blocking relationships
Streams: {stream-name} ({N} tasks), ...
Create agent team and begin execution? [Y/cancel]
```
Wait for user confirmation before proceeding to team creation.
---
## Step 6: Create Agent Team with QA/Security Gatekeeper
### Auto-detect quality gates
Before creating the team, detect available quality gates from the project:
1. **From constitution** (`.specify/memory/constitution.md`): Extract any quality gate commands mentioned (test commands, lint commands, coverage requirements)
2. **From project config files** (auto-detect from repo root):
- `Cargo.toml` exists → `cargo clippy -- -D warnings` + `cargo test`
- `package.json` exists → read `scripts` for `test`, `lint`, `test:e2e` commands
- `pyproject.toml` or `setup.py` exists → `pytest` or detected test runner
- `go.mod` exists → `go vet ./...` + `go test ./...`
- `.eslintrc*` or `eslint.config.*` exists → `eslint` lint command
- `pnpm-lock.yaml` → use `pnpm` prefix; `yarn.lock` → use `yarn`; otherwise `npm`
3. **Fallback**: If no config files detected, use generic gates: "run any test commands found in project"
Build a quality gates list for the qa-security agent prompt.
### Create the team
```
TeamCreate(team_name: "{feature-name}")
```
### Spawn implementation agents (one per detected stream)
For each stream detected in Step 3, spawn an implementation agent:
```
Task(
name: "{stream-name}",
team_name: "{feature-name}",
subagent_type: "general-purpose",
prompt: "You are the {stream-name} developer for the {feature-name} feature.
FIRST: Read these files to understand the full context:
- {FEATURE_DIR}/spec.md (feature requirements — focus on sections relevant to your tasks)
- {FEATURE_DIR}/plan.md (architecture and technical decisions)
- {FEATURE_DIR}/tasks.md (full task list for reference)
{IF constitution exists: - .specify/memory/constitution.md (quality gates and non-negotiable rules)}
YOUR SCOPE: You ONLY work on files under: {list of path prefixes for this stream}
Do NOT touch files outside your scope. If a task requires files outside your scope,
message the lead to coordinate with the appropriate teammate.
WORKFLOW:
1. Check TaskList for tasks assigned to you
2. Pick the lowest-ID unblocked task assigned to you
3. Read the task description carefully via TaskGet
4. Implement the task following the spec and plan
5. After implementation: run quality gates for your scope:
{stream-specific quality gate commands}
6. If gates pass: mark task completed via TaskUpdate, commit changes:
git add {specific files you changed} && git commit -m 'feat({stream-name}): {task summary}'
7. If gates fail: fix issues and re-run before marking complete
8. Check TaskList for next available task and repeat
9. When all your tasks are done, notify the lead
IMPORTANT RULES:
- Never mark a task complete if quality gates are failing
- Never commit placeholder or stub implementations
- If blocked or confused, message the lead for help
- If you receive a fix request from qa-security, prioritize it over new tasks
- Commit after EACH completed task (not in batches)"
)
```
### Spawn QA/Security gatekeeper (ALWAYS present)
```
Task(
name: "qa-security",
team_name: "{feature-name}",
subagent_type: "general-purpose",
prompt: "You are the QA and Security gatekeeper for the {feature-name} feature.
FIRST: Read these files:
- {FEATURE_DIR}/spec.md (full feature requirements, especially security requirements SR-*, AL-*)
- {FEATURE_DIR}/plan.md (architecture and technical decisions)
{IF constitution exists: - .specify/memory/constitution.md (ALL quality gates and NON-NEGOTIABLE rules)}
YOUR ROLE: You VERIFY work done by other teammates. You do NOT implement features.
You are the final authority — nothing is marked truly complete until you approve.
QUALITY GATES (run ALL of these when asked to verify):
{auto-detected quality gate commands from Step 6 detection, formatted as a numbered list}
SECURITY CHECKS:
1. Input validation on all endpoints / user-facing interfaces
2. No hardcoded secrets, API keys, or credentials
3. Proper error handling (no stack traces leaked to users)
4. OWASP Top 10 review of new code
5. Authentication/authorization enforcement where required
6. Check for SQL injection, XSS, command injection vulnerabilities
{IF spec has security requirements: 7. Verify SR-*/AL-* requirements from spec.md}
TASK INTEGRITY CHECKS:
1. No placeholder or stub implementations (all features fully functional)
2. No undocumented TODO comments in new code
3. All features wired up and reachable (not dead code)
4. Code matches the task description and spec requirements
WORKFLOW:
1. Wait for the lead to notify you that a phase/story is ready for review
2. Run ALL quality gates listed above
3. Run security checks on all new/modified code
4. Run task integrity checks
5. Report results to the lead:
- PASS: All gates passed, provide summary
- FAIL: List EACH failure with specific file paths, line numbers, and error messages
6. For FINAL verification (after all implementation): run comprehensive review
across the entire feature codebase
REPO-WIDE TASKS: You also own any repo-wide tasks (security audits, cross-cutting
concerns) from the task list. Complete these after all implementation streams finish.
IMPORTANT: Be thorough. A false PASS is worse than a false FAIL."
)
```
---
## Step 7: Lead Coordination Loop (Autonomous Until Done)
You (the lead) now enter an autonomous coordination loop. Run this loop until ALL tasks are completed AND qa-security gives final approval.
### The Loop
```
REPEAT until all tasks completed AND qa-security approves:
1. ASSIGN TASKS
- Call TaskList to see current state
- For each unblocked, unassigned task:
- Match the task's stream (from metadata) to the correct teammate
- Assign via TaskUpdate(owner: "{stream-name}")
- Respect dependency ordering: only assign tasks whose addBlockedBy are all completed
2. MONITOR PROGRESS
- Teammates will message you when they complete tasks or need help
- TaskList shows current status of all tasks
- When a task completes, its dependents become unblocked automatically
3. PHASE/STORY GATE CHECK
When ALL tasks for a phase or user story are marked completed by their stream agents:
- Message qa-security: "Phase {N} / Story {US#} complete. Please run full verification."
- Wait for qa-security response
4. HANDLE QA FAILURES
If qa-security reports FAIL:
- Parse the failure details (which tests failed, which files, what errors)
- Create NEW fix tasks via TaskCreate:
- Clear description of what failed and what needs fixing
- Include file paths and error messages from qa-security's report
- Assign to the appropriate stream agent based on file paths
- Message the stream agent about the fix task
- Wait for fixes to complete
- Re-trigger qa-security verification for the same phase/story
5. PHASE TRANSITION
When qa-security reports PASS for a phase/story:
- Mark the phase/story as verified (internal tracking)
- Assign newly unblocked tasks from the next phase to stream agents
- Continue the loop
6. FINAL GATE
When ALL implementation tasks are completed:
- Trigger qa-security for COMPREHENSIVE final verification:
- All test suites (unit, integration, E2E)
- Full security review of all new code
- Task completion integrity (no stubs, no TODOs)
- This is the Feature Completion Gate
- Only proceed to Step 8 after qa-security gives FINAL PASS
END REPEAT
```
### Critical Failure Escalation
If qa-security reports a CRITICAL security issue (e.g., credential leak, injection vulnerability, broken auth):
- **STOP** all stream agents from taking new tasks
- Present the issue to the user with full details
- Ask the user how to proceed before continuing
- Do NOT auto-fix critical security issues without user review
### Handling Stuck Agents
If a teammate appears stuck (no progress on a task for an extended period):
- Message them asking for status
- If they report being blocked, help resolve the blocker or reassign the task
- If truly stuck, create a new approach as a subtask
---
## Step 8: Sync Back to tasks.md
When qa-security gives final PASS:
1. **Read current TaskList status** — get all completed task IDs and their `speckit_id` metadata
2. **Update tasks.md** — for each completed task, change the checkbox in `FEATURE_DIR/tasks.md`:
- Find the line containing the task's `speckit_id` (e.g., `T017`)
- Change `- [ ]` to `- [x]` on that line
- Preserve ALL other content (descriptions, phase headers, notes)
3. **Update plan.md** — add a completion note at the bottom of `FEATURE_DIR/plan.md`:
```
## Implementation Status
**Completed via Agent Teams**: {date}
**Tasks completed**: {N}/{total}
**Quality gates**: All passed
**Security review**: Approved by qa-security
```
4. **Commit all changes**:
```bash
git add {FEATURE_DIR}/tasks.md {FEATURE_DIR}/plan.md
git commit -m "feat({feature-name}): complete implementation via Agent Teams
Tasks completed: {N}/{total}
Streams: {list of stream names}
Quality gates: All passed
Security review: Approved"
```
5. **Shut down teammates gracefully** — send shutdown requests to all teammates via `SendMessage(type: "shutdown_request")`
6. **Clean up team resources** — call `TeamDelete` after all teammates have shut down
7. **Report summary to user**:
```
Team Implementation Complete
Feature: {feature-name}
Tasks completed: {N}/{total}
Streams used: {stream-name} ({N} tasks), ...
Quality Gate Results:
{gate}: PASS
{gate}: PASS
...
Security Review: APPROVED
- No critical vulnerabilities found
- {summary of security checks performed}
Files modified: {count}
Commits created: {count}
Next steps:
- Review changes: git log --oneline {first-commit}..HEAD
- Run full test suite manually if desired
- Create PR when ready: /commit or gh pr create
```
---
## Argument Handling
The command accepts optional arguments after `/speckit.team-implement`:
- **`--require-plan-approval`**: Spawn teammates in plan mode. The lead must approve each teammate's plan before they implement. Slower but gives more control.
- **`--dry-run`**: Run Steps 1-5 (detection, parsing, hydration) but do NOT create the team or execute. Useful for previewing what would happen.
- **`--streams N`**: Override the maximum number of parallel streams (default: auto-detected, capped at 4). Use `--streams 2` for simpler projects.
If `$ARGUMENTS` contains any of these flags, handle them before proceeding. Any other text in `$ARGUMENTS` is treated as additional context for the feature.
---
## Error Recovery
### If a teammate crashes or disconnects:
- Identify incomplete tasks from TaskList (status: in_progress, owner: crashed agent)
- Reset those tasks to pending (remove owner)
- Spawn a replacement teammate with the same scope
- Reassign the tasks
### If quality gates fail repeatedly (3+ times for same issue):
- Stop the failing stream
- Present the persistent failure to the user with full context
- Ask for guidance before continuing
### If git conflicts occur:
- Teammates work on different files, so conflicts should be rare
- If a conflict is detected during commit, message the lead
- Lead coordinates resolution: one teammate commits first, the other rebases
---
## Notes
- This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first.
- File ownership is critical: two teammates editing the same file causes overwrites. The file-conflict analysis in Step 3 prevents this.
- The qa-security agent is ALWAYS spawned regardless of project type. Even single-directory projects benefit from a dedicated reviewer.
- Teammates commit after each task, not in batches. This keeps the git history clean and makes rollbacks easier.
- The lead stays in a coordination role and does NOT implement tasks directly.
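One detail worth sketching is Step 8's write-back, since it is the only step that mutates tasks.md: flip each completed task's checkbox while leaving every other line byte-for-byte intact. A minimal, hypothetical helper:

```python
import re

def mark_completed(tasks_md: str, completed_ids: set[str]) -> str:
    """Flip '- [ ]' to '- [x]' on lines whose T### id is in
    completed_ids, leaving all other content untouched."""
    out = []
    for line in tasks_md.splitlines():
        m = re.search(r"\bT\d{3}\b", line)
        if m and m.group(0) in completed_ids and line.lstrip().startswith("- [ ]"):
            line = line.replace("- [ ]", "- [x]", 1)
        out.append(line)
    return "\n".join(out)
```

Because the edit is keyed on the `speckit_id` stored in each Claude task's metadata, phase headers, notes, and still-open tasks pass through unchanged.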
The Supporting Scripts
The command’s Step 1 calls check-prerequisites.sh, which validates the project structure and returns the feature directory path. This script sources common.sh for shared utility functions. Both scripts live in .specify/scripts/bash/ and are used by all Spec Kit commands.
common.sh — Shared Utility Functions
#!/usr/bin/env bash
# Common functions and variables for all scripts
# Get repository root, with fallback for non-git repositories
get_repo_root() {
if git rev-parse --show-toplevel >/dev/null 2>&1; then
git rev-parse --show-toplevel
else
# Fall back to script location for non-git repos
local script_dir="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
(cd "$script_dir/../../.." && pwd)
fi
}
# Get current branch, with fallback for non-git repositories
get_current_branch() {
# First check if SPECIFY_FEATURE environment variable is set
if [[ -n "${SPECIFY_FEATURE:-}" ]]; then
echo "$SPECIFY_FEATURE"
return
fi
# Then check git if available
if git rev-parse --abbrev-ref HEAD >/dev/null 2>&1; then
git rev-parse --abbrev-ref HEAD
return
fi
# For non-git repos, try to find the latest feature directory
local repo_root=$(get_repo_root)
local specs_dir="$repo_root/specs"
if [[ -d "$specs_dir" ]]; then
local latest_feature=""
local highest=0
for dir in "$specs_dir"/*; do
if [[ -d "$dir" ]]; then
local dirname=$(basename "$dir")
if [[ "$dirname" =~ ^([0-9]{3})- ]]; then
local number=${BASH_REMATCH[1]}
number=$((10#$number))
if [[ "$number" -gt "$highest" ]]; then
highest=$number
latest_feature=$dirname
fi
fi
fi
done
if [[ -n "$latest_feature" ]]; then
echo "$latest_feature"
return
fi
fi
echo "main" # Final fallback
}
# Check if we have git available
has_git() {
git rev-parse --show-toplevel >/dev/null 2>&1
}
check_feature_branch() {
local branch="$1"
local has_git_repo="$2"
# For non-git repos, we can't enforce branch naming but still provide output
if [[ "$has_git_repo" != "true" ]]; then
echo "[specify] Warning: Git repository not detected; skipped branch validation" >&2
return 0
fi
if [[ ! "$branch" =~ ^[0-9]{3}- ]]; then
echo "ERROR: Not on a feature branch. Current branch: $branch" >&2
echo "Feature branches should be named like: 001-feature-name" >&2
return 1
fi
return 0
}
get_feature_dir() { echo "$1/specs/$2"; }
# Find feature directory by numeric prefix instead of exact branch match
# This allows multiple branches to work on the same spec (e.g., 004-fix-bug, 004-add-feature)
find_feature_dir_by_prefix() {
local repo_root="$1"
local branch_name="$2"
local specs_dir="$repo_root/specs"
# Extract numeric prefix from branch (e.g., "004" from "004-whatever")
if [[ ! "$branch_name" =~ ^([0-9]{3})- ]]; then
# If branch doesn't have numeric prefix, fall back to exact match
echo "$specs_dir/$branch_name"
return
fi
local prefix="${BASH_REMATCH[1]}"
# Search for directories in specs/ that start with this prefix
local matches=()
if [[ -d "$specs_dir" ]]; then
for dir in "$specs_dir"/"$prefix"-*; do
if [[ -d "$dir" ]]; then
matches+=("$(basename "$dir")")
fi
done
fi
# Handle results
if [[ ${#matches[@]} -eq 0 ]]; then
# No match found - return the branch name path (will fail later with clear error)
echo "$specs_dir/$branch_name"
elif [[ ${#matches[@]} -eq 1 ]]; then
# Exactly one match - perfect!
echo "$specs_dir/${matches[0]}"
else
# Multiple matches - this shouldn't happen with proper naming convention
echo "ERROR: Multiple spec directories found with prefix '$prefix': ${matches[*]}" >&2
echo "Please ensure only one spec directory exists per numeric prefix." >&2
echo "$specs_dir/$branch_name" # Return something to avoid breaking the script
fi
}
get_feature_paths() {
local repo_root=$(get_repo_root)
local current_branch=$(get_current_branch)
local has_git_repo="false"
if has_git; then
has_git_repo="true"
fi
# Use prefix-based lookup to support multiple branches per spec
local feature_dir=$(find_feature_dir_by_prefix "$repo_root" "$current_branch")
cat <<EOF
REPO_ROOT='$repo_root'
CURRENT_BRANCH='$current_branch'
HAS_GIT='$has_git_repo'
FEATURE_DIR='$feature_dir'
FEATURE_SPEC='$feature_dir/spec.md'
IMPL_PLAN='$feature_dir/plan.md'
TASKS='$feature_dir/tasks.md'
RESEARCH='$feature_dir/research.md'
DATA_MODEL='$feature_dir/data-model.md'
QUICKSTART='$feature_dir/quickstart.md'
CONTRACTS_DIR='$feature_dir/contracts'
EOF
}
check_file() { [[ -f "$1" ]] && echo " ✓ $2" || echo " ✗ $2"; }
check_dir() { [[ -d "$1" && -n $(ls -A "$1" 2>/dev/null) ]] && echo " ✓ $2" || echo " ✗ $2"; }
The key function here is get_current_branch(), which determines the active feature through a three-level fallback: the SPECIFY_FEATURE environment variable (for CI/CD), the current git branch, or the highest-numbered directory in specs/ (for non-git repos). find_feature_dir_by_prefix() then matches the branch’s numeric prefix (e.g., 004) to the corresponding spec directory, so branch 004-fix-deploy finds specs/004-deploy-pipeline/.
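The prefix match at the heart of find_feature_dir_by_prefix() can be tried in isolation. This standalone sketch reuses the same bash =~ regex from the script above (the branch name is illustrative):

```shell
# Pull the three-digit feature number out of a branch name, exactly as
# find_feature_dir_by_prefix() does before globbing specs/NNN-*
branch="004-fix-deploy"
prefix=""
if [[ "$branch" =~ ^([0-9]{3})- ]]; then
  prefix="${BASH_REMATCH[1]}"
fi
echo "branch $branch -> spec prefix $prefix"
```

A branch without the numeric prefix leaves the variable empty, which is why the real function falls back to an exact-name match in that case.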
check-prerequisites.sh — Validation and Discovery
#!/usr/bin/env bash
# Consolidated prerequisite checking script
#
# Usage: ./check-prerequisites.sh [OPTIONS]
#
# OPTIONS:
# --json Output in JSON format
# --require-tasks Require tasks.md to exist (for implementation phase)
# --include-tasks Include tasks.md in AVAILABLE_DOCS list
# --paths-only Only output path variables (no validation)
# --help, -h Show help message
#
# OUTPUTS:
# JSON mode: {"FEATURE_DIR":"...", "AVAILABLE_DOCS":["..."]}
# Text mode: FEATURE_DIR:... \n AVAILABLE_DOCS: \n ✓/✗ file.md
# Paths only: REPO_ROOT: ... \n BRANCH: ... \n FEATURE_DIR: ... etc.
set -e
# Parse command line arguments
JSON_MODE=false
REQUIRE_TASKS=false
INCLUDE_TASKS=false
PATHS_ONLY=false
for arg in "$@"; do
case "$arg" in
--json)
JSON_MODE=true
;;
--require-tasks)
REQUIRE_TASKS=true
;;
--include-tasks)
INCLUDE_TASKS=true
;;
--paths-only)
PATHS_ONLY=true
;;
--help|-h)
cat << 'EOF'
Usage: check-prerequisites.sh [OPTIONS]
Consolidated prerequisite checking for Spec-Driven Development workflow.
OPTIONS:
--json Output in JSON format
--require-tasks Require tasks.md to exist (for implementation phase)
--include-tasks Include tasks.md in AVAILABLE_DOCS list
--paths-only Only output path variables (no prerequisite validation)
--help, -h Show this help message
EXAMPLES:
# Check task prerequisites (plan.md required)
./check-prerequisites.sh --json
# Check implementation prerequisites (plan.md + tasks.md required)
./check-prerequisites.sh --json --require-tasks --include-tasks
# Get feature paths only (no validation)
./check-prerequisites.sh --paths-only
EOF
exit 0
;;
*)
echo "ERROR: Unknown option '$arg'. Use --help for usage information." >&2
exit 1
;;
esac
done
# Source common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
# Get feature paths and validate branch
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
# If paths-only mode, output paths and exit (support JSON + paths-only combined)
if $PATHS_ONLY; then
if $JSON_MODE; then
# Minimal JSON paths payload (no validation performed)
printf '{"REPO_ROOT":"%s","BRANCH":"%s","FEATURE_DIR":"%s","FEATURE_SPEC":"%s","IMPL_PLAN":"%s","TASKS":"%s"}\n' \
"$REPO_ROOT" "$CURRENT_BRANCH" "$FEATURE_DIR" "$FEATURE_SPEC" "$IMPL_PLAN" "$TASKS"
else
echo "REPO_ROOT: $REPO_ROOT"
echo "BRANCH: $CURRENT_BRANCH"
echo "FEATURE_DIR: $FEATURE_DIR"
echo "FEATURE_SPEC: $FEATURE_SPEC"
echo "IMPL_PLAN: $IMPL_PLAN"
echo "TASKS: $TASKS"
fi
exit 0
fi
# Validate required directories and files
if [[ ! -d "$FEATURE_DIR" ]]; then
echo "ERROR: Feature directory not found: $FEATURE_DIR" >&2
echo "Run /speckit.specify first to create the feature structure." >&2
exit 1
fi
if [[ ! -f "$IMPL_PLAN" ]]; then
echo "ERROR: plan.md not found in $FEATURE_DIR" >&2
echo "Run /speckit.plan first to create the implementation plan." >&2
exit 1
fi
# Check for tasks.md if required
if $REQUIRE_TASKS && [[ ! -f "$TASKS" ]]; then
echo "ERROR: tasks.md not found in $FEATURE_DIR" >&2
echo "Run /speckit.tasks first to create the task list." >&2
exit 1
fi
# Build list of available documents
docs=()
# Always check these optional docs
[[ -f "$RESEARCH" ]] && docs+=("research.md")
[[ -f "$DATA_MODEL" ]] && docs+=("data-model.md")
# Check contracts directory (only if it exists and has files)
if [[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]; then
docs+=("contracts/")
fi
[[ -f "$QUICKSTART" ]] && docs+=("quickstart.md")
# Include tasks.md if requested and it exists
if $INCLUDE_TASKS && [[ -f "$TASKS" ]]; then
docs+=("tasks.md")
fi
# Output results
if $JSON_MODE; then
# Build JSON array of documents
if [[ ${#docs[@]} -eq 0 ]]; then
json_docs="[]"
else
json_docs=$(printf '"%s",' "${docs[@]}")
json_docs="[${json_docs%,}]"
fi
printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$FEATURE_DIR" "$json_docs"
else
# Text output
echo "FEATURE_DIR:$FEATURE_DIR"
echo "AVAILABLE_DOCS:"
# Show status of each potential document
check_file "$RESEARCH" "research.md"
check_file "$DATA_MODEL" "data-model.md"
check_dir "$CONTRACTS_DIR" "contracts/"
check_file "$QUICKSTART" "quickstart.md"
if $INCLUDE_TASKS; then
check_file "$TASKS" "tasks.md"
fi
fi
When the command calls check-prerequisites.sh --json --require-tasks --include-tasks, this script:
- Sources common.sh to get the utility functions
- Determines the repo root and current feature branch
- Validates that the feature directory, plan.md, and tasks.md all exist
- Discovers optional documents (research.md, data-model.md, contracts/, etc.)
- Returns a JSON payload like {"FEATURE_DIR":"/path/to/specs/004-deploy-pipeline","AVAILABLE_DOCS":["research.md","contracts/","tasks.md"]}
If any required file is missing, it exits with a helpful error pointing to the Spec Kit command that creates it.
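The JSON-array construction near the end of the script is worth a closer look; here is the same trailing-comma trick replayed standalone (the document names are illustrative):

```shell
# Replay of how check-prerequisites.sh builds the AVAILABLE_DOCS JSON array
docs=("research.md" "contracts/" "tasks.md")
json_docs=$(printf '"%s",' "${docs[@]}")   # "research.md","contracts/","tasks.md",
json_docs="[${json_docs%,}]"               # strip trailing comma, wrap in brackets
echo "$json_docs"
```

printf repeats its format string once per array element, so the only cleanup needed is removing the final comma with the %-suffix expansion.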
Now let’s walk through the key sections of the command to understand what each part does and why it’s designed this way.
Section-by-Section Walkthrough
Frontmatter and User Input
The handoffs section creates a button in Claude Code’s UI that lets you run /speckit.analyze first — a consistency check across spec, plan, and tasks before committing to parallel execution. This is the same pattern used by /speckit.tasks and /speckit.plan. The $ARGUMENTS block captures anything the user types after the command, including flags like --dry-run.
Step 1: Setup
This uses the same prerequisite script as /speckit.implement and /speckit.analyze. The --require-tasks flag ensures tasks.md exists (if not, it tells the user to run /speckit.tasks first). The script returns JSON with absolute paths:
{
"FEATURE_DIR": "/home/user/project/specs/001-deploy-pipeline",
"AVAILABLE_DOCS": ["research.md", "data-model.md", "contracts/", "tasks.md"]
}
Step 2: Parse Unchecked Tasks
This parses Spec Kit’s standard task format. A typical task line:
- [ ] T017 [P] [US1] Implement POST /api/apps/{slug}/deploy handler in backend/src/api/deploy.rs
Becomes:
| Field | Value |
|---|---|
| Task ID | T017 |
| Parallel | Yes ([P] present) |
| Story | US1 |
| Description | Implement POST /api/apps/{slug}/deploy handler |
| File path | backend/src/api/deploy.rs |
| Phase | 3 (from the ## Phase 3: heading above it) |
The threshold check (< 5 tasks → recommend sequential) prevents wasting tokens on team coordination when sequential execution would be faster.
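The command itself does this parsing with the model rather than a script, but the extraction can be sketched as a single bash regex over the task line format shown above (the pattern and field names here are my reconstruction, not Spec Kit code):

```shell
# Hypothetical sketch: split one tasks.md line into the fields from the table above
re='^- \[ \] (T[0-9]+) (\[P\] )?(\[US[0-9]+\] )?(.+) in ([^ ]+)$'
line='- [ ] T017 [P] [US1] Implement POST /api/apps/{slug}/deploy handler in backend/src/api/deploy.rs'
if [[ "$line" =~ $re ]]; then
  task_id="${BASH_REMATCH[1]}"
  parallel="no"; [[ -n "${BASH_REMATCH[2]}" ]] && parallel="yes"
  story="${BASH_REMATCH[3]% }"     # "[US1]", trailing space stripped
  desc="${BASH_REMATCH[4]}"        # description up to the last " in "
  file="${BASH_REMATCH[5]}"        # the file path the task touches
fi
echo "$task_id parallel=$parallel story=$story file=$file"
```

The greedy (.+) deliberately backtracks to the last " in ", so a description that itself contains "in" still yields the correct trailing file path.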
Step 3: The File-Conflict Analysis Algorithm
This is the core algorithm and the most important design decision in the command.
Why File Paths, Not Keywords
An earlier draft used keyword detection: tasks with “backend” in the description go to the backend agent. This fails in multiple ways:
- A task mentioning “document-oriented database” contains “document” but isn’t a docs task
- A task about “backend test helpers” contains both “backend” and “test”
- “Frontend error handling for API timeouts” references both frontend and API concepts
File paths are unambiguous. backend/src/api/deploy.rs is clearly a backend file; frontend/e2e/deploy.spec.ts is clearly an E2E test. The two-segment prefix grouping produces fine-grained buckets (backend/src/ vs backend/tests/), and the conflict analysis then places backend/tests/ in the same connected component as backend/src/, which is correct: a backend test might import from backend source.
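The first step of that grouping is easy to show in isolation. This sketch derives each task file's two-segment prefix; the full algorithm then unions prefixes whose tasks touch shared files, which is not shown here (the paths are illustrative):

```shell
# Derive the two-segment bucket for each task's file path
prefixes=()
for p in backend/src/api/deploy.rs backend/tests/deploy_test.rs frontend/e2e/deploy.spec.ts; do
  IFS=/ read -r a b _ <<< "$p"   # split on "/", keep the first two segments
  prefixes+=("$a/$b/")
done
printf '%s\n' "${prefixes[@]}"
```

Note that frontend/src/ and frontend/e2e/ land in different buckets even though they share a top-level directory; that separation is what lets the e2e-tester run as its own stream.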
Real-World Stream Detection Examples
Example 1: Full-stack Rust + React app (SoftPatchRocket)
tasks.md contains 36 unchecked tasks:
T001-T003: Phase 1 setup (no specific paths)
T004-T009: Phase 2 foundation (backend/src/, backend/migrations/)
T010-T027: Phase 3-5 user stories
- T010-T015: backend/src/api/, backend/src/services/
- T016-T021: frontend/src/components/, frontend/src/hooks/
- T022-T025: frontend/e2e/
T026-T030: backend/tests/
T031-T034: docs/, cross-cutting
File-conflict analysis:
Component A: backend/src/, backend/tests/, backend/migrations/ → 18 tasks
Component B: frontend/src/ → 10 tasks
Component C: frontend/e2e/ → 4 tasks
Repo-wide: T031-T034 → 4 tasks
Cross-stream dependencies:
Component C (E2E) depends on Component A (backend API must exist)
Component B (frontend) depends on Component A (API endpoints) for same-story tasks
Team structure:
backend-dev: Component A (18 tasks) — starts immediately
frontend-dev: Component B (10 tasks) — starts after backend API tasks per story
e2e-tester: Component C (4 tasks) — starts after relevant APIs exist
qa-security: Repo-wide (4 tasks) — runs after all streams
Example 2: Single-directory Python project
tasks.md contains 16 unchecked tasks:
T001-T003: src/api/, src/services/ (connected — imports overlap)
T004-T008: src/models/, src/api/ (connected to above)
T009-T013: tests/ (separate directory!)
T014-T015: docs/
T016: security audit (no path)
File-conflict analysis:
Component A: src/api/, src/services/, src/models/ → 8 tasks (all connected)
Component B: tests/ → 5 tasks
Component C: docs/ → 2 tasks
Repo-wide: T016 → 1 task
Team structure:
implementer: Component A (8 tasks) — starts immediately
test-writer: Component B (5 tasks) — can start in parallel (different dir!)
docs-writer: Component C (2 tasks) — independent
qa-security: verification + T016 — runs after all
Even a single-directory project benefits from parallelism when tests are in a separate directory.
Example 3: Monorepo
File-conflict analysis:
Component A: packages/auth/src/ → 12 tasks
Component B: packages/billing/src/ → 10 tasks
Component C: packages/shared/ → 3 tasks (shared library)
Repo-wide: CI config, docs → 2 tasks
Note: Components A and B both import from Component C,
but they don't edit shared/ files — they only READ from it.
Since the tasks for A and B don't list shared/ paths, they're independent.
Component C tasks run first (Phase 2 foundation), then A and B in parallel.
What the User Sees
After analysis, the command presents results and waits for confirmation:
Stream Detection Results (file-conflict analysis):
Streams found:
backend-dev (18 tasks) — files: backend/src/, backend/tests/, backend/migrations/
frontend-dev (10 tasks) — files: frontend/src/
e2e-tester (4 tasks) — files: frontend/e2e/
qa-security (4 repo-wide tasks + final verification)
Parallel groups:
backend-dev + frontend-dev (no file conflicts)
Dependencies:
frontend-dev US1 tasks wait for backend-dev US1 tasks
e2e-tester waits for relevant implementations
qa-security runs final verification after all streams
Total: 36 unchecked tasks + qa verification
Proceed with this team structure? [Y/adjust/cancel]
If the user says “adjust”, they can modify the stream assignments. If “cancel”, execution stops.
Steps 4-5: Dependency Graph and Task Hydration
Step 4 maps Spec Kit’s existing task structure (phases, [P] markers, [US#] labels) directly into Claude’s addBlockedBy relationships. The key insight: the Agent Teams system handles automatic unblocking — “When a teammate completes a task that other tasks depend on, blocked tasks unblock without manual intervention.”
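As a hedged illustration of that mapping: the rule "Phase 2 blocks all user-story tasks" expands into one blocking edge per downstream task. addBlockedBy is the Agent Teams relationship named above; the task IDs are from this article's walkthrough example, and this sketch only prints the edges the lead would register:

```shell
# Expand one phase-transition rule into explicit blocking edges
foundation="T004"                    # a Phase 2 (foundational) task
edges=()
for t in T008 T011 T013; do          # Phase 3 user-story tasks
  edges+=("$t blockedBy $foundation")
done
printf '%s\n' "${edges[@]}"
```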
Step 5 hydrates each Spec Kit task into a Claude Task via TaskCreate. The metadata field is critical — it’s how the coordination loop knows which stream agent should get which task, and how the sync-back step maps Claude Task IDs to Spec Kit task IDs (speckit_id: "T017").
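The metadata for one hydrated task might look like the following. Only speckit_id is named in the article; stream, story, and phase are my assumptions about what the coordination loop and sync-back would need:

```shell
# Hypothetical metadata payload attached to one Claude Task at creation time
metadata='{"speckit_id":"T017","stream":"backend-dev","story":"US1","phase":3}'
echo "$metadata"
```

The sync-back step later reads speckit_id to know which checkbox in tasks.md to flip.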
Step 6: Agent Prompts — Key Design Decisions
Each stream agent gets a detailed prompt. Key design decisions:
File scope enforcement: “You ONLY work on files under: {list of path prefixes}.” This is the primary mechanism preventing file conflicts. The agent is told its boundary explicitly.
Atomic commits: “Commit after EACH completed task (not in batches).” This gives a clean git history and makes rollbacks easy — same principle as Ralph Loop.
Quality gates per stream: Each agent runs only the gates relevant to its files. The backend agent runs cargo clippy && cargo test, not npm test.
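The gate selection can be pictured as a simple dispatch on stream name. The cargo/npm commands mirror the ones used throughout this article; the gates_for helper itself is an assumption, and it only echoes the gate command rather than running it:

```shell
# Map each stream to the quality gates relevant to its files
gates_for() {
  case "$1" in
    backend-dev)  echo "cargo clippy -- -D warnings && cargo test" ;;
    frontend-dev) echo "npm test" ;;
    e2e-tester)   echo "npm run test:e2e" ;;
    *)            echo "none" ;;
  esac
}
gates_for backend-dev
```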
Escalation path: “If blocked or confused, message the lead for help.” The Agent Teams messaging system handles this natively.
The qa-security agent is always spawned, regardless of project size. When the same AI implements and verifies, it has blind spots; a dedicated reviewer with a fresh context window catches what the implementers miss (Part 7 expands on this).
Step 7: The Lead Coordination Loop
The loop is self-healing. A test failure doesn’t stop everything — it creates a targeted fix task that goes back to the right agent. The system keeps running until the feature is complete and verified.
Critical security issues (credential leaks, injection vulnerabilities, broken auth) escalate to the user rather than being auto-fixed. Some things require human judgment.
Step 8: Sync Back and Error Recovery
The sync-back step closes the loop with Spec Kit by updating tasks.md (changing - [ ] to - [x]), adding completion status to plan.md, committing changes, and gracefully shutting down all teammates.
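The checkbox flip itself is a one-line edit. The sed expression below is my assumption about how the lead rewrites tasks.md, shown with GNU sed's in-place flag against a throwaway file:

```shell
# Flip a completed task's checkbox from "- [ ]" to "- [x]" during sync-back
f=$(mktemp)
printf -- '- [ ] T017 [P] [US1] Implement deploy handler\n' > "$f"
sed -i 's/^- \[ \] T017 /- [x] T017 /' "$f"
cat "$f"
```

Anchoring the pattern on the task ID keeps the edit from touching other unchecked tasks on the same list.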
The error recovery section handles three scenarios: crashed teammates (respawn with same scope), persistent quality gate failures (escalate after 3 attempts), and git conflicts (rare due to file ownership, but coordinated through the lead when they occur).
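The "escalate after 3 attempts" rule reduces to a bounded retry loop. In this sketch, attempt_gate stands in for a persistently failing quality gate; the function name and loop shape are assumptions, not the command's literal implementation:

```shell
# Bounded retry: fix-and-retry up to 3 times, then hand the failure to the user
attempt_gate() { return 1; }         # stand-in: the gate never passes
attempts=0
outcome=""
while ! attempt_gate; do
  attempts=$((attempts + 1))
  if [ "$attempts" -ge 3 ]; then
    outcome="escalate"               # stop auto-fixing; require human judgment
    break
  fi
done
echo "attempts=$attempts outcome=$outcome"
```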
Part 5: A Complete Usage Walkthrough
Let’s walk through a real execution from start to finish.
Starting Point
You have a Spec Kit project with these artifacts:
specs/004-deploy-pipeline/
├── spec.md # Feature: one-click deployment pipeline
├── plan.md # Rust backend + React frontend + PostgreSQL
├── tasks.md # 34 tasks across 6 phases
└── contracts/
└── api.md # REST API contract
Your tasks.md has 34 unchecked tasks:
## Phase 1: Setup (Shared Infrastructure)
- [ ] T001 Create project structure per implementation plan
- [ ] T002 [P] Initialize Rust workspace with Cargo.toml
- [ ] T003 [P] Initialize React app with Vite and TypeScript
## Phase 2: Foundational (Blocking Prerequisites)
- [ ] T004 Create database migrations in backend/migrations/001_initial.sql
- [ ] T005 [P] Implement database connection pool in backend/src/db.rs
- [ ] T006 [P] Configure CORS and middleware in backend/src/middleware.rs
- [ ] T007 Setup React Router and layout in frontend/src/App.tsx
## Phase 3: User Story 1 - Deploy Configuration (P1)
- [ ] T008 [P] [US1] Create App model in backend/src/models/app.rs
- [ ] T009 [P] [US1] Create Deploy model in backend/src/models/deploy.rs
- [ ] T010 [US1] Implement AppService in backend/src/services/app_service.rs
- [ ] T011 [US1] Implement POST /api/apps handler in backend/src/api/apps.rs
- [ ] T012 [US1] Implement POST /api/apps/{slug}/deploy in backend/src/api/deploy.rs
- [ ] T013 [P] [US1] Create AppList component in frontend/src/components/AppList.tsx
- [ ] T014 [P] [US1] Create DeployButton component in frontend/src/components/DeployButton.tsx
- [ ] T015 [US1] Create deploy configuration page in frontend/src/pages/DeployPage.tsx
...
## Phase 6: Polish & Cross-Cutting Concerns
- [ ] T032 [P] Documentation updates in docs/
- [ ] T033 Security hardening review
- [ ] T034 Run full test suite validation
Step 1: Run the Command
> /speckit.team-implement
The command runs check-prerequisites.sh, loads tasks.md, plan.md, spec.md, and finds the constitution.
Step 2: Task Parsing
Parsed 34 unchecked tasks (0 already completed).
Threshold check: 34 tasks > 5 minimum → proceeding with team detection.
Step 3: Stream Detection
The file-conflict analysis runs:
Stream Detection Results (file-conflict analysis):
Streams found:
backend-dev (20 tasks) — files: backend/src/, backend/tests/, backend/migrations/
frontend-dev (8 tasks) — files: frontend/src/
e2e-tester (2 tasks) — files: frontend/e2e/
qa-security (4 repo-wide tasks + final verification)
Parallel groups:
backend-dev + frontend-dev (no file conflicts)
Dependencies:
frontend-dev US1 tasks (T013-T015) wait for backend-dev US1 API tasks (T011-T012)
frontend-dev US2 tasks (T020-T022) wait for backend-dev US2 API tasks (T018-T019)
e2e-tester waits for both backend API + frontend pages
Phase 2 blocks all user stories
qa-security runs final verification after all streams
Total: 34 unchecked tasks + qa verification
Proceed with this team structure? [Y/adjust/cancel]
You type Y.
Step 4-5: Dependency Graph and Task Hydration
Building dependency graph...
Phase dependencies: 6 phase transitions
Story dependencies: 8 cross-stream blocking relationships
[P] parallel groups: 12 tasks can run in parallel within phases
Task Hydration Complete:
Created: 34 tasks
Dependencies: 14 blocking relationships
Streams: backend-dev (20), frontend-dev (8), e2e-tester (2), qa-security (4)
Create agent team and begin execution? [Y/cancel]
You type Y.
Step 6: Team Creation
Creating team: deploy-pipeline
Spawning backend-dev... ✓
Spawning frontend-dev... ✓
Spawning e2e-tester... ✓
Spawning qa-security... ✓
Team ready. Entering delegate mode for autonomous coordination.
Press Shift+Up/Down to interact with individual teammates.
Press Ctrl+T to toggle the task list.
Step 7: Autonomous Execution
The lead assigns Phase 1 setup tasks and begins the coordination loop. You can watch progress in the terminal:
[backend-dev] T001 ✓ Created project structure
[backend-dev] T002 ✓ Initialized Rust workspace
[frontend-dev] T003 ✓ Initialized React app
[backend-dev] T004 ✓ Created database migrations
[backend-dev] T005 ✓ Implemented connection pool ← parallel with T006
[backend-dev] T006 ✓ Configured middleware ← parallel with T005
[frontend-dev] T007 ✓ Setup React Router
[lead] Phase 2 complete. Triggering qa-security verification.
[qa-security] Running cargo clippy... PASS
[qa-security] Running cargo test... PASS
[qa-security] Running npm test... PASS
[qa-security] Phase 2: PASS
[lead] Phase 3 unblocked. Assigning US1 tasks.
[backend-dev] T008 ✓ Created App model ← parallel with T009
[backend-dev] T009 ✓ Created Deploy model ← parallel with T008
[backend-dev] T010 ✓ Implemented AppService
[backend-dev] T011 ✓ Implemented POST /api/apps
[lead] T011 complete → unblocking frontend-dev T013, T014
[backend-dev] T012 ✓ Implemented POST /api/apps/{slug}/deploy
[frontend-dev] T013 ✓ Created AppList component ← started after T011
[frontend-dev] T014 ✓ Created DeployButton component ← parallel with T013
[frontend-dev] T015 ✓ Created DeployPage
...
[lead] All implementation tasks complete. Triggering final gate.
[qa-security] Running comprehensive verification...
[qa-security] Quality gates: ALL PASS
[qa-security] Security review: No vulnerabilities found
[qa-security] Task integrity: All features wired up, no stubs
[qa-security] FINAL GATE: PASS
Step 8: Sync Back
Team Implementation Complete
Feature: deploy-pipeline
Tasks completed: 34/34
Streams used: backend-dev (20 tasks), frontend-dev (8 tasks),
e2e-tester (2 tasks), qa-security (4 tasks)
Quality Gate Results:
cargo clippy -- -D warnings: PASS
cargo test: PASS
npm test: PASS
npm run test:e2e: PASS
Security Review: APPROVED
- Input validation on all 6 API endpoints
- No hardcoded secrets found
- OWASP Top 10 checks passed
- RLS enforcement verified
Commits created: 38 (34 implementation + 4 verification)
Next steps:
- Review changes: git log --oneline HEAD~38..HEAD
- Create PR when ready
Part 6: When to Use Each Approach
Decision Matrix
| Scenario | Recommendation | Why |
|---|---|---|
| < 5 tasks remaining | Ralph Loop or /speckit.implement | Team coordination overhead exceeds parallelism benefit |
| 5+ tasks, multiple directories | /speckit.team-implement | Maximum parallelism benefit |
| 5+ tasks, single directory | /speckit.team-implement | qa-security reviewer still adds value |
| Strictly sequential, no [P] markers | Ralph Loop | No parallelism opportunity |
| Fresh-context isolation per task | Ralph Loop | Each iteration is a clean slate |
| Overnight unattended execution | Ralph Loop | More predictable, simpler failure modes |
| Interactive development with oversight | /speckit.team-implement | Watch agents work, interact directly |
Token Cost Considerations
Agent Teams use significantly more tokens than sequential execution. Each teammate has its own context window and reads the same spec files independently. From the Agent Teams docs: “Token usage scales with the number of active teammates.”
A rough estimate:
| Approach | Token multiplier | Best for |
|---|---|---|
| /speckit.implement | 1x (baseline) | < 20 tasks, single session |
| Ralph Loop | ~1.2x (fresh context overhead) | Large features, overnight |
| /speckit.team-implement (3 agents) | ~3-4x | Parallelizable work, wall-clock time matters |
| /speckit.team-implement (5 agents) | ~5-6x | Large cross-layer features |
For large features where parallelism saves significant wall-clock time, it’s worth it. For smaller work, stick with sequential execution.
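To make the trade-off concrete, here is illustrative arithmetic only; the baseline cost and the 4x multiplier are assumptions taken from the table above, not measurements:

```shell
# Hypothetical: a sequential run costing ~1M tokens, rerun as a 3-agent team
baseline_tokens=1000000
team_tokens=$(( baseline_tokens * 4 ))   # ~4x multiplier for 3 agents
echo "team run: ~$team_tokens tokens"
```

If that 4x token cost buys a 2-3x wall-clock reduction on a feature you are actively waiting on, it is usually a good trade; on a 10-task feature it rarely is.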
Part 7: Design Decisions and Lessons Learned
File Ownership Is Everything
Every community pattern on Agent Teams emphasizes this point. Addy Osmani: file ownership prevents conflicts. Paddo: “Coarser-grained specialization, avoiding over-fragmentation.” Kieran Klaassen: each teammate must own distinct files.
The file-conflict graph exists to make file ownership a hard guarantee, not a soft guideline. If two tasks touch the same file, they’re in the same stream. Period. The algorithm enforces this at detection time.
2-4 Specialists, Not More
Early designs considered spawning one agent per user story. For a feature with 6 user stories, that’s 6 implementation agents plus qa-security. But more agents means more coordination overhead, more context switching for the lead, and diminishing returns.
The sweet spot from community experience is 2-4 specialist agents. File-conflict analysis naturally produces this: most projects have 2-3 major directory trees.
The Agent Teams docs recommend 5-6 tasks per teammate to keep everyone productive without idle time. With 34 tasks and 3 implementation agents, that's ~11 tasks each: a bit high, but manageable with good task ordering.
Self-Organizing Beats Central Planning
The initial design had the lead micro-managing every task assignment. The current design is closer to a self-organizing swarm: agents check the shared task list, pick available tasks, and work autonomously. The lead only intervenes for gate checks and failure handling.
From the Agent Teams docs: “After finishing a task, a teammate picks up the next unassigned, unblocked task on its own.” This natural load balancing means faster streams automatically pick up slack.
Spec Quality Determines Output Quality
This was already true with Ralph Loop, but Agent Teams amplify it. With sequential execution, you can course-correct ambiguity mid-stream in a conversation. With parallel agents working independently, vague specs lead to inconsistent assumptions across teammates.
A backend agent might interpret “deploy configuration” differently than a frontend agent if the spec isn’t precise. The investment in /speckit.specify -> /speckit.plan -> /speckit.tasks pays off even more with parallel execution.
As Addy Osmani put it: “The better your specs, the better the agent output.”
The Reviewer Changes Everything
The qa-security gatekeeper addresses a real weakness of both single-agent implementation and Ralph Loop: who watches the watcher? When the same AI implements and verifies, it has blind spots. A separate agent with a fresh context window catches things the implementer missed.
Even for single-directory projects with only 5-6 tasks, having a dedicated reviewer is worth the token cost. It catches:
- Quality gate failures that the implementer might ignore
- Security issues that pass local tests but violate OWASP guidelines
- Placeholder implementations marked as “complete”
- Undocumented TODOs left behind
Context Compaction Enables Lead Longevity
The Opus 4.6 announcement introduced context compaction: “Automatically summarizes older conversation context to enable longer-running tasks.” This is critical for the lead’s coordination loop, which may run for hours across dozens of task completions and qa-security cycles.
Without compaction, the lead’s context would fill up with task assignment messages and status updates. With compaction, older coordination messages get summarized while recent context stays sharp.
Part 8: Future Directions
Hybrid Execution
For very large features (100+ tasks), a hybrid might work best: Agent Teams for the parallelizable phases, Ralph Loop for the sequential setup and foundation phases. Let the system pick the right tool per phase.
Cross-Agent Learning
Currently, if the backend agent discovers a useful pattern (e.g., “this project uses a specific error handling convention”), the frontend agent doesn’t know about it. A shared “discoveries” file or broadcast mechanism could help.
Cost Optimization
Use Sonnet for implementation agents (fast, good at code generation) and Opus for qa-security review (deeper reasoning for security analysis). The Agent Teams docs confirm you can specify models per teammate.
Adaptive Stream Sizing
If one stream finishes much earlier than another, could we split the remaining work? Currently streams are fixed at creation time. Dynamic rebalancing would improve throughput.
Integration with CI/CD
The final qa-security gate could trigger an actual CI pipeline instead of (or in addition to) running gates locally. This would provide an additional layer of verification.
Quick Reference
# Enable Agent Teams (required, one-time setup)
# Add to ~/.claude/settings.json:
# { "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }
# Standard Spec Kit workflow (unchanged)
/speckit.specify # Create feature specification
/speckit.plan # Generate implementation plan
/speckit.tasks # Break plan into tasks
# Sequential execution options
/speckit.implement # Single-agent, all tasks in one session
# or Ralph Loop for fresh-context per task
# Parallel execution (new)
/speckit.team-implement # Auto-detect streams, execute in parallel
/speckit.team-implement --dry-run # Preview detection without executing
/speckit.team-implement --streams 2 # Limit parallel streams
/speckit.team-implement --require-plan-approval # Approve teammate plans before implementation
# During execution
# Shift+Up/Down → navigate between teammates
# Shift+Tab → toggle delegate mode
# Enter → view a teammate's session
# Escape → interrupt a teammate's turn
# Ctrl+T → toggle the task list
Conclusion
/speckit.team-implement bridges two worlds: Spec Kit’s structured planning artifacts and Claude Code’s new Agent Teams for parallel execution. The file-conflict analysis algorithm ensures safe parallelism without file conflicts. The mandatory qa-security gatekeeper ensures quality doesn’t degrade with parallelism. And the autonomous coordination loop keeps everything running until the feature is complete and verified.
The key insights:
- File paths are the source of truth for parallelism — not keywords, not directory names, but the actual files each task touches
- File ownership must be a hard guarantee — the conflict graph enforces this by construction
- A dedicated reviewer catches what implementers miss — qa-security is always present
- Spec quality is amplified by parallelism — vague specs lead to divergent implementations across agents
- Self-organizing beats micro-managing — let agents claim tasks from the shared list
If you’re using Spec Kit and building features that span multiple directories, Agent Teams can cut your wall-clock time significantly. Enable the experimental flag, run /speckit.team-implement, and let the agents work in parallel while you watch.
Resources
- Claude Opus 4.6 Announcement (February 5, 2026)
- Agent Teams Documentation
- Previous Article: Spec Kit + Ralph Loop
- Addy Osmani on AI Agent Teams
- Paddo on Claude Code Swarms
- Kieran Klaassen on Agent Team Patterns
Written by Dominic Böttger
← Back to blog