A stack of nested loops with explicit exit criteria.
The pipeline runs as the main Claude Code session. Each phase has a contract; each loop has an exit checklist; each gate is enforced by a hook that exits with code 2 until the evidence is real. The README is the source of truth; this page is the visual map.
Trace a run, phase by phase
A live signal travels the spine of the pipeline. Hover any phase to inspect its owners, the artifact it emits, and the contract that must be true before the next phase fires.
100%-coverage hard gate. 12-condition exit checklist.
- 01 openspec validate --all --strict must be valid.
- 02 Every requirement has ≥ 1 measurable scenario.
- 03 Reuse decisions cite real files in CODEBASE_MAP.
- 04 No duplicate capabilities. Phase 2 cannot start until clean.
10 phases at a glance
Every phase, full detail
Build CODEBASE_MAP, ROUTE_MAP, DESIGN_MAP, INTEGRATION_MAP.
- ›Cartographer + route-mapper produce per-codebase maps.
- ›3 codebase-map-reviewer agents argue in parallel until all return ok.
- ›Integration mapping converges 3 explorers → master-synthesizer.
- ›Freshness short-circuit: skip mapping if last_mapped ≥ git HEAD.
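The freshness short-circuit can be sketched as a timestamp comparison. This is a minimal sketch, not the pipeline's actual check: it assumes `last_mapped` is an ISO-8601 timestamp and that "git head" means the committer date of the HEAD commit. The function names are hypothetical.

```python
from datetime import datetime


def parse_iso(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def map_is_fresh(last_mapped: str, head_committed_at: str) -> bool:
    """True when the map was built at or after the HEAD commit (assumed meaning)."""
    return parse_iso(last_mapped) >= parse_iso(head_committed_at)
```

One way to obtain the HEAD timestamp is `git log -1 --format=%cI`, which prints the committer date in strict ISO-8601 form.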
Normalize OpenSpec / Superpowers / plain markdown briefs.
- ›Orchestrator inspects the requirements folder.
- ›Detects format and converts to a single internal contract.
- ›Initializes coverage-map.json.
100%-coverage hard gate. 12-condition exit checklist.
- ›openspec validate --all --strict must be valid.
- ›Every requirement has ≥ 1 measurable scenario.
- ›Reuse decisions cite real files in CODEBASE_MAP.
- ›No duplicate capabilities. Phase 2 cannot start until clean.
Parallel, non-overlapping teammates with their own 1M context.
- ›Long-lived named teammates (Agent Teams mode) or ephemeral subagents.
- ›Shared task list; SendMessage for direct teammate-to-teammate messaging.
- ›Plan-approval triggers gate the dispatch.
Hook-enforced. 12 self-review fields + independent review.
- ›PostToolUse(TaskUpdate) blocks completion until evidence is complete.
- ›Visual-fidelity, test-completeness, integration & UI-interaction reviews.
- ›Independent reviewer ≠ teammate. 3 consecutive rejections → escalation handoff.
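The rejection rule can be pictured as a streak counter over the review history. This is a sketch under assumptions: the verdict strings ("pass", "reject") and the action names are hypothetical, and the limit of 3 is taken from the text above.

```python
def next_action(verdicts: list[str], limit: int = 3) -> str:
    """Decide what follows the latest independent-review verdict.

    verdicts is the review history, oldest first.
    """
    # Count rejections at the tail of the history; a pass resets the streak.
    streak = 0
    for verdict in reversed(verdicts):
        if verdict != "reject":
            break
        streak += 1
    if streak >= limit:
        return "escalate"
    if verdicts and verdicts[-1] == "pass":
        return "complete"
    return "rework"
```

Counting only the trailing streak means an intervening pass resets the counter, which is one reading of "consecutive".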
Shared boundaries; contract sync between teammates.
- ›No new feature code at this phase.
- ›Resolve cross-team shared types and API contracts.
Real backend. Playwright. Visual-fidelity. UI interaction.
- ›Full-stack tests run against the real running app.
- ›Editability + visual verification teams independently re-verify.
- ›Test-failure RCA: forward + backward + alt-hypotheses, mandatory.
Per-task-group dependency graph + ledger.
- ›Iterate task groups in dependency order.
- ›Solution Requirements (SRs) auto-spawn fix teams on every surfaced issue.
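Iterating task groups in dependency order is a topological sort. A minimal sketch using Python's standard `graphlib`; the function name and the graph shape (group → set of groups it depends on) are assumptions, not the pipeline's ledger format.

```python
from graphlib import TopologicalSorter


def group_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return task groups so that every group comes after its dependencies.

    Raises graphlib.CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

For a hypothetical chain where `ui` depends on `api` and `api` depends on `schema`, the groups come back as `schema`, `api`, `ui`.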
Coverage map fully green; re-spawn on gap.
- ›Master review verdict must be overall: pass.
- ›Any gap re-spawns the originating team.
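The re-spawn rule amounts to collecting the owners of every non-green entry. The shape of coverage-map.json is not shown on this page, so the layout below (requirement id → `status` and `team`) and the literal `"green"` are purely hypothetical.

```python
def teams_to_respawn(coverage: dict[str, dict]) -> list[str]:
    """Return the teams that own at least one requirement that is not green."""
    teams = {
        entry["team"]
        for entry in coverage.values()
        if entry["status"] != "green"
    }
    return sorted(teams)
```

An empty result corresponds to a fully green map.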
Per requirement → commit → test → demo. Auto-push.
- ›Stop-hook completion audit verifies the run is actually clean.
- ›Auto-commit + push on green. openspec archive on success.
- ›Opt out with --no-commit / --no-push / --no-compact.
How flow is decided
Phase 3 review gate
Every TaskUpdate(completed) on a teammate-owned task is gated. The hook exits 2 (block) until the 12-field evidence schema + independent_review are valid. 3 consecutive rejections → escalation handoff.
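A gate of this kind reduces to "validate the evidence, return 2 if anything is missing". The sketch below is not the shipped hook: it assumes the 12 fields are the 12 top-level keys of the evidence example on this page, and it checks only for presence plus a passing `independent_review.verdict`.

```python
# Assumed to be the 12 self-review fields (top-level keys of the example).
REQUIRED_FIELDS = (
    "task_id", "spec_review", "quality_review", "real_not_stubbed",
    "tests", "demo_artifact", "files_changed", "reuse_compliance",
    "visual_fidelity_review", "test_completeness_review",
    "integration_testing_review", "ui_interaction_review",
)


def evidence_errors(evidence: dict) -> list[str]:
    """List every reason the evidence is not yet acceptable."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if name not in evidence
    ]
    review = evidence.get("independent_review")
    if not isinstance(review, dict):
        errors.append("missing independent_review")
    elif review.get("verdict") != "pass":
        errors.append("independent_review.verdict is not pass")
    return errors


def hook_exit_code(evidence: dict) -> int:
    """0 lets the TaskUpdate through; 2 blocks it."""
    return 2 if evidence_errors(evidence) else 0
```

A real hook script would print the errors to stderr and exit with this code, so the agent sees why the completion was blocked.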
Issue → fix routing
Every surfaced issue becomes a Solution Requirement. Test-failure origins route through diagnostic research first; editability + interaction gaps go straight to a fix team. The loop closes when the originating check passes.
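Routing can be read straight off the `routing` block of a Solution Requirement. A sketch assuming the SR shape from the example on this page; the stage labels are hypothetical.

```python
def route(sr: dict) -> list[str]:
    """Return the ordered stages a Solution Requirement passes through."""
    stages = []
    # Test-failure origins go through diagnostic research first.
    if sr["routing"]["diagnostic_research_required"]:
        stages.append("diagnostic-research")
    stages.append("fix-team:" + sr["routing"]["fix_team"])
    # The loop closes only when the originating check passes again.
    stages.append("recheck:" + sr["origin"]["kind"])
    return stages
```

Ending every route with a re-run of the originating check keeps the closing condition explicit.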
Stop-hook completion audit
Blocks the orchestrator from ending a run while INCOMPLETE state exists (open SRs, unsatisfied editability loops, master verdict ≠ pass, dev-loop ceiling exceeded). Also gates the Phase 8 auto-commit.
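The audit is a conjunction of the four conditions listed above. A sketch with a hypothetical state dictionary; every key name is an assumption, and only the four conditions come from the text.

```python
def incomplete_reasons(state: dict) -> list[str]:
    """Collect every reason the run still counts as INCOMPLETE."""
    reasons = []
    if state.get("open_srs", 0) > 0:
        reasons.append("open SRs")
    if state.get("unsatisfied_editability_loops", 0) > 0:
        reasons.append("unsatisfied editability loops")
    if state.get("master_verdict") != "pass":
        reasons.append("master verdict is not pass")
    if state.get("dev_loop_ceiling_exceeded", False):
        reasons.append("dev-loop ceiling exceeded")
    return reasons


def stop_hook_exit_code(state: dict) -> int:
    """2 blocks the stop (and the auto-commit); 0 allows it."""
    return 2 if incomplete_reasons(state) else 0
```

A missing `master_verdict` counts as incomplete here, so an absent verdict cannot pass the gate by default.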
Three hooks. exit 2 until evidence is real.
Every gate in the pipeline is enforced by a hook that blocks completion until the contract is satisfied. No agent can mark its own work done.
review-gate evidence (v6 + independent review)
teammate-idle review-gate re-check
pipeline-completion audit (terminal gate)
What's on disk when a phase passes
Every gate reads a JSON file. Below: the exact shape the Phase 3 hook requires, and the Solution Requirement the orchestrator picks up to spawn a fix team.
{
  "task_id": "T-042-add-invoice-export",
  "spec_review": "pass",
  "quality_review": "pass",
  "real_not_stubbed": true,
  "tests": { "added": 4, "passing": 4 },
  "demo_artifact": "demos/T-042-export.mp4",
  "files_changed": [
    "apps/web/routes/invoices.export.tsx",
    "apps/api/handlers/invoices/export.ts",
    "tests/e2e/invoices.export.spec.ts"
  ],
  "reuse_compliance": "ok",
  "visual_fidelity_review": "pass",
  "test_completeness_review": "pass",
  "integration_testing_review": "pass",
  "ui_interaction_review": "pass",
  "independent_review": {
    "reviewer": "task-reviewer",
    "verdict": "pass",
    "spec_review": "pass",
    "quality_review": "pass",
    "real_not_stubbed": true,
    "reuse_compliance": "ok",
    "reviewed_at": "2026-05-31T14:22:08Z"
  }
}

{
  "id": "SR-2026-05-31-014",
  "status": "open",
  "origin": {
    "kind": "playwright-failure",
    "test": "tests/e2e/invoices.export.spec.ts",
    "discovered_by": "interaction-reviewer",
    "discovered_at": "2026-05-31T14:18:51Z"
  },
  "summary": "Export button fires request but never resolves; spinner hangs.",
  "acceptance_criteria": [
    "Clicking Export downloads a CSV within 3s for ≤1k rows.",
    "Failure path surfaces a toast and re-enables the button.",
    "Playwright covers both success and failure flows."
  ],
  "routing": {
    "diagnostic_research_required": true,
    "fix_team": "frontend+backend"
  }
}

12 conditions. all must hold. no iteration cap.
Phase 2 cannot start until every condition is satisfied. The orchestrator runs the checklist each iteration; failures route to the proposal-refiner.
- 01 openspec validate --all --strict returns valid: true.
- 02 Every artifact (proposal, specs, design, tasks) has status: done.
- 03 Every source requirement has ≥ 1 scenario.
- 04 Every requirement's acceptance criteria are measurable.
- 05 Every front-end requirement has an explicit Playwright user-flow spec.
- 06 Every back-end requirement has explicit dev-API integration test criteria.
- 07 Every both-layer requirement has a front-to-back integration criterion (or recorded mock_testing_authorized opt-out).
- 08 Every new module / file / dep in design.md has a Reuse Decision citing CODEBASE_MAP.md.
- 09 Every Reuse Decision cites a file/symbol that actually exists.
- 10 No duplicate capabilities (cross-checked via CODEBASE_MAP / INTEGRATION_MAP).
- 11 Every new third-party dep has a documented comparison against the existing stack.
- 12 tasks.md creates a new file only where existing files cannot be extended.
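Conditions 01 and 02 can be checked mechanically from the CLI's JSON output. A sketch assuming the output shapes match the transcript on this page; the function name is hypothetical and the other ten conditions are out of scope here.

```python
import json


def phase1_cli_conditions(validate_out: str, status_out: str) -> dict[str, bool]:
    """Evaluate checklist conditions 01 and 02 from raw CLI JSON output."""
    validate = json.loads(validate_out)
    status = json.loads(status_out)
    artifacts = ("proposal", "specs", "design", "tasks")
    return {
        # 01: strict validation must report valid: true.
        "01": validate.get("valid") is True,
        # 02: every artifact must have status "done".
        "02": all(status.get(name) == "done" for name in artifacts),
    }
```

Comparing with `is True` rejects truthy stand-ins such as the string "true", which keeps the gate strict.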
$ openspec validate --all --strict --json
{ "valid": true, "errors": [] }
$ openspec status --json
{ "proposal": "done", "specs": "done", "design": "done", "tasks": "done" }
▣ Phase 1 exit checklist
[12/12] all conditions satisfied → unlocking Phase 2 (team-spawn)
Three failure modes. one hook field.
Ships as: page.request.post('/api/...') — bypasses the UI entirely.
Caught by: interaction-completeness flags zero genuine page.click on a non-stub control.
Ships as: Route wired to <ComingSoon /> while the design specifies a real screen.
Caught by: Every route enumerated and classified live / placeholder / confirmed-stub.
Ships as: Mockup's 'Welcome back, Sarah' shipped to every user.
Caught by: dynamic-value-discovery classifies from context, not from the literal.
# ui_interaction_review takes "pass" | "n/a" | "fail"
#
# pass — every interactive element genuinely user-flow-tested,
# every page live, every value correctly static or
# dynamically bound, OR a confirmed-stub.
# n/a — slice has no UI surface. REQUIRES non-empty
# ui_interaction_review_note.
# fail — BLOCKED by the hook. An unwired-control / placeholder-page /
# hardcoded-dynamic-value gap must be escalated as an SR,
# not marked complete.
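The three-value rule above translates into a small validator. A sketch only: the field names come from the comment block, while the function name and error wording are hypothetical.

```python
def ui_review_errors(evidence: dict) -> list[str]:
    """Return the reasons ui_interaction_review would be rejected, if any."""
    value = evidence.get("ui_interaction_review")
    if value == "pass":
        return []
    if value == "n/a":
        # n/a is only valid with a non-empty explanatory note.
        note = evidence.get("ui_interaction_review_note", "")
        if isinstance(note, str) and note.strip():
            return []
        return ["n/a requires a non-empty ui_interaction_review_note"]
    if value == "fail":
        return ["fail is blocked: escalate the gap as an SR"]
    return [f"unknown ui_interaction_review value: {value!r}"]
```

A note that contains only whitespace is treated as empty, so a blank note cannot satisfy the n/a rule.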