
Reviewer

The Reviewer is a senior engineer conducting code review of work completed by a Developer agent. Its primary job is verifying alignment between the code and the product specification, not just the task description. The task’s acceptance criteria may be incomplete or a misinterpretation of the spec. The spec is the source of truth.

| Property | Value |
| --- | --- |
| Command | speed review (or speed review --task-id ID) |
| Assembly function | assemble_reviewer |
| Model tier | support_model (Sonnet) |
| Trigger | Manual, or automatic after task completion |

| Input | Source | Description |
| --- | --- | --- |
| Git diff | Worktree branch | The code changes under review |
| Task acceptance criteria | Task JSON | What the task says to do |
| Product specification | specs/product/<feature>.md | What the product actually needs (source of truth) |
| Blast radius data | CSG analysis | Downstream impact of changed symbols |
| Spec verification matrix | Plan output | Requirement-to-task traceability |

| Output | Location | Description |
| --- | --- | --- |
| Review report | Stdout (JSON) | Verdict, spec verification, missing items, out-of-scope code, issues |

  1. Read the product spec first. Before looking at the diff, form an independent understanding of what the code should accomplish. Write down expectations in the output.

  2. Check the diff against the spec. For every identified requirement, check whether the diff satisfies it. Quote specific lines from the product spec for each verification. If code cannot be traced to a spec line, flag it as potentially out of scope.

  3. Check for what is missing. Identify what the spec requires that the diff does not implement.

  4. Standard code review. Only after spec verification, review for code quality, convention adherence (CLAUDE.md), security, testing coverage, and performance.

The review pipeline iterates over completed tasks, running one Reviewer agent per task with a post-review Guardian gate.

speed review [--task-id ID]
├─ 1. Gather tasks to review
│ └─ All "done" tasks (or specific --task-id)
│ └─ Skip already-approved tasks
├─ 2. Per task:
│ ├─ a. Get git diff from worktree branch
│ ├─ b. Load relevant specs (smart loader)
│ ├─ c. Build Layer 2 context (reviewer stage)
│ ├─ d. Assemble reviewer prompt
│ ├─ e. Send to Reviewer agent (Sonnet, read-only)
│ ├─ f. Parse verdict
│ │ ├─ approve → mark reviewed
│ │ └─ request_changes → mark for retry
│ └─ g. Post-review Guardian gate (on approve)
│ ├─ aligned → keep approved
│ ├─ flagged → warn but keep
│ └─ rejected → override to request_changes
└─ 3. Collect and print nits from approved tasks
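
The verdict handling in steps f and g can be sketched as a small decision function. This is a minimal illustration of the flow in the tree above, not the actual implementation: the function name resolve_status is hypothetical, and a guardian_verdict of None is an assumed way to model a skipped Guardian gate.

```python
from typing import Optional


def resolve_status(reviewer_verdict: str, guardian_verdict: Optional[str] = None) -> str:
    """Return the task status after review and the optional Guardian gate.

    Hypothetical helper: mirrors steps f and g of the pipeline tree.
    """
    if reviewer_verdict == "request_changes":
        # Marked for retry; the Guardian gate only runs on approve.
        return "request_changes"
    if reviewer_verdict != "approve":
        raise ValueError(f"unknown reviewer verdict: {reviewer_verdict!r}")
    if guardian_verdict == "rejected":
        # A rejection overrides the Reviewer's approval.
        return "request_changes"
    # "aligned" keeps the approval, "flagged" warns but keeps it,
    # and None (assumed here) means the gate was skipped.
    return "reviewed"
```

Only a Guardian rejection changes an approved outcome; a flagged verdict is informational in this sketch.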

cmd_review (lib/cmd/review.sh:82-307) gets the diff via git_diff_branch, comparing the task branch against the main branch. Diffs exceeding DIFF_HEAD_LINES are truncated with a note pointing to the full branch.
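
The truncation step might look roughly like the following. The real logic lives in the shell script; this Python sketch only illustrates the idea. The function name truncate_diff and the wording of the note are assumptions, and head_lines stands in for DIFF_HEAD_LINES, whose actual value is not stated here.

```python
def truncate_diff(diff: str, head_lines: int, branch: str) -> str:
    """Keep the first head_lines lines of a diff and append a pointer note.

    Hypothetical helper; the note text is illustrative only.
    """
    lines = diff.splitlines()
    if len(lines) <= head_lines:
        return diff  # short enough, return unchanged
    omitted = len(lines) - head_lines
    kept = lines[:head_lines]
    kept.append(
        f"[diff truncated: {omitted} more lines; see branch {branch} for the full diff]"
    )
    return "\n".join(kept)
```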

_load_relevant_specs performs a smart load: the primary product spec is marked “SOURCE OF TRUTH”, and related specs are included with their relative paths. Only specs relevant to the diff are loaded, not the full set.

context_build_layer2_task builds Layer 2 for the reviewer stage. assemble_reviewer (lib/context/assembly.py:855-1005) builds a prompt with a 60k-token budget:

| Section | Budget share | Content |
| --- | --- | --- |
| Task identity + rationale | | What this task should achieve |
| Acceptance criteria | | Table format with verify_by methods |
| Criteria verification | | Automated check results if available |
| Assumptions to verify | | Planning inferences not in the spec |
| Git diff | ~40% | Truncated to fit budget |
| Impact assessment | | Bridge symbols, blast radius table from CSG |
| Spec requirements | | Relevant spec sections with alignment status |
| Cross-cutting constraints | | Feature-level rules the diff must respect |
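
To make the diff share concrete: roughly 40% of a 60k-token budget is about 24,000 tokens. The sketch below shows that arithmetic and one way a section could be cut to fit. It is not the code in assemble_reviewer; the constant names, the helper names, and the four-characters-per-token estimate are all assumptions made for illustration.

```python
TOTAL_BUDGET_TOKENS = 60_000   # from the prompt budget described above
DIFF_SHARE_PERCENT = 40        # approximate share given in the table
CHARS_PER_TOKEN = 4            # assumption: crude estimate, not a real tokenizer


def diff_token_budget(total: int = TOTAL_BUDGET_TOKENS,
                      share_percent: int = DIFF_SHARE_PERCENT) -> int:
    """Tokens available to the git diff section (integer arithmetic)."""
    return total * share_percent // 100


def fit_to_budget(text: str, max_tokens: int,
                  chars_per_token: int = CHARS_PER_TOKEN) -> str:
    """Cut text so its estimated token count stays within max_tokens."""
    max_chars = max_tokens * chars_per_token
    return text if len(text) <= max_chars else text[:max_chars]
```

A real assembler would likely count tokens with the model's tokenizer and cut on line boundaries rather than at an arbitrary character.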

The Reviewer runs with Sonnet and read-only tools. On approve, the task status updates to “reviewed.” On request_changes, task_request_changes writes review feedback into the task JSON for the Developer’s retry.

For approved tasks (unless SKIP_GUARDIAN=true), _run_guardian "post-review" (lib/shared.sh:28-146) sends the diff and the acceptance criteria to the Product Guardian. A rejected verdict overrides the Reviewer’s approval, setting the task back to request_changes with the Guardian’s summary as feedback. Nit-level and minor issues from approved reviews are collected and printed as a batch at the end.
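
The nit batching at the end can be sketched as a filter over the review reports. This is an illustrative sketch only: collect_nits is a hypothetical name, the input shape (pairs of task ID and parsed report) is assumed, and the output line format is loosely modeled on the sample output rather than copied from the tool. It looks only at the Reviewer verdict and does not account for a later Guardian override.

```python
from typing import Any, Dict, Iterable, List, Tuple

NON_BLOCKING = ("nit", "minor")


def collect_nits(reviews: Iterable[Tuple[int, Dict[str, Any]]]) -> List[str]:
    """Gather nit and minor issues from approved review reports.

    Hypothetical helper; reviews is an iterable of (task_id, report) pairs.
    """
    lines = []
    for task_id, report in reviews:
        if report.get("verdict") != "approve":
            continue  # only approved reviews contribute to the batch
        for issue in report.get("issues", []):
            if issue.get("severity") in NON_BLOCKING:
                lines.append(
                    f"Task {task_id}: [{issue['severity']}] "
                    f"{issue['file']}:{issue['line']} - {issue['message']}"
                )
    return lines
```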

Task 3 of library-app: “Add book availability GraphQL query.” The Developer completed it on branch speed/task-3.

Acceptance criteria table:

| # | Criterion | Verify By |
| --- | --- | --- |
| 1 | BookQuery.books returns all books with status field | test |
| 2 | BookQuery.book(id) returns single book with borrowing history | test |
| 3 | availability_check resolver returns true/false based on Book.status | test |

Impact assessment:

| Symbol | Blast Radius | Stability | Dependents |
| --- | --- | --- | --- |
| BookType | 6 | core | 6 |
| BookQuery | 4 | stable | 4 |

Spec requirement:

From product § Library Catalog: “Users can search for books and see real-time availability status.”

{
  "verdict": "approve",
  "spec_verification": [
    {
      "spec_quote": "Users can search for books and see real-time availability status",
      "spec_section": "Library Catalog",
      "satisfied": true,
      "evidence": "schema/book.py:15 — BookQuery.books includes status field via BookType",
      "notes": "Search is basic name matching, not full-text. Acceptable for MVP."
    }
  ],
  "missing_from_spec": [],
  "out_of_scope": [],
  "issues": [
    {
      "severity": "nit",
      "file": "src/backend/schema/book.py",
      "line": 28,
      "message": "availability_check could be a computed field on BookType instead of a separate resolver",
      "suggestion": "Use @hybrid_property on the model for cleaner API"
    }
  ],
  "strengths": ["Good test coverage", "Consistent with existing schema patterns"]
}

═══ Code Review ═══
Reviewing task 3: Add book availability GraphQL query
... (agent output) ...
✓ Task 3: approved by Reviewer
Running Product Guardian (post-review)...
✓ Guardian: ALIGNED — Implementation serves core mission
Nits (non-blocking, from approved reviews):
Task 3: [nit] schema/book.py:28 — availability_check could be a computed field
  • Read-only access. Cannot modify files.
  • The spec is the source of truth. If the task says “implement X” but the spec says “implement Y”, the code should implement Y.
  • Quote the spec for every verification. If a citation cannot be provided, the requirement may be fabricated or the code may be out of scope.
  • Flag over-engineering: unnecessary abstractions, premature generalizations, frameworks nobody asked for.
  • Do not approve things you are not sure about. Mark uncertain items and explain the uncertainty.

The following checks are injected at runtime when the Reviewer is reviewing a defect fix. In addition to the standard review, the Reviewer checks:

Defect addressed. Does the fix resolve the specific issue in the defect report and triage hypothesis? For moderate defects, verify the failing test passes for the right reason, not by changing the test or masking the defect.

Scope containment. All changed files must be within the affected_files list from the triage output. Renamed variables, reformatted code, updated comments, and reorganized code outside the fix are flagged.
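
The file-level part of scope containment reduces to a set difference. The sketch below assumes the changed file paths and the triage affected_files list use the same relative path form; the function name scope_violations is borrowed from the output field of the same name but the implementation is hypothetical.

```python
from typing import Iterable, List


def scope_violations(changed_files: Iterable[str],
                     affected_files: Iterable[str]) -> List[str]:
    """Return changed files that are not listed in affected_files, sorted."""
    allowed = set(affected_files)
    return sorted(path for path in set(changed_files) if path not in allowed)
```

This catches only out-of-list files. Unrelated renames or reformatting inside an allowed file still require the Reviewer to read the diff.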

No new behavior. The fix should not introduce behavior beyond correcting the defect: no new endpoints, UI states, return type changes, or dependencies.

Regression risk. Based on regression_risks from triage, assess whether changed code paths are covered by the test suite and whether the happy path behavior is preserved.

Large diff flag. If the diff exceeds 100 lines changed or touches more than 3 files, the Reviewer flags it. The orchestrator uses the flag to decide whether to invoke the Product Guardian.
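
The large diff rule is a simple threshold check. In this sketch the thresholds come from the text above; the function name and the way line and file counts are obtained are assumptions.

```python
def is_large_diff(lines_changed: int, files_touched: int,
                  max_lines: int = 100, max_files: int = 3) -> bool:
    """True if the diff exceeds 100 changed lines or touches more than 3 files."""
    return lines_changed > max_lines or files_touched > max_files
```

Both comparisons are strict, so a diff of exactly 100 lines across exactly 3 files is not flagged.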

{
  "verdict": "approve | request_changes",
  "spec_verification": [
    {
      "spec_quote": "Exact text from the product spec",
      "spec_section": "Section name",
      "satisfied": true,
      "evidence": "file:line reference or description of what's missing",
      "notes": "Any concerns"
    }
  ],
  "missing_from_spec": [
    { "spec_quote": "What the spec says", "description": "What's missing from the implementation" }
  ],
  "out_of_scope": [
    { "file": "path/to/file", "line": 42, "description": "Code that doesn't map to any spec requirement" }
  ],
  "issues": [
    { "severity": "critical | major | minor | nit", "file": "path/to/file", "line": 42, "message": "Description", "suggestion": "How to fix" }
  ],
  "strengths": ["Things done well"]
}
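
A consumer of this report could run a light structural check before acting on the verdict. The sketch below is one possible validator, not the parser used by the pipeline. Treating all six top-level keys as required is an assumption, and validate_report is a hypothetical name.

```python
import json
from typing import List

VERDICTS = {"approve", "request_changes"}
SEVERITIES = {"critical", "major", "minor", "nit"}
# Assumption: every top-level key in the schema above is required.
REQUIRED_KEYS = ("verdict", "spec_verification", "missing_from_spec",
                 "out_of_scope", "issues", "strengths")


def validate_report(raw: str) -> List[str]:
    """Return a list of problems found in a review report; empty if none."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(report, dict):
        return ["top-level value must be an object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in report]
    if "verdict" in report and report["verdict"] not in VERDICTS:
        problems.append(f"unknown verdict: {report['verdict']!r}")

    issues = report.get("issues", [])
    if not isinstance(issues, list):
        problems.append("issues must be an array")
        issues = []
    for index, issue in enumerate(issues):
        severity = issue.get("severity") if isinstance(issue, dict) else None
        if severity not in SEVERITIES:
            problems.append(f"issues[{index}]: unknown severity {severity!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller log every defect in a malformed report at once.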

A defect-fix review adds two fields to this report: defect_scope_check (containing scope_ok and large_diff) and a scope_violations array.