Phase 1 taught the squad to learn from itself. Phase 2 lets it run on every PR without a human watching. A GitHub Actions workflow invokes /review --ci headlessly via claude -p, a bash wrapper parses a structured verdict JSON, posts a PR comment with findings tagged by severity and reviewer, and optionally triggers /review-auto to apply safe fixes back to the PR branch. Cost-budget gate runs before every review; routing cap keeps CI aggressive on cost while preserving deep-mode escalation for security-sensitive diffs. The guiding rule: the squad becomes team infrastructure by existing without being invoked.
-
/review --ci — headless mode with structured verdict
New --ci flag (auto-enabled when CLAUDE_CODE_HEADLESS=1) suppresses the usual markdown narrative and emits a single <squad-verdict-json>...</squad-verdict-json> block per docs/squad-json-schema.md v1. Step 0.4 CI_MODE detection. Step 0.5d.1 CI routing cap downgrades full-mode to thin-mode for cost savings but PRESERVES deep-mode for auth / migration / crypto paths — security matters even in CI. Step 1 + 0.5a fork on GITHUB_BASE_REF or --pr to diff against origin/<base>..HEAD because CI checkouts produce a clean working tree (the original Phase 2 draft had this wrong; the squad caught it on round 1). Step 7.5 emits JSON with authoritative findings parsing rules (section-to-tier mapping, bullet extraction, stable IDs like B1, M1, parse warnings channel). Exit codes aligned to schema: APPROVE/CONDITIONAL/SKIPPED→0, REVISE/BLOCK→1, ABORTED→2.
-
/review-auto --ci --from-verdict-json <path> — CI-mode auto-fix with verdict reuse
Skips the initial /review call when --from-verdict-json supplies a pre-computed verdict, saving the cost of two reviews per auto-fix cycle. Step 3 CI carve-out auto-proceeds when no finding has an UNSAFE class tag; aborts otherwise. Step 6.2 pushes commits back using ${GITHUB_HEAD_REF:-$(git branch --show-current)} — falls back gracefully when the workflow checked out a branch ref instead of a SHA. Step 7.5 emits post-fix verdict JSON with auto-fix augmentation keys: auto_fix_applied, auto_fix_rounds, auto_fix_items_applied, auto_fix_commits, auto_fix_status.
-
docs/squad-json-schema.md — v1 machine-parseable verdict schema
Canonical contract for the <squad-verdict-json> block. Top-level: verdict, summary, routing_mode, routing_override, nando, emily, findings, overturned_findings, chunking, files_reviewed, metadata, reason. SKIPPED / ABORTED envelope shape fully specified with fixed reason enum (classifier-skip-mode, preflight-typecheck-failed, budget-exceeded, and more). Findings carry stable IDs, nullable class tags, severity tiers (blocker / must-fix / recommended / nits / boyscout). Consumer contract explicit: reject unknown schema_version, preserve IDs, never mutate the emitted object. metadata.parse_warnings and metadata.push_error always emitted (never absent) so consumers can rely on presence. Schema evolution policy: additive changes do not bump version; renames or removals do.
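An added consumer-side sketch of this contract (not the project's own code): it extracts the single <squad-verdict-json> block, rejects an unknown schema_version, and maps the verdict to the exit codes listed for /review --ci. The location of schema_version under metadata, the sample output, and the choice to exit 2 on any parse failure are assumptions.

```python
import json
import re

# Exit codes as listed for /review --ci.
EXIT_CODES = {"APPROVE": 0, "CONDITIONAL": 0, "SKIPPED": 0,
              "REVISE": 1, "BLOCK": 1, "ABORTED": 2}
SUPPORTED_SCHEMA_VERSIONS = {"1"}
# DOTALL lets one pattern cover single-line and multi-line sentinels.
VERDICT_RE = re.compile(
    r"<squad-verdict-json>\s*(.*?)\s*</squad-verdict-json>", re.DOTALL)

# Constructed sample of headless stdout (not real squad output).
SAMPLE_STDOUT = """starting review
<squad-verdict-json>
{"verdict": "REVISE", "findings": [],
 "metadata": {"schema_version": 1, "parse_warnings": []}}
</squad-verdict-json>
"""


class VerdictError(Exception):
    """No usable verdict in the output; callers must not treat this as a pass."""


def parse_verdict(stdout):
    blocks = VERDICT_RE.findall(stdout)
    if len(blocks) != 1:  # contract: a single block per run
        raise VerdictError(f"expected 1 verdict block, found {len(blocks)}")
    try:
        verdict = json.loads(blocks[0])
    except json.JSONDecodeError as exc:
        raise VerdictError(f"verdict block is not valid JSON: {exc}") from exc
    if not isinstance(verdict, dict):
        raise VerdictError("verdict block is not a JSON object")
    # Assumption: schema_version sits under metadata, as 1 or "1".
    meta = verdict.get("metadata")
    version = meta.get("schema_version") if isinstance(meta, dict) else None
    if str(version) not in SUPPORTED_SCHEMA_VERSIONS:
        raise VerdictError(f"unsupported schema_version: {version!r}")
    name = verdict.get("verdict")
    if not isinstance(name, str) or name not in EXIT_CODES:
        raise VerdictError(f"unknown verdict: {name!r}")
    return verdict


def exit_code_for(stdout):
    """Map stdout to an exit code; anything unparseable exits 2, never 0."""
    try:
        return EXIT_CODES[parse_verdict(stdout)["verdict"]]
    except VerdictError:
        return 2
```

Requiring exactly one block means output that echoes a second, look-alike block fails closed instead of being accepted as the verdict.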
-
scripts/squad-pr-review.sh — bash wrapper (600 lines)
Validates config (enum / bool / number) on load. Cap-aware cost pre-estimate (thin $0.15, full $0.60, deep $0.90, 4x multiplier above 30 files). Claude invocation uses set +e / set -e block with separate mktemp stdout + stderr captures — avoids the process-substitution race that was the original design. Python regex extraction of the verdict block (robust across single-line and multi-line sentinels). Five-pattern sed redaction for sk-ant-*, Bearer *, ghp_*, gho_*, ghs_* on every stderr surface that reaches PR comments. PR comment renders verdict with icon + heading + severity-tier findings tables; truncates detail at 600 chars with named-artifact breadcrumb; escapes <details> / </details> tokens in user-supplied finding text to prevent markdown-collapse escape. Follow-up auto-fix comment links back to primary comment via captured URL (strict regex anchored to github.com#issuecomment-<digits>). Schema-version mismatch writes a machine-readable failure JSON to the artifact AND posts a PR comment with expected/actual versions + migration doc anchor.
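An added illustration of the comment-sanitizing order described above, written in Python rather than the wrapper's sed and not taken from the project: redact first, truncate second, escape <details> tokens last. The token character classes, the replacement markers, and the artifact name are assumptions; only the five prefixes and the 600-character limit come from the notes.

```python
import re

# Assumed token shapes for the five prefixes the notes list; the wrapper's
# actual sed expressions are not shown on the page.
REDACTIONS = [
    (re.compile(r"sk-ant-[A-Za-z0-9_-]+"), "sk-ant-[REDACTED]"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    (re.compile(r"ghp_[A-Za-z0-9]+"), "ghp_[REDACTED]"),
    (re.compile(r"gho_[A-Za-z0-9]+"), "gho_[REDACTED]"),
    (re.compile(r"ghs_[A-Za-z0-9]+"), "ghs_[REDACTED]"),
]
DETAILS_RE = re.compile(r"<(/?details)", re.IGNORECASE)
MAX_DETAIL_CHARS = 600


def redact(text):
    """Apply every redaction pattern to text bound for a PR comment."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text


def escape_details(text):
    """Neutralize <details> and </details> so finding text cannot close
    or open a collapsible section in the rendered comment."""
    return DETAILS_RE.sub(r"&lt;\1", text)


def render_detail(text, artifact="squad-verdict.json"):
    """Redact, truncate, then escape.

    Redacting before truncating keeps a token that straddles the cut from
    surviving as a fragment the patterns no longer match in full; escaping
    last keeps the cut from landing inside an &lt; entity.
    The artifact name is a hypothetical placeholder.
    """
    text = redact(text)
    if len(text) > MAX_DETAIL_CHARS:
        text = (text[:MAX_DETAIL_CHARS]
                + f"... (truncated; full text in artifact {artifact})")
    return escape_details(text)
```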
-
.github/workflows/review-squad.yml — Actions workflow
Triggers on pull_request opened / synchronize / reopened. Concurrency group per PR number with cancel-in-progress: true — rapid pushes cancel stale runs. Skip-check step runs first and sets an output if the HEAD commit subject matches chore(auto-fix)* — breaks the loop where /review-auto pushes would trigger new workflow runs indefinitely. All downstream steps gate on the skip output. Checkout uses ref: github.event.pull_request.head.ref (branch name, not SHA) so the runner is on a non-detached HEAD and auto-fix pushes succeed. Plugin install via claude plugin install Review_Squad with repo-clone fallback. Permissions: contents: write + pull-requests: write, documented with a least-privilege path for repos on review-only mode to tighten to contents: read.
-
.github/squad-budget.yml — per-repo CI config
Five keys: max_cost_per_pr_usd, mode, routing_cap, fail_action_on_revise, comment_on_skip. Conservative defaults (thin cap, review-only, $1.00 ceiling, fail on REVISE, no comment on skip). Inline comments explain the "thin is special" semantics: thin activates the CI cap that downgrades full but preserves deep; full / deep / skip are force-modes that disable deep-mode escalation. Wrapper rejects any value outside the enum / bool / positive-number shape with a clear error message.
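An added sketch of the budget gate these keys drive (not the wrapper's code): it validates routing_cap and max_cost_per_pr_usd, applies the cap to the classifier's mode, and compares the cap-aware estimate with the ceiling. The function names, the zero cost for skip, the strict greater-than comparison, and the reading of force-modes as "always run the named mode" are assumptions; the prices and the 30-file threshold are the ones quoted for the wrapper.

```python
# Per-mode prices and the large-diff multiplier quoted in the wrapper notes.
BASE_COST_USD = {"thin": 0.15, "full": 0.60, "deep": 0.90, "skip": 0.0}
LARGE_DIFF_FILES = 30
LARGE_DIFF_MULTIPLIER = 4
ROUTING_CAPS = {"thin", "full", "deep", "skip"}


def validate_budget(cfg):
    """Reject anything outside the enum / positive-number shape."""
    cap = cfg.get("routing_cap")
    if not isinstance(cap, str) or cap not in ROUTING_CAPS:
        raise ValueError(f"routing_cap must be one of {sorted(ROUTING_CAPS)}")
    ceiling = cfg.get("max_cost_per_pr_usd")
    # bool is an int subclass in Python, so exclude it explicitly.
    if (isinstance(ceiling, bool) or not isinstance(ceiling, (int, float))
            or not ceiling > 0):
        raise ValueError("max_cost_per_pr_usd must be a positive number")
    return cap, float(ceiling)


def effective_mode(classified, cap):
    """thin caps full down to thin but leaves deep alone; other caps force."""
    if cap != "thin":
        return cap  # force-mode: no deep-mode escalation (assumed reading)
    return "thin" if classified == "full" else classified


def budget_gate(cfg, classified, files_changed):
    """Decide whether to run, pricing the mode that will actually execute."""
    cap, ceiling = validate_budget(cfg)
    mode = effective_mode(classified, cap)
    estimate = BASE_COST_USD[mode]
    if files_changed > LARGE_DIFF_FILES:
        estimate *= LARGE_DIFF_MULTIPLIER
    estimate = round(estimate, 2)
    if estimate > ceiling:
        return {"run": False, "reason": "budget-exceeded",
                "mode": mode, "estimate_usd": estimate}
    return {"run": True, "reason": None,
            "mode": mode, "estimate_usd": estimate}
```

With the default thin cap and $1.00 ceiling, a 31-file diff classified full is priced as thin (0.60) and runs; pricing it before applying the cap would give 2.40 and block it.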
-
commands/ci-wrapper.md — wiring guide
Two-sided install model: target repo commits the 4 infrastructure files; runner installs the squad into ~/.claude/ on every job. Four-step one-time repo setup (curl + gh secret set + budget edit + commit + PR). Four-week rollout sequence (shadow → gate → full-cap → review-and-fix) with concrete advancement criteria. Cost table per routing mode with and without chunking. PR comment anatomy. Troubleshooting section covers the likely failure modes: plugin marketplace unreachable, schema version mismatch, budget exceeded, fork PR permission downgrade. Auto-fix safe-class gate mechanism documented with the HARD-UNSAFE class list (sql-injection, auth, secret, token, jwt, permission, rbac, crypto, password, csrf, xss, ssrf). Local-testing note using act.
-
Decision-gate metric — 3-criterion rollout exit
Before advancing past shadow mode to gate mode, the squad must satisfy ALL three criteria across 5 consecutive PRs: (1) maintainer agrees with Nando’s verdict within ±1 tier on 5/5, (2) median cost_estimate_usd ≤ 80% of max_cost_per_pr_usd, (3) zero silent failures (no run exits 0 while actually crashed). If any criterion fails on any of the 5, restart the window — the next PR becomes PR 1 of a fresh window, no averaging, no cherry-picking. Same three criteria apply across the next 10 PRs for gate → full-cap advancement.
-
Round 1: 15 genuine findings across all four reviewers
FC REVISE: set -e + process-substitution race, schema example drift, apt-get precedence bug, plugin install fallback untested, auto-fix loop trigger. Jared REVISE: path-predictable /tmp files, unconditional contents: write permission, unredacted stderr in PR comments. Stevey REVISE: all seven connectivity blockers — classifier file resolution broken in CI (every PR would have skipped), detached-HEAD push failure, /review-auto verdict JSON never consumed, cost pre-estimate ignored routing cap, 65KB comment cap, Emily CI-skipped invisible, </details> injection from finding text. PM Cory APPROVE with 3 carry-forward questions for Nando.
-
Round 2: 1 regression caught
An over-broad replace_all on ${RAW_OUT}.stderr accidentally matched the STDERR_OUT="${RAW_OUT}.stderr" definition line itself, producing STDERR_OUT="${STDERR_OUT}" — an unset-variable crash under set -u. Both FC and Jared flagged the same line 203 anchor independently. One-line fix: STDERR_OUT=$(mktemp).
-
Round 3: clean across all four reviewers
FC APPROVE (Quality A, Craft Solid). Jared APPROVE (Security/Efficiency/Reuse PASS). Stevey APPROVE (7/7 blockers resolved with file:line evidence). PM Cory APPROVE (plan coverage complete). Nando synthesized APPROVE. Emily CONFIRM.
-
V5.1 backlog cleared into this release
Nando and Emily flagged 4 carry-forward items as Recommended (not blockers). User requested a final iteration loop to clear them before commit: (1) decision-gate metric locked in ci-wrapper.md, (2) schema-skew wrapper emits machine-readable failure JSON + PR comment with expected/actual versions and migration doc anchor, (3) stderr redaction applied to /review-auto invocation (defense-in-depth), (4) follow-up auto-fix comment links to primary comment via captured URL. Jared + Stevey re-reviewed, both APPROVE. Stevey caught two polish items (3-space indent that would render as a detached code block in GitHub; dangling “prior squad comment” text on URL-capture failure) — both resolved.
-
Meta observation: squad-self-review on meta-work IS worthwhile when the meta-work modifies the squad’s execution path
The user’s standing preference (feedback_skip_squad_meta.md) was that squad-self-review on meta-work produces no signal. For V5.0 Phase 2, the squad produced 15 real findings in round 1 and caught a regression introduced in round 2. Nando proposed a documented exception: squad-self-review is worthwhile when meta-work modifies the squad’s own execution path (CI runner, wrapper, agent dispatch) because bugs there cascade across every future review. This release captures that exception in the squad’s institutional memory.
-
Phase 3 — Theme D (team shared state)
/squad-sync synchronizes team state (learnings, patterns, codebase map, review history) across a configured remote — git repo, S3 bucket, or HTTP endpoint. Per-user state (Cory’s memory of each developer’s style) stays local; shared state syncs per-push or on a schedule. Three-way merge for pattern files. .review-squad/config.json declares the sync remote + strategy. squad-sync --init migrates single-user installs without data loss. Auth uses existing git credentials for git-backed sync.
-
Phase 4 — Dashboard MVP (vanilla JS)
Local dev server extending agent-chat at :4001, reusing existing WebSocket lifecycle events. Static export via /squad-dashboard --export. Four sections: multi-project home, chronological timeline (Chart.js), per-reviewer performance stats, pattern library browser with citation graph. Vanilla JS over Svelte (zero build step, matches /ship and changelog HTML patterns).
-
Rollout criterion for downstream repos
Phase 2 does NOT auto-enable on any downstream repo. Adoption requires the 3-criterion decision gate to pass across 5 consecutive PRs on this repo (Corye-CIC/Review_Squad) first. Real-world signal before real-world rollout.