We ran an experimental multi-agent debate session and reviewed the chat logs. Several behavioral gaps
showed up clearly — agents posting duplicates, working in the wrong channels, not responding to each
other, and a verdict protocol with a dead end. V3.7 closes all of them. No architectural
changes; purely agent-level quality improvements across the full review pipeline.
-
Deduplication guard
Emily now tracks what she's already posted and won't repeat the same content within a session — the repeated identical posts observed in the debate logs are blocked.
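A minimal sketch of how a session-scoped guard like this could work, assuming posts are compared by a hash of their normalized text. The `SessionDedupGuard` class and its method names are hypothetical and are not taken from Emily's agent file.

```python
import hashlib


class SessionDedupGuard:
    """Remembers fingerprints of posts made in one session (hypothetical helper)."""

    def __init__(self) -> None:
        self._seen = set()

    @staticmethod
    def _fingerprint(content: str) -> str:
        # Collapse whitespace and case so trivially reformatted repeats still match.
        normalized = " ".join(content.split()).lower()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def should_post(self, content: str) -> bool:
        """Return True the first time content is seen, False for repeats."""
        fp = self._fingerprint(content)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True
```

Creating one guard per session keeps the memory scoped to that session, so the same content can still be posted again in a later one.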
-
Pre-defined problem fast-track
If requirements are already clear when a session starts, Emily skips the structured scoping questions and moves straight to work rather than asking questions that have already been answered.
-
Wrong direction framing strengthened
Catching a fundamentally wrong direction early is now explicitly weighted above iterating on a plan that shouldn't exist — Emily surfaces this as a blocker-level concern, not a recommendation.
-
Cross-agent connections section
FC's review output now includes a dedicated section flagging issues that overlap with Jared's or Stevey's domain. Previously these were silently omitted; now they're surfaced for Nando to consolidate.
-
Parallel execution contradiction resolved
Removed the structurally impossible "read other reviewers first" rule. FC now flags anticipated cross-agent connections inline for Nando to pick up — no dependency on other agents completing first.
-
Proportional threat calibration
Jared now scales scrutiny depth to the actual risk surface — a public read-only endpoint doesn't receive the same intensity as an auth flow. Any confirmed vulnerability still blocks regardless; calibration governs how hard you look, not whether you act.
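The split between "how hard you look" and "whether you act" can be sketched as two independent functions. This is an illustration only: the `Surface` fields, the `Scrutiny` levels, and the thresholds are assumptions, not Jared's actual rules.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scrutiny(IntEnum):
    LIGHT = 1
    STANDARD = 2
    DEEP = 3


@dataclass(frozen=True)
class Surface:
    handles_auth: bool    # hypothetical risk signal
    accepts_writes: bool  # hypothetical risk signal


def scrutiny_depth(surface: Surface) -> Scrutiny:
    """Pick how hard to look, based on the risk surface (illustrative thresholds)."""
    if surface.handles_auth:
        return Scrutiny.DEEP
    if surface.accepts_writes:
        return Scrutiny.STANDARD
    return Scrutiny.LIGHT


def should_block(confirmed_vulnerabilities: int) -> bool:
    """Blocking depends only on confirmed findings, never on scrutiny depth."""
    return confirmed_vulnerabilities > 0
```

Because `should_block` takes no scrutiny argument, a vulnerability found during a light pass blocks exactly like one found during a deep pass.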
-
Cross-reference pass added
Added an explicit pass at the end of Jared's review to flag issues touching FC's or Stevey's domain, consistent with the cross-agent connections pattern applied across all reviewers.
-
Security rule contradiction fixed
Resolved an internal conflict between "calibrate scrutiny" and "always block" — the two rules are no longer in tension. Calibration is a depth-of-scrutiny setting; blocking is a post-finding action.
-
Output channel verification
Jared now confirms he is posting to the correct review output channel before writing findings — addressing the wrong-channel incident from the debate session.
-
Chain-citing requirement
Stevey now explicitly links every finding to its source — a design token, component spec, WCAG criterion, or accessibility standard. Opinions without backing are no longer acceptable output.
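One way to picture the requirement is a check that rejects any finding without a recognized source. The `Finding` shape and the `ALLOWED_SOURCE_KINDS` labels below are hypothetical; only the four source categories come from the rule itself.

```python
from dataclasses import dataclass
from typing import Optional

# Labels are assumptions; the categories mirror the rule's four source types.
ALLOWED_SOURCE_KINDS = {
    "design_token",
    "component_spec",
    "wcag_criterion",
    "accessibility_standard",
}


@dataclass(frozen=True)
class Finding:
    summary: str
    source_kind: Optional[str] = None
    source_ref: Optional[str] = None


def is_backed(finding: Finding) -> bool:
    """A finding counts only if it names a known source kind and a non-empty reference."""
    has_kind = finding.source_kind in ALLOWED_SOURCE_KINDS
    has_ref = bool(finding.source_ref and finding.source_ref.strip())
    return has_kind and has_ref
```

For example, a contrast finding that cites `source_kind="wcag_criterion"` and `source_ref="WCAG 2.1 SC 1.4.3"` passes, while the same summary with no source fields does not.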
-
Invisible bugs emphasis
Added focused attention on connectivity and data-pathway bugs that don't produce visible errors but silently degrade behavior — timeouts that never fire, retries that double-post, payloads that bloat unnoticed.
-
User-visible impact priority
Findings are now ordered by what the user actually experiences, not by what's architecturally interesting. Structural concerns that don't affect the product surface are deprioritized accordingly.
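The ordering can be sketched as a two-part sort key: user-visible findings first, then higher severity first within each group. The `Finding` fields and the integer severity scale are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    title: str
    user_visible: bool
    severity: int  # hypothetical scale: higher means worse


def order_findings(findings: List[Finding]) -> List[Finding]:
    """User-visible findings come first; within each group, worst severity first."""
    # False sorts before True, so "not user_visible" puts visible findings on top.
    return sorted(findings, key=lambda f: (not f.user_visible, -f.severity))
```

Under this key, a severe but purely structural concern still lands below every finding the user would actually notice.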
-
Parallel execution contradiction resolved
Same fix as FC: the "read others first" rule is removed and replaced with forward-flagging for Nando to consolidate.
-
Emily formally added to squad roster
Emily was absent from Nando's role block. She now appears with a clear description: requirements coverage, plan adherence, accessibility compliance, and E2E validation. Nando's operating model now also records that she runs after his verdict and can issue a CHALLENGE.
-
Emily CHALLENGE resolution protocol
If Emily CHALLENGEs an APPROVE verdict, Nando must address each challenge item in the Reviewer Disagreements section before the verdict reaches the user. Previously a CHALLENGE could pass through silently; letting one through is now a hard rule violation.
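The rule amounts to a gate in front of the user-facing verdict. A minimal sketch, assuming challenge items are plain strings and Reviewer Disagreements maps each item to Nando's response; the `Verdict` type and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Verdict:
    decision: str  # e.g. "APPROVE"
    challenge_items: List[str] = field(default_factory=list)
    # Hypothetical shape: challenge item -> Nando's written response.
    reviewer_disagreements: Dict[str, str] = field(default_factory=dict)


def unresolved_challenges(verdict: Verdict) -> List[str]:
    """Challenge items with no non-empty response in Reviewer Disagreements."""
    return [
        item
        for item in verdict.challenge_items
        if not verdict.reviewer_disagreements.get(item, "").strip()
    ]


def release_to_user(verdict: Verdict) -> str:
    """Return the decision only when every challenge item has been addressed."""
    missing = unresolved_challenges(verdict)
    if missing:
        raise ValueError(f"Unaddressed CHALLENGE items: {missing}")
    return verdict.decision
```

Raising instead of returning a flag makes the silent pass-through impossible by construction: the verdict cannot be released until every item has a response.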
-
Logical fallacy identification
Nando now explicitly calls out reasoning errors in agent findings: importance-by-catastrophe ("if I'm removed the damage is highest"), conflating criticality with contribution, and claiming foundational status as a proxy for correctness. These now appear in Reviewer Disagreements when they occur.
-
Receipts-backed challenges
When Cory raises a concern rooted in prior sessions, it must cite the specific learnings.jsonl entry or past review that supports it. Unanchored "we've seen this before" claims are no longer valid output.
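A sketch of how a cited entry could be checked against the memory file. The schema of learnings.jsonl is not described here, so the `id` field and both function names are assumptions made purely for illustration.

```python
import json
from typing import Optional, Set


def load_entry_ids(jsonl_text: str) -> Set[str]:
    """Collect entry ids from JSONL text (assumes each record has an "id" field)."""
    ids = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if "id" in record:
            ids.add(str(record["id"]))
    return ids


def is_anchored(cited_id: Optional[str], known_ids: Set[str]) -> bool:
    """A concern is anchored only if it cites an entry that actually exists."""
    return cited_id is not None and cited_id in known_ids
```

A concern citing a past review rather than a learnings.jsonl entry would need a separate check, which this sketch does not cover.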
-
Outcome specificity
Cory's questions and suggestions must now include a concrete expected outcome — not just a concern or a question. The change in behavior or result that would resolve the concern must be stated explicitly.
-
Memory carve-out fix
The <injected-context> rule in Consult and Implement modes was incorrectly prohibiting Cory from reading its own .review-squad/ memory files when context was pre-loaded. The carve-out is now applied consistently across all three PM Cory agents.
-
Stale copy-paste rules removed
"Commit atomically", "flag it and fix it", and "acknowledge what's done well" had been copy-pasted into agent files where they didn't belong. All instances have been removed across Emily (×5), FC review, Jared review, and Nando review.
-
Parallel execution contradiction resolved
FC, Jared, and Stevey all carried a "read other reviewers first" rule that's structurally impossible in parallel runs. All three now forward-flag anticipated cross-agent connections for Nando to consolidate — no inter-agent read dependency.
-
PM Cory memory carve-out — Consult and Implement
The carve-out exempting .review-squad/ from the injected-context prohibition was only present in PM Cory Review. Now applied to Consult and Implement as well.