Review Squad — Release Notes

What’s new in V3.8

Date   27 March 2026
Scope   Debate system + pre-flight calibration
Agents updated   5 review agents + debate command
We ran 11 iterations of a structured multi-agent debate stress test — a scenario where the code is correct and all expected concerns are false positives. Starting from 1 of 6 cleared and 5 phantom claims in Run 1, the system reached 7 of 7 cleared and 0 phantoms in Run 11. The core finding: structural intervention in the command prompt overrides behavioral priors more reliably than rules in agent definition files. V3.8 ships the calibrated debate command and the pre-flight gate system that achieved it.
/debate-false-positive New command
Pre-flight gate system Core methodology
Emily Validation
Calibration results 11-run progression
Research documentation Methodology
← V3.7 V3.8 V3.9 →