-
Notifications
You must be signed in to change notification settings - Fork 3.2k
feat(cold-pass): universal checklist + reviewer command for independent PR review #1264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
fundedullc-dev
wants to merge
2
commits into
coleam00:dev
Choose a base branch
from
fundedullc-dev:feat/cold-second-pass
base: dev
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+160
−0
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| # Cold-Pass Default Checklist (Universal) | ||
|
|
||
| **Reviewer instruction:** Walk each `[ ]` item literally. Output PASS / FAIL / N/A | ||
| with a file:line reference for every FAIL. Do not summarize. Do not skip. | ||
|
|
||
| The #1 heuristic: **one-sided parity.** For every producer touched in the diff, | ||
| identify its consumers. For every filter added, identify every sibling endpoint | ||
| that needs matching behavior. Multi-agent reviewers miss this because they read | ||
| files in isolation — your job is to read them together. | ||
|
|
||
| --- | ||
|
|
||
| ## ONE-SIDED PARITY (read/write drift) | ||
| [ ] For every function the diff modifies, which other functions call it? Does | ||
| each call site still pass the correct arguments? | ||
| [ ] For every new parameter or field: is it threaded through every path that | ||
| reaches the modified function, or only the one the PR author tested? | ||
| [ ] For every writer touched: find its readers. Still in sync? | ||
| [ ] For every reader touched: find its writers. Still in sync? | ||
| [ ] "Sibling endpoint" parity: if PR adds filter/field on endpoint A, do | ||
| siblings (list, detail, count, summary, inventory) need matching behavior? | ||
|
|
||
| ## CROSS-FILE STATE MACHINES | ||
| [ ] If the PR advances a state in one table/record, are all linked states | ||
| advanced together? (Approval workflows, order fulfillment, multi-step | ||
| transitions.) | ||
| [ ] Are invariants maintained across the cross-table transition, or only | ||
| per-table? | ||
|
|
||
| ## MIGRATION / SCHEMA HYGIENE | ||
| [ ] Migrations have explicit BEGIN/COMMIT wrappers or transaction guarantees? | ||
| [ ] UPDATE-only migrations on existing rows: do they silent no-op on fresh DBs | ||
| where the rows don't exist yet? | ||
| [ ] Numeric prefix / filename collisions with existing migrations? | ||
| [ ] Schema changes referenced consistently across all code paths (ORM + raw | ||
| SQL + serializers)? | ||
|
|
||
| ## RESOURCE / FAILURE MODES | ||
| [ ] `query[0]` / `.first()` on a multi-row query without ORDER BY | ||
| (non-deterministic on Postgres, MySQL, most engines)? | ||
| [ ] New parameter handled for empty string, None/null, special characters, | ||
| very large input? | ||
| [ ] Bypasses existing auth, rate-limit, or validation middleware? | ||
| [ ] Error paths return the same shape as success paths? (Silent-200, | ||
| partial-failure hidden behind 200 OK.) | ||
|
|
||
| ## TEST COVERAGE BLIND SPOTS | ||
| [ ] Every branch added in the diff maps to at least one test assertion? | ||
| [ ] Tests exercise the REAL write path, not direct SQL insert / mock that | ||
| skips triggers, validators, or middleware? | ||
| [ ] If the PR introduces multiple backends / modes / providers, does at least | ||
| one test cover each? | ||
|
|
||
| ## DOWNSTREAM CONSISTENCY | ||
| [ ] If public API shape changes, are consumers updated or version-gated? | ||
| [ ] Config, environment variables, feature flags: documented and plumbed | ||
| through all environments? | ||
|
|
||
| ## HARDCODED VALUES | ||
| [ ] Hardcoded identifiers (IDs, names, keys, domains) that should be | ||
| config-driven? | ||
| [ ] Defaults that only work for the author's happy path but break other | ||
| legitimate inputs? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| --- | ||
| description: Independent cold second-pass PR review — single reviewer, repo-local checklist | ||
| argument-hint: (none — reads PR number from ARTIFACTS_DIR/.pr-number) | ||
| --- | ||
|
|
||
| # Cold Pass Review | ||
|
|
||
| ## Your Role | ||
|
|
||
| You are the **first reviewer** of this PR. That is not metaphor — for your purposes, | ||
| no one has looked at this code before. You are walking through it fresh, with a | ||
| checklist of structural concerns this repo's maintainers have learned to watch for. | ||
|
|
||
| Archon's 5-agent review has already run and posted its findings as a PR comment. | ||
| You will glance at those findings ONCE, at the end, for deduplication only. They | ||
| must not shape where you look during your review. Correlated-attention blind spots | ||
| are the entire reason you exist: if you let Archon's report anchor your search, you | ||
| become a 6th Archon agent and add no value. | ||
|
|
||
| ## Phase 1: LOAD | ||
|
|
||
| 1. Read `$ARTIFACTS_DIR/.pr-number` → `PR` | ||
| 2. Read `$ARTIFACTS_DIR/.checklist-source` → `SOURCE` (`default` or `repo+default`) | ||
| 3. Read `$ARTIFACTS_DIR/cold-pass-checklist.md` → the layered checklist. It always contains the universal default; when `SOURCE=repo+default`, the repo-local extension is appended after a `---` separator. Walk both in order. | ||
| 4. `gh pr diff $PR` → the changes | ||
| 5. For every file in the diff, `cat <file>` in full (not just the diff context) | ||
| 6. Do NOT read Archon's artifacts. Do NOT read `$ARTIFACTS_DIR/review/*`. | ||
|
|
||
| **Phase 1 checkpoint:** | ||
| - [ ] PR number loaded | ||
| - [ ] Checklist source loaded | ||
| - [ ] Layered checklist loaded (universal default present; repo-local present if `SOURCE=repo+default`) | ||
| - [ ] Diff read | ||
| - [ ] All modified files read in full | ||
| - [ ] Archon artifacts NOT consulted | ||
|
|
||
| ## Phase 2: ANALYZE | ||
|
|
||
| Walk the checklist literally — check each `[ ]` box, one at a time, out loud. | ||
| The literal walk is the discipline. Never summarize the checklist. Never skip | ||
| an item because it "looks fine." | ||
|
|
||
| For each item: `PASS` / `FAIL` / `N/A`. For every `FAIL`, include a file:line | ||
| reference and a one-sentence explanation. | ||
|
|
||
| **Special attention to one-sided parity.** For every producer/writer touched by | ||
| the diff, identify its consumers/readers (grep the codebase if needed) and verify | ||
| they are still in sync. This is the #1 shape to hunt for. | ||
|
|
||
| **Phase 2 checkpoint:** | ||
| - [ ] Every checklist item walked | ||
| - [ ] FAILs have file:line refs | ||
| - [ ] One-sided parity checks done for every modified writer | ||
|
|
||
| ## Phase 3: GENERATE | ||
|
|
||
| Write `$ARTIFACTS_DIR/cold-pass-findings.md`: | ||
|
|
||
| ``` | ||
| # 🧊 COLD PASS: PR #<N> | ||
|
|
||
| **Reviewer model:** <state the model you are, e.g., `gpt-5.3-codex` or `claude-opus-4.6` — name the actual model you run as> | ||
| **Checklist source:** <`default` or `repo+default` — read from `$ARTIFACTS_DIR/.checklist-source`> | ||
| **Verdict:** <APPROVE or REQUEST_CHANGES> | ||
|
|
||
| ## HIGH | ||
| - <file:line> — <finding> | ||
|
|
||
| ## MEDIUM | ||
| - <file:line> — <finding> | ||
|
|
||
| ## LOW | ||
| - <file:line> — <finding> | ||
|
|
||
| ## Deduplication note | ||
|
|
||
| After reading `gh pr view <PR> --comments`, I removed findings Archon already | ||
| caught at the same or higher severity. Findings retained below are either new, | ||
| or graded higher than Archon did (with reasoning). | ||
|
|
||
| ## Summary | ||
|
|
||
| <One paragraph. If nothing beyond Archon, say: "No additional findings. APPROVE."> | ||
| ``` | ||
|
|
||
| **Phase 3 checkpoint:** | ||
| - [ ] Findings file exists | ||
| - [ ] Verdict line present | ||
| - [ ] Dedup note present (even if empty) | ||
|
|
||
| ## Phase 4: VALIDATE | ||
|
|
||
| Check that `cold-pass-findings.md`: | ||
| - Has a Verdict line (APPROVE or REQUEST_CHANGES) | ||
| - Has a Summary section | ||
| - Has a Deduplication note | ||
| - Uses file:line refs for all FAILs | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a language tag to the fenced block to satisfy markdownlint.
At Line 59, the fenced code block is missing a language identifier (MD040). If linting is enforced, this can fail docs checks.
Suggested fix
📝 Committable suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 59-59: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents