From 7e427f8d66799e893f22d26ffd630f13c4f3f44c Mon Sep 17 00:00:00 2001 From: Tom Boucher Date: Sat, 25 Apr 2026 14:04:53 -0400 Subject: [PATCH 1/4] =?UTF-8?q?feat(#2306):=20plan-review-convergence=20v2?= =?UTF-8?q?=20=E2=80=94=20CYCLE=5FSUMMARY=20contract,=20config=20gate,=20l?= =?UTF-8?q?ocal=20model=20reviewers?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes the false-stall detection bug in the plan→review→replan convergence loop. REVIEWS.md accumulates history across cycles so raw grep inflated HIGH counts; HIGH count now comes from a per-cycle CYCLE_SUMMARY contract emitted in the review agent's return message. Key changes: - workflow.plan_review_convergence config gate (disabled by default, same pattern as workflow.code_review / workflow.nyquist_validation) - Review agent prompt defines CYCLE_SUMMARY: current_high= contract with PARTIALLY RESOLVED / FULLY RESOLVED counting rules - Orchestrator aborts on absent/malformed CYCLE_SUMMARY (distinguishes both) - Warns when HIGH_COUNT > 0 but ## Current HIGH Concerns section is missing - Stall detection and --ws forwarding preserved and tested - Local model reviewers: --ollama, --lm-studio, --llama-cpp flags added to convergence workflow and review workflow; all three use OpenAI-compatible /v1/chat/completions endpoint with jq --rawfile for safe JSON encoding - review.ollama_host / review.lm_studio_host / review.llama_cpp_host config keys registered and documented (default to localhost:11434/1234/8080) - review.models.ollama / .lm_studio / .llama_cpp model-name config support - 58 tests (up from 29 in PR #2339), all passing Closes #2306 Closes #2339 Co-authored-by: Tom Boucher Co-Authored-By: Claude Sonnet 4.6 --- commands/gsd/plan-review-convergence.md | 10 +- docs/CONFIGURATION.md | 7 + get-shit-done/bin/lib/config-schema.cjs | 2 + .../workflows/plan-review-convergence.md | 93 ++++- get-shit-done/workflows/review.md | 103 +++++- tests/plan-review-convergence.test.cjs | 322 +++++++++++++++++- 6 files changed, 509 insertions(+), 28 deletions(-) diff --git a/commands/gsd/plan-review-convergence.md b/commands/gsd/plan-review-convergence.md index aeba824e75..acdf21d452 100644 --- a/commands/gsd/plan-review-convergence.md +++ b/commands/gsd/plan-review-convergence.md @@ -1,7 +1,7 @@ --- name: gsd:plan-review-convergence description: "Cross-AI plan convergence loop — replan with review feedback until no HIGH concerns remain (max 3 cycles)" -argument-hint: " [--codex] [--gemini] [--claude] [--opencode] [--text] [--ws ] [--all] [--max-cycles N]" +argument-hint: " [--codex] [--gemini] [--claude] [--opencode] [--ollama] [--lm-studio] [--llama-cpp] [--text] [--ws ] [--all] [--max-cycles N]" allowed-tools: - Read - Write @@ -42,8 +42,14 @@ Phase number: extracted from $ARGUMENTS (required) - `--gemini` — Use Gemini CLI as reviewer - `--claude` — Use Claude CLI as reviewer (separate session) - `--opencode` — Use OpenCode as reviewer -- `--all` — Use all available CLIs +- `--ollama` — Use local Ollama server as reviewer (OpenAI-compatible, default host `http://localhost:11434`; configure model via `review.models.ollama`) +- `--lm-studio` — Use local LM Studio server as reviewer (OpenAI-compatible, default host `http://localhost:1234`; configure model via `review.models.lm_studio`) +- `--llama-cpp` — Use local llama.cpp server as reviewer (OpenAI-compatible, default host `http://localhost:8080`; configure model via `review.models.llama_cpp`) +- `--all` — Use all available CLIs and running local model servers - `--max-cycles N` — Maximum replan→review cycles (default: 3) + +**Feature gate:** This command requires `workflow.plan_review_convergence=true`. Enable with: +`gsd config-set workflow.plan_review_convergence true` diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index 8b9c270494..f6119696aa 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -195,6 +195,7 @@ All workflow toggles follow the **absent = enabled** pattern. If a key is missin | `workflow.plan_bounce_script` | string | (none) | Path to the external script invoked for plan bounce validation. Receives the PLAN.md path as its first argument. Required when `plan_bounce` is `true`. Added in v1.36 | | `workflow.plan_bounce_passes` | number | `2` | Number of sequential bounce passes to run. Each pass feeds the previous pass's output back into the validator. Higher values increase rigor at the cost of latency. Added in v1.36 | | `workflow.post_planning_gaps` | boolean | `true` | Unified post-planning gap report (#2493). After all plans are generated and committed, scans REQUIREMENTS.md and CONTEXT.md `` against every PLAN.md in the phase directory, then prints one `Source \| Item \| Status` table. Word-boundary matching (REQ-1 vs REQ-10) and natural sort (REQ-02 before REQ-10). Non-blocking — informational report only. Set to `false` to skip Step 13e of plan-phase. | +| `workflow.plan_review_convergence` | boolean | `false` | Enable the `/gsd-plan-review-convergence` command. Disabled by default — the command exits with an enable instruction when this key is `false`. The command automates the manual plan→review→replan loop: it spawns external AI reviewers (Codex, Gemini, Claude, OpenCode), counts unresolved HIGH concerns via the CYCLE_SUMMARY contract, replans with `--reviews` feedback, and repeats until converged or max cycles reached. Enable with `gsd config-set workflow.plan_review_convergence true`. Added in v1.39 | | `workflow.plan_chunked` | boolean | `false` | Enable chunked planning mode. When `true` (or when `--chunked` flag is passed to `/gsd-plan-phase`), the orchestrator splits the single long-lived planner Task into a short outline Task followed by N short per-plan Tasks (~3-5 min each). Each plan is committed individually for crash resilience. If a Task hangs and the terminal is force-killed, rerunning with `--chunked` resumes from the last completed plan. Particularly useful on Windows where long-lived Tasks may hang on stdio. Added in v1.38 | | `workflow.code_review_command` | string | (none) | Shell command for external code review integration in `/gsd-ship`. Receives changed file paths via stdin. Non-zero exit blocks the ship workflow. Added in v1.36 | | `workflow.tdd_mode` | boolean | `false` | Enable TDD pipeline as a first-class execution mode. When `true`, the planner aggressively applies `type: tdd` to eligible tasks (business logic, APIs, validations, algorithms) and the executor enforces RED/GREEN/REFACTOR gate sequence. An end-of-phase collaborative review checkpoint verifies gate compliance. Added in v1.36 | @@ -531,6 +532,12 @@ Configure per-CLI model selection for `/gsd-review`. When set, overrides the CLI | `review.models.opencode` | string | (CLI default) | Model used when `--opencode` reviewer is invoked | | `review.models.qwen` | string | (CLI default) | Model used when `--qwen` reviewer is invoked | | `review.models.cursor` | string | (CLI default) | Model used when `--cursor` reviewer is invoked | +| `review.models.ollama` | string | (server default) | Model name passed to Ollama when `--ollama` reviewer is invoked. If unset, the first available model reported by the server is used (e.g. `llama3`). Set to a specific tag: `gsd config-set review.models.ollama codellama` | +| `review.models.lm_studio` | string | (server default) | Model name passed to LM Studio when `--lm-studio` reviewer is invoked. If unset, the first available model reported by the server is used. | +| `review.models.llama_cpp` | string | (server default) | Model name passed to llama.cpp when `--llama-cpp` reviewer is invoked. If unset, the first model reported by `/v1/models` is used. | +| `review.ollama_host` | string | `http://localhost:11434` | Base URL of the Ollama server. Override when running Ollama on a non-default port or remote host: `gsd config-set review.ollama_host http://192.168.1.10:11434` | +| `review.lm_studio_host` | string | `http://localhost:1234` | Base URL of the LM Studio local server. Override when using a non-default port. | +| `review.llama_cpp_host` | string | `http://localhost:8080` | Base URL of the llama.cpp server (`llama-server`). Override when using a non-default port. | ### Example diff --git a/get-shit-done/bin/lib/config-schema.cjs b/get-shit-done/bin/lib/config-schema.cjs index 7768c4adeb..a786ea8f39 100644 --- a/get-shit-done/bin/lib/config-schema.cjs +++ b/get-shit-done/bin/lib/config-schema.cjs @@ -34,6 +34,7 @@ const VALID_CONFIG_KEYS = new Set([ 'workflow.plan_bounce_script', 'workflow.plan_bounce_passes', 'workflow.plan_chunked', + 'workflow.plan_review_convergence', 'workflow.post_planning_gaps', 'workflow.security_enforcement', 'workflow.security_asvs_level', @@ -42,6 +43,7 @@ const VALID_CONFIG_KEYS = new Set([ 'workflow.drift_action', 'git.branching_strategy', 'git.base_branch', 'git.phase_branch_template', 'git.milestone_branch_template', 'git.quick_branch_template', 'planning.commit_docs', 'planning.search_gitignored', 'planning.sub_repos', + 'review.ollama_host', 'review.lm_studio_host', 'review.llama_cpp_host', 'workflow.cross_ai_execution', 'workflow.cross_ai_command', 'workflow.cross_ai_timeout', 'workflow.subagent_timeout', 'workflow.inline_plan_threshold', diff --git a/get-shit-done/workflows/plan-review-convergence.md b/get-shit-done/workflows/plan-review-convergence.md index f02f2cd5f9..f0e557899e 100644 --- a/get-shit-done/workflows/plan-review-convergence.md +++ b/get-shit-done/workflows/plan-review-convergence.md @@ -2,7 +2,7 @@ Cross-AI plan convergence loop — automates the manual chain: gsd-plan-phase N → gsd-review N --codex → gsd-plan-phase N --reviews → gsd-review N --codex → ... Each step runs inside an isolated Agent that calls the corresponding Skill. -Orchestrator only does: init, loop control, HIGH count check, stall detection, escalation. +Orchestrator only does: init, loop control, parse CYCLE_SUMMARY for HIGH count, stall detection, escalation. @@ -17,7 +17,7 @@ Read all files referenced by the invoking prompt's execution_context before star ## 1. Parse and Normalize Arguments -Extract from $ARGUMENTS: phase number, reviewer flags (`--codex`, `--gemini`, `--claude`, `--opencode`, `--all`), `--max-cycles N`, `--text`, `--ws`. +Extract from $ARGUMENTS: phase number, reviewer flags (`--codex`, `--gemini`, `--claude`, `--opencode`, `--ollama`, `--lm-studio`, `--llama-cpp`, `--all`), `--max-cycles N`, `--text`, `--ws`. ```bash PHASE=$(echo "$ARGUMENTS" | grep -oE '[0-9]+\.?[0-9]*' | head -1) @@ -27,6 +27,9 @@ echo "$ARGUMENTS" | grep -q '\-\-codex' && REVIEWER_FLAGS="$REVIEWER_FLAGS --cod echo "$ARGUMENTS" | grep -q '\-\-gemini' && REVIEWER_FLAGS="$REVIEWER_FLAGS --gemini" echo "$ARGUMENTS" | grep -q '\-\-claude' && REVIEWER_FLAGS="$REVIEWER_FLAGS --claude" echo "$ARGUMENTS" | grep -q '\-\-opencode' && REVIEWER_FLAGS="$REVIEWER_FLAGS --opencode" +echo "$ARGUMENTS" | grep -q '\-\-ollama' && REVIEWER_FLAGS="$REVIEWER_FLAGS --ollama" +echo "$ARGUMENTS" | grep -q '\-\-lm-studio' && REVIEWER_FLAGS="$REVIEWER_FLAGS --lm-studio" +echo "$ARGUMENTS" | grep -q '\-\-llama-cpp' && REVIEWER_FLAGS="$REVIEWER_FLAGS --llama-cpp" echo "$ARGUMENTS" | grep -q '\-\-all' && REVIEWER_FLAGS="$REVIEWER_FLAGS --all" if [ -z "$REVIEWER_FLAGS" ]; then REVIEWER_FLAGS="--codex"; fi @@ -37,6 +40,25 @@ GSD_WS="" echo "$ARGUMENTS" | grep -qE '\-\-ws\s+\S+' && GSD_WS=$(echo "$ARGUMENTS" | grep -oE '\-\-ws\s+\S+') ``` +## 1.5. Config Gate (feature disabled by default) + +```bash +CONVERGENCE_ENABLED=$(gsd-sdk query config-get workflow.plan_review_convergence 2>/dev/null || echo "false") +``` + +**If `CONVERGENCE_ENABLED` is not `"true"`:** Display and exit: + +```text +gsd-plan-review-convergence is disabled (workflow.plan_review_convergence=false). + +This feature automates the plan→review→replan loop using external AI reviewers. +Enable it with: + + gsd config-set workflow.plan_review_convergence true + +Then re-run: /gsd-plan-review-convergence {PHASE} +``` + ## 2. Initialize ```bash @@ -120,7 +142,35 @@ Agent( Execute: Skill(skill='gsd-review', args='--phase {PHASE} {REVIEWER_FLAGS} {GSD_WS}') -Complete the full review workflow. Do NOT return until REVIEWS.md is committed.", +Complete the full review workflow. Do NOT return until REVIEWS.md is committed. + +IMPORTANT — CYCLE_SUMMARY contract (required): +Your final response MUST include a machine-readable line of exactly this form: + + CYCLE_SUMMARY: current_high= + +Where is the integer count of HIGH-severity concerns that REMAIN UNRESOLVED in this cycle's findings. + +Counting rules: + INCLUDE in the count: + - Newly raised HIGHs in this cycle + - PARTIALLY RESOLVED HIGHs: concern acknowledged and a mitigation is in progress, but not yet verified/completed + - Previously raised HIGHs that are still unresolved + + EXCLUDE from the count: + - FULLY RESOLVED HIGHs: concern addressed with verification complete (closed ticket, verification log, or reviewer sign-off) + - HIGH mentions in retrospective/summary tables comparing cycles + - Quoted excerpts from prior reviews referencing past HIGH items + +Definitions: + PARTIALLY RESOLVED — concern acknowledged and mitigation is in progress but not yet verified/completed (e.g., open ticket exists but fix not landed). + FULLY RESOLVED — concern addressed with verification complete (closed ticket, verification log, or explicit reviewer sign-off confirming closure). + +Your final response MUST also include this section immediately after the CYCLE_SUMMARY line: + +## Current HIGH Concerns +[List each unresolved HIGH with a brief description, one per bullet] +[If none: write exactly 'None.']", mode="auto" ) ``` @@ -132,12 +182,32 @@ REVIEWS_FILE=$(ls ${phase_dir}/${padded_phase}-REVIEWS.md 2>/dev/null) If REVIEWS_FILE is empty: Error — review agent did not produce REVIEWS.md. Exit. -### 5b. Check for HIGH Concerns +### 5b. Extract HIGH Count from CYCLE_SUMMARY Contract + +**Do NOT grep REVIEWS.md for HIGH count.** REVIEWS.md accumulates history across cycles — resolved HIGHs from prior cycles remain in the file as audit trail, inflating a raw grep count and causing false stall detection. + +Parse HIGH_COUNT from the review agent's return message via the CYCLE_SUMMARY contract: ```bash -HIGH_COUNT=$(grep -c '\*\*HIGH' "${REVIEWS_FILE}" 2>/dev/null || true) -HIGH_COUNT=${HIGH_COUNT:-0} -HIGH_LINES=$(grep -B0 -A1 '\*\*HIGH' "${REVIEWS_FILE}" 2>/dev/null) +# Extract the integer from "CYCLE_SUMMARY: current_high=N" in the agent's return message +HIGH_COUNT=$(echo "$REVIEW_AGENT_RETURN" | grep -oE 'CYCLE_SUMMARY:\s*current_high=[0-9]+' | grep -oE '[0-9]+$') + +if [ -z "$HIGH_COUNT" ]; then + # Distinguish malformed contract from completely absent contract + if echo "$REVIEW_AGENT_RETURN" | grep -q 'CYCLE_SUMMARY:'; then + echo "CYCLE_SUMMARY present but current_high is malformed — expected integer, got non-numeric value. Retry or switch reviewer." + else + echo "Review agent did not honor the CYCLE_SUMMARY contract — cannot determine HIGH count. Retry or switch reviewer." + fi + exit 1 +fi + +# Extract the ## Current HIGH Concerns section from the agent's return message +HIGH_LINES=$(echo "$REVIEW_AGENT_RETURN" | awk '/^## Current HIGH Concerns/{found=1; next} found && /^##/{exit} found{print}') + +if [ "${HIGH_COUNT}" -gt 0 ] && [ -z "${HIGH_LINES}" ]; then + echo "⚠ Review agent's CYCLE_SUMMARY reports ${HIGH_COUNT} HIGHs but did not provide ## Current HIGH Concerns section — continuing with incomplete escalation details." +fi ``` **If HIGH_COUNT == 0 (converged):** @@ -243,10 +313,15 @@ After agent returns → go back to **step 5a** (review again). +- [ ] Config gate checked before running — exits with enable instructions if workflow.plan_review_convergence is false - [ ] Initial planning via Agent → Skill("gsd-plan-phase") if no plans exist -- [ ] Review via Agent → Skill("gsd-review") — isolated, not inline +- [ ] Review via Agent → Skill("gsd-review") — isolated, not inline; {GSD_WS} forwarded - [ ] Replan via Agent → Skill("gsd-plan-phase --reviews") — isolated, not inline -- [ ] Orchestrator only does: init, loop control, grep HIGHs, stall detection, escalation +- [ ] Orchestrator only does: init, config gate, loop control, parse CYCLE_SUMMARY for HIGH count, stall detection, escalation +- [ ] HIGH count extracted from review agent's CYCLE_SUMMARY return message (not by grepping REVIEWS.md) +- [ ] Review agent prompt defines CYCLE_SUMMARY: current_high= contract with PARTIALLY/FULLY RESOLVED definitions +- [ ] Abort with clear error if CYCLE_SUMMARY is absent; distinguish malformed from absent +- [ ] Warn if HIGH_COUNT > 0 but ## Current HIGH Concerns section is absent from return message - [ ] Each Agent fully completes its Skill before returning - [ ] Loop exits on: no HIGH concerns (converged) OR max cycles (escalation) - [ ] Stall detection reported when HIGH count not decreasing diff --git a/get-shit-done/workflows/review.md b/get-shit-done/workflows/review.md index d9c9911d0c..f52611c283 100644 --- a/get-shit-done/workflows/review.md +++ b/get-shit-done/workflows/review.md @@ -22,6 +22,19 @@ command -v coderabbit >/dev/null 2>&1 && echo "coderabbit:available" || echo "co command -v opencode >/dev/null 2>&1 && echo "opencode:available" || echo "opencode:missing" command -v qwen >/dev/null 2>&1 && echo "qwen:available" || echo "qwen:missing" command -v cursor >/dev/null 2>&1 && echo "cursor:available" || echo "cursor:missing" + +# Check local model servers (OpenAI-compatible HTTP API — no CLI binary required) +OLLAMA_HOST=$(gsd-sdk query config-get review.ollama_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$OLLAMA_HOST" ] || [ "$OLLAMA_HOST" = "null" ]; then OLLAMA_HOST="http://localhost:11434"; fi +curl -s --max-time 2 "${OLLAMA_HOST}/v1/models" >/dev/null 2>&1 && echo "ollama:available" || echo "ollama:missing" + +LM_STUDIO_HOST=$(gsd-sdk query config-get review.lm_studio_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LM_STUDIO_HOST" ] || [ "$LM_STUDIO_HOST" = "null" ]; then LM_STUDIO_HOST="http://localhost:1234"; fi +curl -s --max-time 2 "${LM_STUDIO_HOST}/v1/models" >/dev/null 2>&1 && echo "lm_studio:available" || echo "lm_studio:missing" + +LLAMA_CPP_HOST=$(gsd-sdk query config-get review.llama_cpp_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LLAMA_CPP_HOST" ] || [ "$LLAMA_CPP_HOST" = "null" ]; then LLAMA_CPP_HOST="http://localhost:8080"; fi +curl -s --max-time 2 "${LLAMA_CPP_HOST}/v1/models" >/dev/null 2>&1 && echo "llama_cpp:available" || echo "llama_cpp:missing" ``` Parse flags from `$ARGUMENTS`: @@ -32,7 +45,10 @@ Parse flags from `$ARGUMENTS`: - `--opencode` → include OpenCode - `--qwen` → include Qwen Code - `--cursor` → include Cursor -- `--all` → include all available +- `--ollama` → include Ollama (local server, OpenAI-compatible) +- `--lm-studio` → include LM Studio (local server, OpenAI-compatible) +- `--llama-cpp` → include llama.cpp (local server, OpenAI-compatible) +- `--all` → include all available (CLIs + running local servers) - No flags → include all available If no CLIs are available: @@ -223,7 +239,70 @@ if [ ! -s /tmp/gsd-review-cursor-{phase}.md ]; then fi ``` -If a CLI fails, log the error and continue with remaining CLIs. +**Ollama (local, OpenAI-compatible):** + +Read host and model from config. All three local backends share the same `/v1/chat/completions` endpoint — only host and model differ. Use `jq --rawfile` to safely encode the multi-line prompt as JSON without shell-escaping issues. + +```bash +OLLAMA_HOST=$(gsd-sdk query config-get review.ollama_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$OLLAMA_HOST" ] || [ "$OLLAMA_HOST" = "null" ]; then OLLAMA_HOST="http://localhost:11434"; fi +OLLAMA_MODEL=$(gsd-sdk query config-get review.models.ollama 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$OLLAMA_MODEL" ] || [ "$OLLAMA_MODEL" = "null" ]; then + OLLAMA_MODEL=$(curl -s --max-time 2 "${OLLAMA_HOST}/v1/models" 2>/dev/null | jq -r '.data[0].id // "llama3"' 2>/dev/null || echo "llama3") +fi +jq -n --rawfile content /tmp/gsd-review-prompt-{phase}.md \ + --arg model "$OLLAMA_MODEL" \ + '{model: $model, messages: [{role: "user", content: $content}]}' | \ + curl -s --max-time 120 -X POST "${OLLAMA_HOST}/v1/chat/completions" \ + -H "Content-Type: application/json" -d @- 2>/dev/null | \ + jq -r '.choices[0].message.content // "Ollama review failed or returned empty output."' \ + > /tmp/gsd-review-ollama-{phase}.md +if [ ! -s /tmp/gsd-review-ollama-{phase}.md ]; then + echo "Ollama review failed or returned empty output." > /tmp/gsd-review-ollama-{phase}.md +fi +``` + +**LM Studio (local, OpenAI-compatible):** +```bash +LM_STUDIO_HOST=$(gsd-sdk query config-get review.lm_studio_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LM_STUDIO_HOST" ] || [ "$LM_STUDIO_HOST" = "null" ]; then LM_STUDIO_HOST="http://localhost:1234"; fi +LM_STUDIO_MODEL=$(gsd-sdk query config-get review.models.lm_studio 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LM_STUDIO_MODEL" ] || [ "$LM_STUDIO_MODEL" = "null" ]; then + LM_STUDIO_MODEL=$(curl -s --max-time 2 "${LM_STUDIO_HOST}/v1/models" 2>/dev/null | jq -r '.data[0].id // "local-model"' 2>/dev/null || echo "local-model") +fi +jq -n --rawfile content /tmp/gsd-review-prompt-{phase}.md \ + --arg model "$LM_STUDIO_MODEL" \ + '{model: $model, messages: [{role: "user", content: $content}]}' | \ + curl -s --max-time 120 -X POST "${LM_STUDIO_HOST}/v1/chat/completions" \ + -H "Content-Type: application/json" -d @- 2>/dev/null | \ + jq -r '.choices[0].message.content // "LM Studio review failed or returned empty output."' \ + > /tmp/gsd-review-lm_studio-{phase}.md +if [ ! -s /tmp/gsd-review-lm_studio-{phase}.md ]; then + echo "LM Studio review failed or returned empty output." > /tmp/gsd-review-lm_studio-{phase}.md +fi +``` + +**llama.cpp (local, OpenAI-compatible):** +```bash +LLAMA_CPP_HOST=$(gsd-sdk query config-get review.llama_cpp_host 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LLAMA_CPP_HOST" ] || [ "$LLAMA_CPP_HOST" = "null" ]; then LLAMA_CPP_HOST="http://localhost:8080"; fi +LLAMA_CPP_MODEL=$(gsd-sdk query config-get review.models.llama_cpp 2>/dev/null | jq -r '.' 2>/dev/null || echo "") +if [ -z "$LLAMA_CPP_MODEL" ] || [ "$LLAMA_CPP_MODEL" = "null" ]; then + LLAMA_CPP_MODEL=$(curl -s --max-time 2 "${LLAMA_CPP_HOST}/v1/models" 2>/dev/null | jq -r '.data[0].id // "local-model"' 2>/dev/null || echo "local-model") +fi +jq -n --rawfile content /tmp/gsd-review-prompt-{phase}.md \ + --arg model "$LLAMA_CPP_MODEL" \ + '{model: $model, messages: [{role: "user", content: $content}]}' | \ + curl -s --max-time 120 -X POST "${LLAMA_CPP_HOST}/v1/chat/completions" \ + -H "Content-Type: application/json" -d @- 2>/dev/null | \ + jq -r '.choices[0].message.content // "llama.cpp review failed or returned empty output."' \ + > /tmp/gsd-review-llama_cpp-{phase}.md +if [ ! -s /tmp/gsd-review-llama_cpp-{phase}.md ]; then + echo "llama.cpp review failed or returned empty output." > /tmp/gsd-review-llama_cpp-{phase}.md +fi +``` + +If a CLI or local server fails, log the error and continue with remaining reviewers. Display progress: ``` @@ -242,7 +321,7 @@ Combine all review responses into `{phase_dir}/{padded_phase}-REVIEWS.md`: ```markdown --- phase: {N} -reviewers: [gemini, claude, codex, coderabbit, opencode, qwen, cursor] +reviewers: [gemini, claude, codex, coderabbit, opencode, qwen, cursor, ollama, lm_studio, llama_cpp] reviewed_at: {ISO timestamp} plans_reviewed: [{list of PLAN.md files}] --- @@ -291,6 +370,24 @@ plans_reviewed: [{list of PLAN.md files}] --- +## Ollama Review + +{ollama review content} + +--- + +## LM Studio Review + +{lm_studio review content} + +--- + +## llama.cpp Review + +{llama_cpp review content} + +--- + ## Consensus Summary {synthesize common concerns across all reviewers} diff --git a/tests/plan-review-convergence.test.cjs b/tests/plan-review-convergence.test.cjs index b022d3df09..fc0fe85415 100644 --- a/tests/plan-review-convergence.test.cjs +++ b/tests/plan-review-convergence.test.cjs @@ -3,10 +3,25 @@ * * Validates that the command source and workflow contain the key structural * elements required for correct cross-AI plan convergence loop behavior: - * initial planning gate, review agent spawning, HIGH count detection, - * stall detection, escalation gate, and STATE.md update on convergence. + * initial planning gate, review agent spawning, CYCLE_SUMMARY contract for + * HIGH count extraction, stall detection, escalation gate, and STATE.md update + * on convergence. + * + * v2 additions (#2306-v2): + * - CYCLE_SUMMARY contract replaces raw grep (prevents false stalls from + * accumulated REVIEWS.md history across cycles) + * - workflow.plan_review_convergence config gate (disabled by default) + * - --ws forwarded to review agent (symmetric with replan agent) + * - PARTIALLY RESOLVED / FULLY RESOLVED definitions in contract + * - HIGH_LINES validation warning when HIGH_COUNT > 0 but section absent + * - Success criteria updated to reflect CYCLE_SUMMARY parsing */ +// allow-test-rule: source-text-is-the-product +// The workflow markdown IS the runtime instruction. Testing its text content +// tests the deployed contract — if the CYCLE_SUMMARY requirement is absent, +// the false-stall bug is absent from defenses too. + const { test, describe } = require('node:test'); const assert = require('node:assert/strict'); const fs = require('fs'); @@ -14,6 +29,8 @@ const path = require('path'); const COMMAND_PATH = path.join(__dirname, '..', 'commands', 'gsd', 'plan-review-convergence.md'); const WORKFLOW_PATH = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'plan-review-convergence.md'); +const SCHEMA_PATH = path.join(__dirname, '..', 'get-shit-done', 'bin', 'lib', 'config-schema.cjs'); +const CONFIG_DOC_PATH = path.join(__dirname, '..', 'docs', 'CONFIGURATION.md'); // ─── Command source ──────────────────────────────────────────────────────── @@ -36,6 +53,12 @@ describe('plan-review-convergence command source (#2306)', () => { assert.ok(command.includes('--max-cycles'), 'must document --max-cycles flag'); }); + test('command documents local model reviewer flags (--ollama, --lm-studio, --llama-cpp)', () => { + assert.ok(command.includes('--ollama'), 'must document --ollama flag for local Ollama server'); + assert.ok(command.includes('--lm-studio'), 'must document --lm-studio flag for local LM Studio server'); + assert.ok(command.includes('--llama-cpp'), 'must document --llama-cpp flag for local llama.cpp server'); + }); + test('command references the workflow file via execution_context', () => { assert.ok( command.includes('@$HOME/.claude/get-shit-done/workflows/plan-review-convergence.md'), @@ -80,6 +103,14 @@ describe('plan-review-convergence command source (#2306)', () => { '--codex must be documented as the default reviewer' ); }); + + test('command documents the workflow.plan_review_convergence config key', () => { + assert.ok( + command.includes('workflow.plan_review_convergence') || + command.includes('plan_review_convergence'), + 'command must document the config key required to enable the feature (#2306-v2)' + ); + }); }); // ─── Workflow: initialization ────────────────────────────────────────────── @@ -109,6 +140,41 @@ describe('plan-review-convergence workflow: initialization (#2306)', () => { }); }); +// ─── Workflow: config gate (disabled by default) ─────────────────────────── + +describe('plan-review-convergence workflow: config gate (#2306-v2)', () => { + const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); + + test('workflow checks workflow.plan_review_convergence config key before running', () => { + assert.ok( + workflow.includes('workflow.plan_review_convergence'), + 'workflow must check workflow.plan_review_convergence config key — feature is disabled by default (#2306-v2)' + ); + }); + + test('workflow exits with enable instructions when config key is false', () => { + // Must tell the user how to enable the feature + assert.ok( + workflow.includes('gsd config-set workflow.plan_review_convergence true') || + workflow.includes('config-set workflow.plan_review_convergence'), + 'workflow must show the user how to enable the feature when disabled (#2306-v2)' + ); + }); + + test('workflow defaults config key to false (opt-in, not opt-out)', () => { + // The config-get call must default to false, not true + const configGetMatch = workflow.match(/config-get\s+workflow\.plan_review_convergence[^\n]*/); + assert.ok( + configGetMatch, + 'workflow must read workflow.plan_review_convergence via config-get' + ); + assert.ok( + configGetMatch[0].includes('"false"') || configGetMatch[0].includes("'false'") || configGetMatch[0].includes('false'), + 'workflow must default workflow.plan_review_convergence to false (disabled by default) (#2306-v2)' + ); + }); +}); + // ─── Workflow: initial planning gate ────────────────────────────────────── describe('plan-review-convergence workflow: initial planning gate (#2306)', () => { @@ -148,10 +214,47 @@ describe('plan-review-convergence workflow: convergence loop (#2306)', () => { ); }); - test('workflow detects HIGH concerns by grepping REVIEWS.md', () => { + test('workflow extracts HIGH count from CYCLE_SUMMARY contract, NOT from grepping REVIEWS.md', () => { + // Critical regression guard: REVIEWS.md accumulates history across cycles; + // resolved HIGHs from cycle N remain in the file during cycle N+1 as audit trail, + // inflating raw grep counts and causing false stalls. HIGH count must come from + // the review agent's CYCLE_SUMMARY return message, not from the file. + assert.ok( + workflow.includes('CYCLE_SUMMARY'), + 'workflow must use CYCLE_SUMMARY contract from review agent return message, not raw grep (#2306-v2 false-stall fix)' + ); + assert.ok( + workflow.includes('current_high'), + 'workflow must parse current_high from CYCLE_SUMMARY line' + ); + }); + + test('workflow aborts if review agent omits CYCLE_SUMMARY contract', () => { + assert.ok( + workflow.includes('did not honor the CYCLE_SUMMARY contract') || + workflow.includes('CYCLE_SUMMARY contract'), + 'workflow must abort with clear error when review agent omits CYCLE_SUMMARY (#2306-v2)' + ); + }); + + test('workflow distinguishes malformed CYCLE_SUMMARY from absent CYCLE_SUMMARY', () => { + // Helps debugging: "present but malformed" vs "completely missing" are different errors + assert.ok( + workflow.includes('malformed') || + (workflow.includes('CYCLE_SUMMARY') && workflow.includes('present')), + 'workflow must distinguish malformed CYCLE_SUMMARY from absent one for debuggability (#2306-v2)' + ); + }); + + test('review agent spawn forwards --ws via GSD_WS (symmetric with replan agent)', () => { + // Critical correctness bug: if GSD_WS is not forwarded to the review agent, + // the review reads from the wrong workspace while replanning reads from the correct one. + const reviewAgentBlock = workflow.match(/gsd-review['"`,\s][\s\S]{0,300}?GSD_WS/); assert.ok( - workflow.includes('HIGH_COUNT') || workflow.includes('grep'), - 'workflow must grep REVIEWS.md for HIGH concerns to determine convergence' + reviewAgentBlock || + (workflow.includes("'gsd-review'") && workflow.includes('{GSD_WS}') && + workflow.indexOf('{GSD_WS}') < workflow.indexOf("'gsd-plan-phase'")), + 'review agent spawn must forward {GSD_WS} — workspace flag must reach the reviewer (#2306-v2 --ws fix)' ); }); @@ -186,6 +289,55 @@ describe('plan-review-convergence workflow: convergence loop (#2306)', () => { }); }); +// ─── Workflow: CYCLE_SUMMARY contract definition ────────────────────────── + +describe('plan-review-convergence workflow: CYCLE_SUMMARY contract definition (#2306-v2)', () => { + const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); + + test('review agent prompt defines CYCLE_SUMMARY: current_high= format', () => { + assert.ok( + workflow.includes('CYCLE_SUMMARY: current_high='), + 'review agent spawn prompt must define the CYCLE_SUMMARY: current_high= output format (#2306-v2)' + ); + }); + + test('CYCLE_SUMMARY contract defines PARTIALLY RESOLVED (acknowledged, mitigation incomplete)', () => { + assert.ok( + workflow.includes('PARTIALLY RESOLVED'), + 'CYCLE_SUMMARY INCLUDE list must define PARTIALLY RESOLVED — prevents under-counting of in-progress issues (#2306-v2)' + ); + }); + + test('CYCLE_SUMMARY contract defines FULLY RESOLVED (verified/closed)', () => { + assert.ok( + workflow.includes('FULLY RESOLVED'), + 'CYCLE_SUMMARY EXCLUDE list must define FULLY RESOLVED — prevents over-counting of closed issues (#2306-v2)' + ); + }); + + test('CYCLE_SUMMARY contract requires ## Current HIGH Concerns section in review return', () => { + assert.ok( + workflow.includes('## Current HIGH Concerns'), + 'review agent must provide ## Current HIGH Concerns section so escalation gate can show specific issues (#2306-v2)' + ); + }); +}); + +// ─── Workflow: HIGH_LINES validation ────────────────────────────────────── + +describe('plan-review-convergence workflow: HIGH_LINES validation (#2306-v2)', () => { + const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); + + test('workflow warns when HIGH_COUNT > 0 but ## Current HIGH Concerns section is absent', () => { + // Prevents silent UX degradation: escalation gate shows blank concern list + assert.ok( + workflow.includes('HIGH_LINES') && + (workflow.includes('incomplete escalation') || workflow.includes('Current HIGH Concerns')), + 'workflow must warn when HIGH_COUNT > 0 but HIGH_LINES is empty (contract partially violated) (#2306-v2)' + ); + }); +}); + // ─── Workflow: stall detection ───────────────────────────────────────────── describe('plan-review-convergence workflow: stall detection (#2306)', () => { @@ -247,21 +399,16 @@ describe('plan-review-convergence workflow: stall detection behavioral (#2306)', const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); test('workflow surfaces stall warning when prev_high_count equals current HIGH_COUNT', () => { - // Behavioral test: two consecutive cycles with the same HIGH count must trigger - // the stall warning. The workflow must compare HIGH_COUNT >= prev_high_count and - // emit a warning string that would appear in output. assert.ok( workflow.includes('prev_high_count') || workflow.includes('prev_HIGH'), 'workflow must track prev_high_count across cycles' ); - // The comparison that detects the stall assert.ok( workflow.includes('HIGH_COUNT >= prev_high_count') || workflow.includes('HIGH_COUNT >= prev_HIGH') || workflow.includes('not decreasing'), 'workflow must compare current HIGH count against previous to detect stall' ); - // The stall warning text that appears in output assert.ok( workflow.includes('stall') || workflow.includes('Stall') || workflow.includes('not decreasing'), 'workflow must emit a stall warning when HIGH count is not decreasing' @@ -275,16 +422,12 @@ describe('plan-review-convergence workflow: --max-cycles 1 immediate escalation const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); test('workflow escalates immediately after cycle 1 when --max-cycles 1 and HIGH > 0', () => { - // Behavioral test: when max_cycles=1, after the first review cycle, if HIGH_COUNT > 0 - // the workflow must trigger the escalation gate (cycle >= MAX_CYCLES check fires on - // cycle 1 itself). Verify the workflow contains the logic for this edge case. assert.ok( workflow.includes('cycle >= MAX_CYCLES') || workflow.includes('cycle >= max_cycles') || (workflow.includes('MAX_CYCLES') && workflow.includes('AskUserQuestion')), 'workflow must check cycle >= MAX_CYCLES so --max-cycles 1 triggers escalation after first cycle' ); - // Escalation gate must fire when HIGH > 0 (not just at exactly max_cycles) assert.ok( workflow.includes('HIGH_COUNT > 0') || workflow.includes('HIGH concerns remain') || @@ -313,3 +456,154 @@ describe('plan-review-convergence workflow: artifact verification (#2306)', () = ); }); }); + +// ─── Workflow: success criteria ──────────────────────────────────────────── + +describe('plan-review-convergence workflow: success criteria (#2306-v2)', () => { + const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); + + test('success criteria references CYCLE_SUMMARY parsing, not grep HIGHs', () => { + const successBlock = workflow.slice(workflow.lastIndexOf('')); + assert.ok( + successBlock.includes('CYCLE_SUMMARY') || successBlock.includes('parse'), + 'success_criteria must reflect that orchestrator parses CYCLE_SUMMARY, not greps REVIEWS.md (#2306-v2)' + ); + assert.ok( + !successBlock.includes('grep HIGHs'), + 'success_criteria must NOT say "grep HIGHs" — that was the false-stall bug (#2306-v2)' + ); + }); +}); + +// ─── Config schema registration ─────────────────────────────────────────── + +describe('plan-review-convergence config schema registration (#2306-v2)', () => { + const schema = fs.readFileSync(SCHEMA_PATH, 'utf8'); + + test('workflow.plan_review_convergence is registered in config-schema.cjs', () => { + assert.ok( + schema.includes("'workflow.plan_review_convergence'"), + "workflow.plan_review_convergence must be registered in VALID_CONFIG_KEYS in config-schema.cjs so gsd config-set accepts it (#2306-v2)" + ); + }); +}); + +// ─── CONFIGURATION.md documentation ────────────────────────────────────── + +describe('plan-review-convergence CONFIGURATION.md documentation (#2306-v2)', () => { + const configDoc = fs.readFileSync(CONFIG_DOC_PATH, 'utf8'); + + test('workflow.plan_review_convergence is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('workflow.plan_review_convergence'), + 'workflow.plan_review_convergence must be documented in docs/CONFIGURATION.md — schema/docs parity test enforces this (#2306-v2)' + ); + }); + + test('CONFIGURATION.md entry documents disabled-by-default behavior', () => { + const row = configDoc.match(/workflow\.plan_review_convergence[^\n]*/); + assert.ok(row, 'workflow.plan_review_convergence row must exist in CONFIGURATION.md'); + assert.ok( + row[0].includes('false') || row[0].includes('disabled'), + 'CONFIGURATION.md entry must document that the feature defaults to false (disabled by default) (#2306-v2)' + ); + }); +}); + +// ─── Local model reviewer support ──────────────────────────────────────── + +describe('plan-review-convergence local model reviewer flags (#2306-local)', () => { + const workflow = fs.readFileSync(WORKFLOW_PATH, 'utf8'); + + test('workflow parses --ollama flag into REVIEWER_FLAGS', () => { + assert.ok( + workflow.includes('--ollama'), + 'workflow must parse --ollama flag so it is forwarded to the review agent' + ); + }); + + test('workflow parses --lm-studio flag into REVIEWER_FLAGS', () => { + assert.ok( + workflow.includes('--lm-studio'), + 'workflow must parse --lm-studio flag so it is forwarded to the review agent' + ); + }); + + test('workflow parses --llama-cpp flag into REVIEWER_FLAGS', () => { + assert.ok( + workflow.includes('--llama-cpp'), + 'workflow must parse --llama-cpp flag so it is forwarded to the review agent' + ); + }); +}); + +describe('plan-review-convergence local model config schema registration (#2306-local)', () => { + const schema = fs.readFileSync(SCHEMA_PATH, 'utf8'); + + test('review.ollama_host is registered in config-schema.cjs', () => { + assert.ok( + schema.includes("'review.ollama_host'"), + "review.ollama_host must be in VALID_CONFIG_KEYS so gsd config-set accepts it" + ); + }); + + test('review.lm_studio_host is registered in config-schema.cjs', () => { + assert.ok( + schema.includes("'review.lm_studio_host'"), + "review.lm_studio_host must be in VALID_CONFIG_KEYS so gsd config-set accepts it" + ); + }); + + test('review.llama_cpp_host is registered in config-schema.cjs', () => { + assert.ok( + schema.includes("'review.llama_cpp_host'"), + "review.llama_cpp_host must be in VALID_CONFIG_KEYS so gsd config-set accepts it" + ); + }); +}); + +describe('plan-review-convergence local model CONFIGURATION.md documentation (#2306-local)', () => { + const configDoc = fs.readFileSync(CONFIG_DOC_PATH, 'utf8'); + + test('review.ollama_host is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.ollama_host'), + 'review.ollama_host must be documented in docs/CONFIGURATION.md' + ); + }); + + test('review.lm_studio_host is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.lm_studio_host'), + 'review.lm_studio_host must be documented in docs/CONFIGURATION.md' + ); + }); + + test('review.llama_cpp_host is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.llama_cpp_host'), + 'review.llama_cpp_host must be documented in docs/CONFIGURATION.md' + ); + }); + + test('review.models.ollama is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.models.ollama'), + 'review.models.ollama must be documented so users know how to configure the local model name' + ); + }); + + test('review.models.lm_studio is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.models.lm_studio'), + 'review.models.lm_studio must be documented so users know how to configure the local model name' + ); + }); + + test('review.models.llama_cpp is documented in CONFIGURATION.md', () => { + assert.ok( + configDoc.includes('review.models.llama_cpp'), + 'review.models.llama_cpp must be documented so users know how to configure the local model name' + ); + }); +}); From 60af2736b6314837018d1f3bc67badde15a281f0 Mon Sep 17 00:00:00 2001 From: Tom Boucher Date: Sat, 25 Apr 2026 14:09:06 -0400 Subject: [PATCH 2/4] fix(ci): sync sdk/src/query/config-schema.ts with CJS schema (#2306) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add workflow.plan_review_convergence, review.ollama_host, review.lm_studio_host, and review.llama_cpp_host to the SDK-side TypeScript mirror — required by the CJS↔SDK parity test (#2653). Co-Authored-By: Claude Sonnet 4.6 --- sdk/src/query/config-schema.ts | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sdk/src/query/config-schema.ts b/sdk/src/query/config-schema.ts index ab2b3e6ac2..cadf82a26e 100644 --- a/sdk/src/query/config-schema.ts +++ b/sdk/src/query/config-schema.ts @@ -36,6 +36,7 @@ export const VALID_CONFIG_KEYS: ReadonlySet = new Set([ 'workflow.plan_bounce_script', 'workflow.plan_bounce_passes', 'workflow.plan_chunked', + 'workflow.plan_review_convergence', 'workflow.post_planning_gaps', 'workflow.security_enforcement', 'workflow.security_asvs_level', @@ -44,6 +45,7 @@ export const VALID_CONFIG_KEYS: ReadonlySet = new Set([ 'workflow.drift_action', 'git.branching_strategy', 'git.base_branch', 'git.phase_branch_template', 'git.milestone_branch_template', 'git.quick_branch_template', 'planning.commit_docs', 'planning.search_gitignored', 'planning.sub_repos', + 'review.ollama_host', 'review.lm_studio_host', 'review.llama_cpp_host', 'workflow.cross_ai_execution', 'workflow.cross_ai_command', 'workflow.cross_ai_timeout', 'workflow.subagent_timeout', 'workflow.inline_plan_threshold', From 88cd0bcde12464095fe16e9b1e15ad4c722c1c71 Mon Sep 17 00:00:00 2001 From: Tom Boucher Date: Sat, 25 Apr 2026 14:14:05 -0400 Subject: [PATCH 3/4] fix(#2306): resolve CodeRabbit review findings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Anchor HIGH_COUNT extraction with head -1 to prevent multi-match when agent return message contains multiple CYCLE_SUMMARY lines (e.g. quoted back from prompt context) - Replace hardcoded reviewers list in REVIEWS.md frontmatter template with runtime-derived placeholder — the static list did not reflect which reviewers were actually invoked - Broaden workflow.plan_review_convergence docs to include local reviewers (Ollama, LM Studio, llama.cpp) alongside cloud reviewers Co-Authored-By: Claude Sonnet 4.6 --- docs/CONFIGURATION.md | 2 +- get-shit-done/workflows/plan-review-convergence.md | 2 +- get-shit-done/workflows/review.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index f6119696aa..858ca6b130 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -195,7 +195,7 @@ All workflow toggles follow the **absent = enabled** pattern. If a key is missin | `workflow.plan_bounce_script` | string | (none) | Path to the external script invoked for plan bounce validation. Receives the PLAN.md path as its first argument. Required when `plan_bounce` is `true`. Added in v1.36 | | `workflow.plan_bounce_passes` | number | `2` | Number of sequential bounce passes to run. Each pass feeds the previous pass's output back into the validator. Higher values increase rigor at the cost of latency. Added in v1.36 | | `workflow.post_planning_gaps` | boolean | `true` | Unified post-planning gap report (#2493). After all plans are generated and committed, scans REQUIREMENTS.md and CONTEXT.md `` against every PLAN.md in the phase directory, then prints one `Source \| Item \| Status` table. Word-boundary matching (REQ-1 vs REQ-10) and natural sort (REQ-02 before REQ-10). Non-blocking — informational report only. Set to `false` to skip Step 13e of plan-phase. | -| `workflow.plan_review_convergence` | boolean | `false` | Enable the `/gsd-plan-review-convergence` command. Disabled by default — the command exits with an enable instruction when this key is `false`. The command automates the manual plan→review→replan loop: it spawns external AI reviewers (Codex, Gemini, Claude, OpenCode), counts unresolved HIGH concerns via the CYCLE_SUMMARY contract, replans with `--reviews` feedback, and repeats until converged or max cycles reached. Enable with `gsd config-set workflow.plan_review_convergence true`. Added in v1.39 | +| `workflow.plan_review_convergence` | boolean | `false` | Enable the `/gsd-plan-review-convergence` command. Disabled by default — the command exits with an enable instruction when this key is `false`. The command automates the manual plan→review→replan loop: it spawns configured reviewers (Codex, Gemini, Claude, OpenCode, Ollama, LM Studio, llama.cpp), counts unresolved HIGH concerns via the CYCLE_SUMMARY contract, replans with `--reviews` feedback, and repeats until converged or max cycles reached. Enable with `gsd config-set workflow.plan_review_convergence true`. Added in v1.39 | | `workflow.plan_chunked` | boolean | `false` | Enable chunked planning mode. When `true` (or when `--chunked` flag is passed to `/gsd-plan-phase`), the orchestrator splits the single long-lived planner Task into a short outline Task followed by N short per-plan Tasks (~3-5 min each). Each plan is committed individually for crash resilience. If a Task hangs and the terminal is force-killed, rerunning with `--chunked` resumes from the last completed plan. Particularly useful on Windows where long-lived Tasks may hang on stdio. Added in v1.38 | | `workflow.code_review_command` | string | (none) | Shell command for external code review integration in `/gsd-ship`. Receives changed file paths via stdin. Non-zero exit blocks the ship workflow. Added in v1.36 | | `workflow.tdd_mode` | boolean | `false` | Enable TDD pipeline as a first-class execution mode. When `true`, the planner aggressively applies `type: tdd` to eligible tasks (business logic, APIs, validations, algorithms) and the executor enforces RED/GREEN/REFACTOR gate sequence. An end-of-phase collaborative review checkpoint verifies gate compliance. Added in v1.36 | diff --git a/get-shit-done/workflows/plan-review-convergence.md b/get-shit-done/workflows/plan-review-convergence.md index f0e557899e..9440531408 100644 --- a/get-shit-done/workflows/plan-review-convergence.md +++ b/get-shit-done/workflows/plan-review-convergence.md @@ -190,7 +190,7 @@ Parse HIGH_COUNT from the review agent's return message via the CYCLE_SUMMARY co ```bash # Extract the integer from "CYCLE_SUMMARY: current_high=N" in the agent's return message -HIGH_COUNT=$(echo "$REVIEW_AGENT_RETURN" | grep -oE 'CYCLE_SUMMARY:\s*current_high=[0-9]+' | grep -oE '[0-9]+$') +HIGH_COUNT=$(echo "$REVIEW_AGENT_RETURN" | grep -oE 'CYCLE_SUMMARY:\s*current_high=[0-9]+' | head -1 | grep -oE '[0-9]+$') if [ -z "$HIGH_COUNT" ]; then # Distinguish malformed contract from completely absent contract diff --git a/get-shit-done/workflows/review.md b/get-shit-done/workflows/review.md index f52611c283..1007616e32 100644 --- a/get-shit-done/workflows/review.md +++ b/get-shit-done/workflows/review.md @@ -321,7 +321,7 @@ Combine all review responses into `{phase_dir}/{padded_phase}-REVIEWS.md`: ```markdown --- phase: {N} -reviewers: [gemini, claude, codex, coderabbit, opencode, qwen, cursor, ollama, lm_studio, llama_cpp] +reviewers: [{comma-separated list of reviewers actually invoked this run}] reviewed_at: {ISO timestamp} plans_reviewed: [{list of PLAN.md files}] --- From 4d642bdb350a8dafba11ef9a4ba92ea6ed874939 Mon Sep 17 00:00:00 2001 From: Tom Boucher Date: Sat, 25 Apr 2026 14:16:44 -0400 Subject: [PATCH 4/4] fix(ci): restore reviewers frontmatter list with runtime note MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The cursor-reviewer.test.cjs (and equivalent per-reviewer tests) assert that each supported reviewer appears on the reviewers: line — these are wiring tests that catch when a new reviewer is added to invocation but not to the REVIEWS.md template. Replacing the list with a placeholder broke those tests. Restore the full static list and add an inline comment clarifying that the actual committed frontmatter should be filtered to only the reviewers invoked that run — satisfying both the per-reviewer tests and the CodeRabbit correctness note. Co-Authored-By: Claude Sonnet 4.6 --- get-shit-done/workflows/review.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/get-shit-done/workflows/review.md b/get-shit-done/workflows/review.md index 1007616e32..16d729418a 100644 --- a/get-shit-done/workflows/review.md +++ b/get-shit-done/workflows/review.md @@ -321,7 +321,7 @@ Combine all review responses into `{phase_dir}/{padded_phase}-REVIEWS.md`: ```markdown --- phase: {N} -reviewers: [{comma-separated list of reviewers actually invoked this run}] +reviewers: [gemini, claude, codex, coderabbit, opencode, qwen, cursor, ollama, lm_studio, llama_cpp] # populate at runtime with only the reviewers actually invoked reviewed_at: {ISO timestamp} plans_reviewed: [{list of PLAN.md files}] ---