Skip to content

feat: add gates ensuring discuss-phase decisions are translated to plans and verified (closes #2492)#2611

Merged
trek-e merged 4 commits intogsd-build:mainfrom
trek-e:feat/2492-context-coverage-gate
Apr 23, 2026
Merged

feat: add gates ensuring discuss-phase decisions are translated to plans and verified (closes #2492)#2611
trek-e merged 4 commits intogsd-build:mainfrom
trek-e:feat/2492-context-coverage-gate

Conversation

@trek-e
Copy link
Copy Markdown
Collaborator

@trek-e trek-e commented Apr 23, 2026

TL;DR

Two gates close the loop between discuss-phase decisions and downstream work, so a decision captured in CONTEXT.md <decisions> cannot silently disappear during planning or execution.

Closes #2492

What

  • Plan-phase translation gate (BLOCKING). Inserted as Step 13a in workflows/plan-phase.md, after the existing requirements coverage gate and before plans are committed. Each trackable decision in <decisions> must be cited (by id D-NN or by 6+-word phrase) in at least one plan's designated sections — front-matter must_haves / truths / objective, or body sections under must_haves / truths / tasks / objective headings. HTML comments and fenced code blocks are stripped before searching, so a commented-out citation or a literal example never counts as coverage. The shell snippet wraps the SDK call with jq -e '.data.passed == true' || exit 1 so a coverage gap actually halts the workflow rather than only printing a warning.
  • Verify-phase validation gate (NON-BLOCKING). Inserted as a <step name="verify_decisions"> in workflows/verify-phase.md, before behavioral_verification. Each trackable decision is searched against shipped artifacts (PLAN.md, SUMMARY.md, files modified, recent commit messages — last 200). Misses are reported in VERIFICATION.md as a warning section but do not flip the overall verification status.
  • Shared parser parseDecisions() in sdk/src/query/decisions.ts — exposed as decisions.parse query handler — so Add unified post-planning gap checker for requirements and context #2493 (post-planning gap checker) can consume the same logic without duplicating decision-parsing code.
  • Config key workflow.context_coverage_gate (boolean, default true). Added to WorkflowConfig, CONFIG_DEFAULTS, VALID_CONFIG_KEYS, and checkConfigGates. When false, both gates skip silently. Workstream-scoped configs are honored — the handlers thread workstream through loadConfig so per-workstream overrides take effect.

Why

The issue describes a real-world failure: discuss-phase captured ~40 decisions, plans were created without implementing several of them, and verification passed because it only checked plan objectives. Comments and workarounds don't fix this; instrumentation does.

Why the asymmetry (plan blocks, verify warns)

  • Plan time is cheap. Failing at plan time costs minutes — the user re-cites the decision id in a plan, or moves the decision to Discretion.
  • Verify time is expensive. Thousands of dollars of executor work has already happened. A fuzzy substring miss should not fail an otherwise green phase. The verify gate exists to surface drift, not to block.

Design decisions

  1. Decision IDs are normative. discuss-phase.md already produces D-01, D-02, … — we lean on that. Citing D-NN in a plan is the deterministic, cheap path that always passes.
  2. Soft phrase match uses the first 6 normalized words of the decision text and is gated on word count: decisions shorter than 6 words must be cited by id (no soft-match path exists for them).
  3. Plan haystack is restricted to designated sections. This keeps the user's contract clear — there is exactly one place to put a coverage citation deliberately, and exactly one way to make a decision deliberately uncovered. Verify-phase keeps the broader artifact-wide haystack on purpose (non-blocking).
  4. Opt-outs: decisions under ### Claude's Discretion (any unicode quote variant) or tagged [informational] / [folded] / [deferred] are not tracked.
  5. Single source of truth for parsing. parseDecisions() is the only place that reads CONTEXT.md <decisions> blocks. Add unified post-planning gap checker for requirements and context #2493 will import it.
  6. No new agent. Gates run as inline workflow steps invoking SDK query handlers — no new subagent spawn.

Documentation

  • docs/CONFIGURATION.md — "Decision Coverage Gates" section with the config key, the asymmetry, the matching strategy, and opt-outs.
  • docs/USER-GUIDE.md — "Decision Coverage Gates" section explaining how to write decisions the gates accept.
  • docs/ARCHITECTURE.md — Phase Execution Flow diagram updated to call out both gates.
  • workflows/plan-phase.md and workflows/verify-phase.md — gate insertion points, skip clauses, failure UX.

Tests

  • sdk/src/query/decisions.test.ts — parser tests including fenced-code stripping, tab-indented continuation, multi-block extraction, unicode-quote discretion variants, and the relative-path resolution of decisions.parse.
  • sdk/src/query/check-decision-coverage.test.ts — gate tests covering every spec case PLUS adversarial-review regressions: HTML-comment citation rejected, code-fence citation rejected, sub-6-word decision requires id, multi-summary files_modified aggregation, path-traversal rejection, workstream-scoped skip, numeric-config validation warning.
  • tests/bug-2492-context-coverage-gate.test.cjs — workflow-as-prompt regression tests verifying gate steps, anchored-heading ordering, the ${CONTEXT_PATH} casing, and the || exit 1 block.

Full root suite: 5409/5409 passing. SDK gate + parser tests: 36/36 passing.

Cross-reference

Intersects with #2493 (post-planning gap checker). The shared parseDecisions() helper is the single source of truth — #2493 should import it rather than duplicate parsing logic.

Adversarial review fixes (rolled into this PR)

20 review findings against the initial commit have been fixed in three follow-up commits on this branch. Highlights:

  • F1: gate snippet referenced ${context_path} (lowercase) — the BLOCKING gate would never run because the variable is defined as CONTEXT_PATH.
  • F15: gate printed the failure message but lacked an exit 1 so the workflow continued past failure — now wrapped in a jq -e ... || exit 1.
  • F3: gate handlers ignored the workstream arg, so workstream-scoped context_coverage_gate=false had no effect.
  • F4: previous matcher accepted citations anywhere in the plan body, including inside HTML comments and code fences. Now restricted to designated sections.
  • F6/F7: SUMMARY parser only saw the first files_modified block and accepted absolute / ../ paths. Now walks every summary, validates each path stays inside projectDir, and caps reads at 256 KB.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Decision Coverage Gates: Plans must reference decisions from CONTEXT.md; verify phase checks artifact coverage with warnings
    • New workflow.context_coverage_gate configuration option to enable/disable gates (enabled by default)
  • Documentation

    • Architecture, configuration, and user guide updated with decision coverage gate details and matching rules

@trek-e trek-e requested a review from glittercowboy as a code owner April 23, 2026 02:20
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 23, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e6c11079-874e-4a71-a120-727446083301

📥 Commits

Reviewing files that changed from the base of the PR and between 1a3d953 and b599883.

📒 Files selected for processing (16)
  • docs/ARCHITECTURE.md
  • docs/CONFIGURATION.md
  • docs/USER-GUIDE.md
  • get-shit-done/workflows/plan-phase.md
  • get-shit-done/workflows/verify-phase.md
  • sdk/src/config.ts
  • sdk/src/query/check-decision-coverage.test.ts
  • sdk/src/query/check-decision-coverage.ts
  • sdk/src/query/config-gates.ts
  • sdk/src/query/config-mutation.test.ts
  • sdk/src/query/config-mutation.ts
  • sdk/src/query/decisions.test.ts
  • sdk/src/query/decisions.ts
  • sdk/src/query/index.ts
  • sdk/src/query/utils.ts
  • tests/bug-2492-context-coverage-gate.test.cjs

📝 Walkthrough

Walkthrough

Implements issue #2492: two new workflow gates for decision coverage. The plan-phase gate (blocking) ensures all discuss-phase decisions from CONTEXT.md are referenced in plans. The verify-phase gate (non-blocking) validates these decisions appear in shipped artifacts and surfaces any gaps. Both gates can be disabled via configuration flag.

Changes

Cohort / File(s) Summary
Configuration & Type System
sdk/src/config.ts, sdk/src/query/utils.ts
Added context_coverage_gate boolean config flag (default true) to WorkflowConfig and CONFIG_DEFAULTS. Generalized QueryResult and QueryHandler types to support typed data payloads.
Decision Parsing
sdk/src/query/decisions.ts, sdk/src/query/decisions.test.ts
New decision parser for CONTEXT.md <decisions> blocks. Extracts decision ID, text, category, and tags; marks decisions trackable/non-trackable based on section ("Claude's Discretion") and tags ([informational], [folded], [deferred]). Includes 215-line comprehensive test suite.
Coverage Gate Implementation
sdk/src/query/check-decision-coverage.ts, sdk/src/query/check-decision-coverage.test.ts
Two gate handlers: checkDecisionCoveragePlan (blocking, plan-phase) validates decisions appear in plan must_haves/truths/body; checkDecisionCoverageVerify (non-blocking, verify-phase) checks decisions honored in shipped artifacts. Includes 519-line test suite with soft/strict matching, skip conditions, and path traversal guards.
Configuration & Registry
sdk/src/query/config-mutation.ts, sdk/src/query/config-gates.ts, sdk/src/query/index.ts, sdk/src/query/config-mutation.test.ts
Added context_coverage_gate to config key allowlist, query registry, and config-gates response. Registered new handlers (decisions.parse, check.decision-coverage-plan, check.decision-coverage-verify).
Workflow Specifications
get-shit-done/workflows/plan-phase.md, get-shit-done/workflows/verify-phase.md
Inserted mandatory decision-coverage gate after requirements validation in plan-phase (66 lines, blocking, skippable via config); added warning-only verification gate in verify-phase (52 lines, appends to VERIFICATION.md without affecting pass/fail).
Documentation
docs/ARCHITECTURE.md, docs/CONFIGURATION.md, docs/USER-GUIDE.md
Updated architecture diagrams to show new gates; added detailed configuration docs for both gates (covering matching logic, skip conditions, opt-out tags); added user guide documenting gate behavior, decision tagging, and config toggle.
Integration Tests
tests/bug-2492-context-coverage-gate.test.cjs
End-to-end regression suite validating gate presence in workflows, correct variable passing, blocking semantics in plan-phase, non-blocking in verify-phase, and SDK wiring (config key, registry entries).

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant PlanPhase as Plan-Phase Workflow
    participant ContextMD as CONTEXT.md
    participant Plans
    participant CoveragePlan as Decision Coverage Gate
    participant StateFile as STATE.md

    User->>PlanPhase: Start planning
    PlanPhase->>ContextMD: Read <decisions> block
    ContextMD-->>PlanPhase: Extract decision list (D-01, D-02, ...)
    PlanPhase->>Plans: Load all *-PLAN.md files
    Plans-->>CoveragePlan: Decision text from must_haves, truths, body
    CoveragePlan->>CoveragePlan: Match each decision ID/phrase<br/>(strict ID or soft phrase match)
    alt All decisions covered
        CoveragePlan-->>PlanPhase: passed=true, coverage summary
        PlanPhase->>StateFile: Mark phase as planned
        PlanPhase-->>User: Planning complete
    else Decisions not covered
        CoveragePlan-->>PlanPhase: passed=false, list uncovered IDs
        PlanPhase-->>User: Decision gap detected<br/>(re-plan, mark discretion, or override)
        alt User chooses override
            PlanPhase->>StateFile: Record override for verify-phase
            PlanPhase->>StateFile: Mark phase as planned
        else User re-plans
            User->>Plans: Update plans with missing decisions
            PlanPhase->>Plans: Retry gate
        end
    end
Loading
sequenceDiagram
    participant VerifyPhase as Verify-Phase Workflow
    participant ContextMD as CONTEXT.md
    participant Plans
    participant Summaries as SUMMARY files
    participant Artifacts as Shipped Artifacts
    participant CoverageVerify as Decision Coverage Gate
    participant VerificationMD as VERIFICATION.md

    VerifyPhase->>ContextMD: Read <decisions> block
    ContextMD-->>CoverageVerify: Extract trackable decisions
    par Gather Evidence
        VerifyPhase->>Plans: Load plan contents
        Plans-->>CoverageVerify: Plan text
        VerifyPhase->>Summaries: Load SUMMARY files
        Summaries-->>CoverageVerify: Shipped artifact paths, commit subjects
        VerifyPhase->>Artifacts: Read files_modified content (bounded)
        Artifacts-->>CoverageVerify: Modified file contents
    end
    CoverageVerify->>CoverageVerify: Search for each decision<br/>(ID or phrase match)
    CoverageVerify-->>VerificationMD: Append Decision Coverage section<br/>(honored, not_honored list)
    CoverageVerify-->>VerifyPhase: blocking=false, coverage report
    VerifyPhase-->>VerifyPhase: Continue verification<br/>(decision gate does not affect<br/>pass/fail decision tree)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

area: core, size/XL, review: needs discussion

Suggested reviewers

  • glittercowboy

Poem

🐰 In CONTEXT bright, decisions dance,
Now plans must honor every chance!
Verify checks what truly shipped,
No decision gap shall slip away stripped.
The gates stand tall, both kind and keen,
hop hop — a workflow most serene! 🎯

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

trek-e and others added 3 commits April 22, 2026 23:15
…d and verified

Two gates close the loop between CONTEXT.md `<decisions>` and downstream
work, fixing #2492:

- Plan-phase **translation gate** (BLOCKING). After requirements
  coverage, refuses to mark a phase planned when a trackable decision
  is not cited (by id `D-NN` or by 6+-word phrase) in any plan's
  `must_haves`, `truths`, or body. Failure message names each missed
  decision with id, category, text, and remediation paths.

- Verify-phase **validation gate** (NON-BLOCKING). Searches plans,
  SUMMARY.md, files modified, and recent commit subjects for each
  trackable decision. Misses are written to VERIFICATION.md as a
  warning section but do not change verification status. Asymmetry is
  deliberate — fuzzy-match miss should not fail an otherwise green
  phase.

Shared helper `parseDecisions()` lives in `sdk/src/query/decisions.ts`
so #2493 can consume the same parser.

Decisions opt out of both gates via `### Claude's Discretion` heading
or `[informational]` / `[folded]` / `[deferred]` tags.

Both gates skip silently when `workflow.context_coverage_gate=false`
(default `true`).

Closes #2492

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…F8, F9, F10, F15)

- F1: replace `${context_path}` with `${CONTEXT_PATH}` in the plan-phase
  gate snippet so the BLOCKING gate receives a non-empty path. The
  variable was defined in Step 4 (`CONTEXT_PATH=$(_gsd_field "$INIT" ...)`)
  and the gate snippet referenced the lowercase form, leaving the gate to
  run with an empty path argument and silently skip.
- F15: wrap the SDK call with `jq -e '.data.passed == true' || exit 1` so
  failure halts the workflow instead of being printed and ignored. The
  verify-phase counterpart deliberately keeps no exit-1 (non-blocking by
  design) and now carries an inline note documenting the asymmetry.
- F10: tag the JSON example fence as `json` and the options-list fence as
  `text` (MD040).
- F8/F9: anchor the heading-presence test regexes to `^## 13[a-z]?\\.` so
  prose substrings like "Requirements Coverage Gate" mentioned in body
  text cannot satisfy the assertion. Added two new regression tests
  (variable-name match, exit-1 guard) so a future revert is caught.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nd config drift (review F3,F4,F5,F6,F7,F16,F18,F19)

- F3: forward `workstream` arg through both gate handlers so workstream-scoped
  `workflow.context_coverage_gate=false` actually skips. Added negative test
  that creates a workstream config disabling the gate while the root config
  has it enabled and asserts the workstream call is skipped.
- F4: restrict the plan-phase haystack to designated sections — front-matter
  `must_haves` / `truths` / `objective` plus body sections under headings
  matching `must_haves|truths|tasks|objective`. HTML comments and fenced
  code blocks are stripped before extraction so a commented-out citation or
  a literal example never counts as coverage. Verify-phase keeps the broader
  artifact-wide haystack by design (non-blocking).
- F5: reject decisions with fewer than 6 normalized words from soft-matching
  (previously only rejected when the resulting phrase was under 12 chars
  AFTER slicing — too lenient). Short decisions now require an explicit
  `D-NN` citation, with regression tests for the boundary.
- F6: walk every `*-SUMMARY.md` independently and use `matchAll` with the
  `/g` flag so multiple `files_modified:` blocks across multiple summaries
  are all aggregated. Previously only the first block in the concatenated
  string was parsed, silently dropping later plans' files.
- F7: validate every `files_modified` path stays inside `projectDir` after
  resolution (rejects absolute paths, `../` traversal). Cap each file read
  at 256 KB. Skipped paths emit a stderr warning naming the entry.
- F16: validate `workflow.context_coverage_gate` is boolean in
  `loadGateConfig`; warn loudly on numeric or other-shaped values and
  default to ON. Mirrors the schema-vs-loadConfig validation gap from
  #2609.
- F18: bump verify-phase `git log -n` cap from 50 to 200 so longer-running
  phases are not undercounted. Documented as a precision-vs-recall tradeoff
  appropriate for a non-blocking gate.
- F19: tighten `QueryResult` / `QueryHandler` to be parameterized
  (`<T = unknown>`). Drops the `as unknown as Record<string, unknown>`
  casts in the gate handlers and surfaces shape mismatches at compile time
  for callers that pass a typed `data` value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@trek-e trek-e force-pushed the feat/2492-context-coverage-gate branch from 89f75b2 to 4b54621 Compare April 23, 2026 04:21
…,F12,F13,F14,F17,F20)

- F11: strip fenced code blocks from CONTEXT.md before searching for
  `<decisions>` so an example block inside ``` ``` is not mis-parsed.
- F12: accept tab-indented continuation lines (previously required a leading
  space) so decisions split with `\t` continue cleanly.
- F13: parse EVERY `<decisions>` block in the file via `matchAll`, not just
  the first. CONTEXT.md may legitimately carry more than one block.
- F14: `decisions.parse` handler now resolves a relative path against
  `projectDir` — symmetric with the gate handlers — and still accepts
  absolute paths.
- F17: replace `ls "${PHASE_DIR}"/*-CONTEXT.md | head -1` in verify-phase.md
  with a glob loop (ShellCheck SC2012 fix). Also avoids spawning an extra
  subprocess and survives filenames with whitespace.
- F20: extend the unicode quote-stripping in the discretion-heading match
  to cover U+2018/2019/201A/201B and the U+201C-F double-quote variants
  plus backtick, so any rendering of "Claude's Discretion" collapses to
  the same key.

Each fix has a regression test in `decisions.test.ts`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@trek-e
Copy link
Copy Markdown
Collaborator Author

trek-e commented Apr 23, 2026

Adversarial review — all 20 findings resolved

Three follow-up commits on this branch (05b53881, 4b546216, b5998830) close every review finding. Branch was rebased on origin/main (which already included #2493 / #2517 / #2607 / #2551) cleanly with no conflicts before applying fixes. npm run test:coverage is green (5409/0/0); SDK gate + parser vitest is 36/36.

F# Severity Finding Commit Status
F1 Critical ${context_path} undefined — gate never ran 05b53881 Fixed
F2 Critical Branch 3 commits behind main; potential conflicts rebase Fixed (clean rebase)
F3 Critical workstream arg dropped by gate handlers 4b546216 Fixed
F4 Major Plan haystack matched anywhere; docs claimed must_haves/truths 4b546216 Fixed (restricted to designated sections)
F5 Major Soft-phrase length check inverted 4b546216 Fixed (word-count gate; <6-word decisions require id)
F6 Major files_modified only first block parsed 4b546216 Fixed (per-summary matchAll + /g)
F7 Major Path traversal / DoS via SUMMARY files_modified 4b546216 Fixed (root-containment + 256 KB cap)
F8 Major Commit Plans substring trap in test 05b53881 Fixed (anchored heading regex)
F9 Major Requirements Coverage Gate substring trap in test 05b53881 Fixed (anchored heading regex)
F10 Minor MD040 fence missing language tag 05b53881 Fixed (json + text)
F11 Minor <decisions> extractor matched inside fenced code b5998830 Fixed (strip fences first)
F12 Minor Continuation regex required space; dropped tabs b5998830 Fixed (/^[ \t]/)
F13 Minor Only first <decisions> block parsed b5998830 Fixed (matchAll)
F14 Minor decisions.parse ignored projectDir b5998830 Fixed (resolves relative paths)
F15 Major Plan-phase blocking gate had no exit 1 05b53881 Fixed (jq -e ... || exit 1)
F16 Minor No type validation on context_coverage_gate 4b546216 Fixed (boolean check + warn on number)
F17 Minor ls | head in verify-phase (SC2012) b5998830 Fixed (glob loop)
F18 Minor git log -n 50 ignored phase boundaries 4b546216 Tuned (200 + comment; non-blocking gate is precision-vs-recall)
F19 Nit as unknown as Record casts 4b546216 Fixed (QueryResult<T> parameterized)
F20 Nit Incomplete unicode quote stripping b5998830 Fixed (full U+2018-201F + backtick + double-quote)

Verify-phase gate is intentionally NOT wrapped in exit 1 — it remains non-blocking by design (rationale documented inline alongside the snippet). The duplicate decision parser between bin/lib/decisions.cjs and sdk/src/query/decisions.ts is acknowledged tech debt and out of scope for this PR; it will be unified in a follow-up.

Strong regression tests added for all three Critical findings:

  • F1: a workflow-test asserts the gate snippet uses ${CONTEXT_PATH} (uppercase) and explicitly does NOT use ${context_path} (lowercase).
  • F2: full-suite green post-rebase.
  • F3: a workstream-scoped negative test creates a feat-x workstream with context_coverage_gate: false, leaves the root config enabled, and asserts both handlers skip when called with workstream='feat-x' and run when called without.

@trek-e trek-e merged commit f30da83 into gsd-build:main Apr 23, 2026
10 of 12 checks passed
trek-e added a commit that referenced this pull request Apr 23, 2026
…2622)

* fix(#2613): preserve STATE.md frontmatter on write path (option 2)

`readModifyWriteStateMd` strips frontmatter before invoking the modifier,
so `syncStateFrontmatter` received body-only content and `existingFm`
was always `{}`. The preservation branch never fired, and every mutation
re-derived `status` (to `'unknown'` when body had no `Status:` line) and
`progress.*` (to 0/0 when the shipped milestone's phase directories were
archived), silently overwriting authoritative frontmatter values.

Option 2 — write-side analogue of #2495 READ fix: `buildStateFrontmatter`
reads the current STATE.md frontmatter from disk as a preservation
backstop. Status preserved when derived is `'unknown'` and existing is
non-unknown. Progress preserved when disk scan returns all zeros AND
existing has non-zero counts. Legitimate body-driven status changes and
non-zero disk counts still win.

Milestone/milestone_name already preserved via `getMilestoneInfo`'s
#2495 fix — regression test added to lock that in.

Adds 5 regression tests covering status preservation, progress
preservation, milestone preservation, legitimate status updates, and
disk-scan-wins-when-non-zero.

Closes #2613

* fix(sdk): double-cast WorkflowConfig to Record in loadGateConfig

TypeScript error on main (introduced in #2611) blocks the install-smoke
CI job: `WorkflowConfig` has no string index signature, so the direct
cast to `Record<string, unknown>` fails type-check. The SDK build fails,
`installSdkIfNeeded()` cannot install `gsd-sdk` from source, and the
smoke job reports a false-positive installer regression.

  src/query/check-decision-coverage.ts(236,16): error TS2352:
  Conversion of type 'WorkflowConfig' to type 'Record<string, unknown>'
  may be a mistake because neither type sufficiently overlaps with the
  other.

Apply the double-cast via `unknown` as the compiler suggests. Behavior
is unchanged — this was already a cast.
trek-e added a commit that referenced this pull request Apr 23, 2026
TypeScript error on main (introduced in #2611) blocks `npm run build`
in sdk/, which now runs as part of this PR's tarball build path. Apply
the double-cast via `unknown` as the compiler suggests.

Same fix as #2622; can be dropped if that lands first.
trek-e added a commit that referenced this pull request Apr 23, 2026
…2457)

* fix(sdk): decouple SDK from build-from-source install path, close #2441 and #2453

Ship sdk/dist prebuilt in the tarball and replace the npm-install-g
sub-install with a parent-package bin shim (bin/gsd-sdk.js). npm chmods
bin entries from a packed tarball correctly, eliminating the mode-644
failure (#2453) and the full class of NPM_CONFIG_PREFIX/ignore-scripts/
corepack/air-gapped failure modes that caused #2439 and #2441.

Changes:
- sdk/package.json: prepublishOnly runs `rm -rf dist && tsc && chmod +x
  dist/cli.js` (stale-build guard + execute-bit fix at publish time)
- package.json: add "gsd-sdk": "bin/gsd-sdk.js" bin entry; add sdk/dist
  to files so the prebuilt CLI ships in the tarball
- bin/gsd-sdk.js: new back-compat shim — resolves sdk/dist/cli.js relative
  to the package root and delegates via `node`, so all existing PATH call
  sites (slash commands, agents, hooks) continue to work unchanged (S1 shim)
- bin/install.js: replace installSdkIfNeeded() build-from-source + global-
  install dance with a dist-verify + chmod-in-place guard; delete
  resolveGsdSdk(), detectShellRc(), emitSdkFatal() helpers now unused
- .github/workflows/install-smoke.yml: add smoke-unpacked job that strips
  execute bit from sdk/dist/cli.js before install to reproduce the exact
  #2453 failure mode
- tests/bug-2441-sdk-decouple.test.cjs: new regression tests asserting all
  invariants (no npm install -g from sdk/, shim exists, sdk/dist in files,
  prepublishOnly has rm -rf + chmod)
- tests/bugs-1656-1657.test.cjs: update stale assertions that required
  build-from-source behavior (now asserts new prebuilt-dist invariants)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(release): bump to 1.38.2, wire release.yml to build SDK dist

- Bump version 1.38.1 -> 1.38.2 for the #2441/#2453 fix shipped in 0f6903d.
- Add `build:sdk` script (`cd sdk && npm ci && npm run build`).
- `prepublishOnly` now runs hooks + SDK builds as a safety net.
- release.yml (rc + finalize): build SDK dist before `npm publish` so the
  published tarball always ships fresh `sdk/dist/` (kept gitignored).
- CHANGELOG: document 1.38.2 entry and `--sdk` flag semantics change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: build SDK dist before tests and smoke jobs

sdk/dist/ is gitignored (built fresh at publish time via release.yml),
but both the test suite and install-smoke jobs run `bin/install.js`
or `npm pack` against the checked-out tree where dist doesn't exist yet.

- test.yml: `npm run build:sdk` before `npm run test:coverage`, so tests
  that spawn `bin/install.js` don't hit `installSdkIfNeeded()`'s fatal
  missing-dist check.
- install-smoke.yml (both smoke and smoke-unpacked): build SDK before
  pack/chmod so the published tarball contains dist and the unpacked
  install has a file to strip exec-bit from.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sdk): lift SDK runtime deps to parent so tarball install can resolve them

The SDK's runtime deps (ws, @anthropic-ai/claude-agent-sdk) live in
sdk/package.json, but sdk/node_modules is NOT shipped in the parent
tarball — only sdk/dist, sdk/src, sdk/prompts, and sdk/package.json are.
When a user runs `npm install -g get-shit-done-cc`, npm installs the
parent's node_modules but never runs `npm install` inside the nested
sdk/ directory.

Result: `node sdk/dist/cli.js` fails with ERR_MODULE_NOT_FOUND for 'ws'.
The smoke tarball job caught this; the unpacked variant masked it
because `npm install -g <dir>` copies the entire workspace including
sdk/node_modules (left over from `npm run build:sdk`).

Fix: declare the same deps in the parent package.json so they land in
<pkg>/node_modules, which Node's resolution walks up to from
<pkg>/sdk/dist/cli.js. Keep them declared in sdk/package.json too so
the SDK remains a self-contained package for standalone dev.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lockfile): regenerate package-lock.json cleanly

The previous `npm install` run left the lockfile internally inconsistent
(resolved esbuild@0.27.7 referenced but not fully written), causing
`npm ci` to fail in CI with "Missing from lock file" errors.

Clean regen via rm + npm install fixes all three failed jobs
(test, smoke, smoke-unpacked), which were all hitting the same
`npm ci` sync check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(deps): remove unused esbuild + vitest from root devDependencies

Both were declared but never imported anywhere in the root package
(confirmed via grep of bin/, scripts/, tests/). They lived in sdk/
already, which is the only place they're actually used.

The transitive tree they pulled in (vitest → vite → esbuild 0.28 →
@esbuild/openharmony-arm64) was the root of the CI npm ci failures:
the openharmony platform package's `optional: true` flag was not being
applied correctly by npm 10 on Linux runners, causing EBADPLATFORM.

After removal: 800+ transitive packages → 155. Lockfile regenerated
cleanly. All 4170 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sdk): pretest:coverage builds sdk; tighten shim test assertions

Add "pretest:coverage": "npm run build:sdk" so npm run test:coverage
works in clean checkouts where sdk/dist/ hasn't been built yet.

Tighten the two loose shim assertions in bug-2441-sdk-decouple.test.cjs:
- forwards-to test now asserts path.resolve() is called with the
  'sdk','dist','cli.js' path segments, not just substring presence
- node-invocation test now asserts spawnSync(process.execPath, [...])
  pattern, ruling out matches in comments or the shebang line

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address PR review — pretest:coverage + tighten shim tests

Review feedback from trek-e on PR 2457:

1. pretest:coverage + pretest hooks now run `npm run build:sdk` so
   `npm run test[:coverage]` in a clean checkout produces the required
   sdk/dist/ artifacts before running the installer-dependent tests.
   CI already does this explicitly; local contributors benefit.

2. Shim tests in bug-2441-sdk-decouple.test.cjs tightened from loose
   substring matches (which would pass on comments/shebangs alone) to
   regex assertions on the actual path.resolve call, spawnSync with
   process.execPath, process.argv.slice(2), and process.exit pattern.
   These now provide real regression protection for #2453-class bugs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: correct CHANGELOG entry and add [1.38.2] reference link

Two issues in the 1.38.2 CHANGELOG entry:
- installSdkIfNeeded() was described as deleted but it still exists in
  bin/install.js (repurposed to verify sdk/dist/cli.js and fix execute bit).
  Corrected the description to say 'repurposes' rather than 'deletes'.
- The reference-link block at the bottom of the file was missing a [1.38.2]
  compare URL and [Unreleased] still pointed to v1.37.1...HEAD. Added the
  [1.38.2] link and updated [Unreleased] to compare/v1.38.2...HEAD.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(sdk): double-cast WorkflowConfig to Record for strict tsc build

TypeScript error on main (introduced in #2611) blocks `npm run build`
in sdk/, which now runs as part of this PR's tarball build path. Apply
the double-cast via `unknown` as the compiler suggests.

Same fix as #2622; can be dropped if that lands first.

* test: remove bug-2598 test obsoleted by SDK decoupling

The bug-2598 test guards the Windows CVE-2024-27980 fix in the old
build-from-source path (npm spawnSync with shell:true + formatSpawnFailure
diagnostics). This PR removes that entire code path — installSdkIfNeeded
no longer spawns npm, it just verifies the prebuilt sdk/dist/cli.js
shipped in the tarball.

The test asserts `installSdkIfNeeded.toString()` contains a
formatSpawnFailure helper. After decoupling, no such helper exists
(nothing to format — there's no spawn). Keeping the test would assert
invariants of the rejected architecture.

The original #2598 defect (silent failure of npm spawn on Windows) is
structurally impossible in the shim path: bin/gsd-sdk.js invokes
`node sdk/dist/cli.js` directly via child_process.spawn with an
explicit argv array. No .cmd wrapper, no shell delegation.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Tom Boucher <trekkie@nomorestars.com>
trek-e added a commit to gauravagnihotristla/get-shit-done that referenced this pull request Apr 23, 2026
Same fix as gsd-build#2622 / gsd-build#2457 rebase — TypeScript error on main
(introduced in gsd-build#2611) blocks `npm run build` in sdk/. Drop if gsd-build#2622
lands first.
trek-e added a commit that referenced this pull request Apr 23, 2026
…2631)

* ci: explicit rebase check + fail-fast SDK typecheck in install-smoke

Stale-base regression guard. Root cause: GitHub's `refs/pull/N/merge`
is cached against the PR's recorded merge-base, not current main. When
main advances after a PR is opened, the cache stays stale and CI runs
against the pre-advance tree. PRs hit this whenever a type error lands
on main and gets patched shortly after (e.g. #2611 + #2622) — stale
branches replay the broken intermediate state and report confusing
downstream failures for hours.

Observed failure mode: install-smoke's "Assert gsd-sdk resolves on PATH"
step fires with "installSdkIfNeeded() regression" even when the real
cause is `npm run build` failing in sdk/ due to a TypeScript cast
mismatch already fixed on main.

Fix:
- Explicit `git merge origin/main` step in both `install-smoke.yml` and
  `test.yml`. If the merge conflicts, emit a clear "rebase onto main"
  diagnostic and fail early, rather than let conflicts produce unrelated
  downstream errors.
- Dedicated `npm run build:sdk` typecheck step in install-smoke with a
  remediation hint ("rebase onto main — the error may already be fixed
  on trunk"). Fails fast with the actual tsc output instead of masking
  it behind a PATH assertion.
- Drop the `|| true` on `get-shit-done-cc --claude --local` so installer
  failures surface at the install step with install.js's own error
  message, not at the downstream PATH assertion where the message
  misleadingly blames "shim regression".
- `fetch-depth: 0` on checkout so the merge-base check has history.

* ci: address CodeRabbit — add rebase check to smoke-unpacked, fix fetch flag

Two findings from CodeRabbit's review on #2631:

1. `smoke-unpacked` job was missing the same rebase check applied to the
   `smoke` job. It ran on the cached `refs/pull/N/merge` and could hit
   the same stale-base failure mode the PR was designed to prevent. Added
   the identical rebase-check step.

2. `git fetch origin main --depth=0` is an invalid flag — git rejects it
   with "depth 0 is not a positive number". The intent was "fetch with
   full depth", but the right way is just `git fetch origin main` (no
   --depth). Removed the invalid flag and the `||` fallback that was
   papering over the error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add gates to ensure discuss-phase decisions are properly translated to plans and verified

1 participant