FE-705: Agent CLI capabilities and workflow documentation#132

Open
lunelson wants to merge 41 commits into main from ln/fe-705-cli-capabilities-with-docs-and-skills-update

Conversation

@lunelson
Contributor

@lunelson lunelson commented May 13, 2026

Summary

Adds the FE-705 agent-facing CLI capability substrate and probe harness, then reconciles the surrounding design/planning docs and ln-* skill workflow so the branch is reviewable as one implementation + methodology update.

What changed

  • Added server-side agent capability primitives for JSONL lifecycle, chat readiness/read access, turn response handling, and capability registration.
  • Added scripts/agent-probes/ with process-backed probe runner coverage, fixture-candidate validation, packaged smoke helpers, and model-backed LLM user policy seams.
  • Consolidated the conversational workspace design docs: archived the broad intent-spec synthesis, clarified multi-chat / side-chat / changeset-ledger authority, added strategy docs, and refreshed the design index.
  • Restructured memory/PLAN.md and added memory/SPEC_RESTRUCTURE.md to reduce planning conflicts and clarify the new frontier/sequencing model.
  • Expanded and tightened the local skill workflow: added d3k, ln-diagnose, and ln-prototype; refined ln-build, ln-scope, ln-review, ln-sync, and planning-pr; added the pre-release change posture to AGENTS.md.
  • Moved dev-workflow evolution rationale under docs/design/ln-skills/ so skill design notes are separate from executable skills and product specs.

Validation

  • npm run fix passes with the existing unrelated unused-variable warnings in interview-view tests.

lunelson added 30 commits May 13, 2026 13:17
Contributor Author

lunelson commented May 13, 2026

@lunelson lunelson changed the title from "initial sync plus first status-/semantic-reconciliation of PLAN vs PLAN from fe-705" to "FE-705: Agent CLI capabilities and workflow documentation" May 13, 2026
@lunelson lunelson marked this pull request as ready for review May 13, 2026 12:59
@lunelson lunelson self-assigned this May 13, 2026
@cursor

cursor Bot commented May 13, 2026

PR Summary

Low Risk
Documentation-only changes that adjust agent workflow guidance; low risk aside from potentially shifting team process if the updated planning vocabulary or templates are adopted incorrectly.

Overview
Adds new agent skill docs for debugging and iteration workflows (d3k command guide, plus new ln-diagnose and ln-prototype skills).

Refactors the ln-* skill guidance to standardize frontier item vs slice vocabulary, with an updated ln-plan template that introduces Context, Sequencing by stable frontier id, and Frontier Definitions, and corresponding updates across ln-scope, ln-build, ln-oracles, ln-review, ln-spec, ln-spike, ln-sync, and AGENTS.md.

Reconciles surrounding design/archive docs (e.g., PLAN_HISTORY.md additions, design-doc authority/status notes, link fixes, and an audited/trimmed DEFERRED_RECONCILIATIONS.md) and narrows planning-pr into an advisory skill that recommends (not auto-creates) separate planning PRs only when explicitly needed.

Reviewed by Cursor Bugbot for commit 1bcd5a7. Bugbot is set up for automated code reviews on this repo.

@lunelson lunelson requested a review from kostandinang May 13, 2026 12:59
@augmentcode

augmentcode Bot commented May 13, 2026

This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it.

kostandinang added a commit that referenced this pull request May 13, 2026
memory/PLAN.md and docs/archive/PLAN_HISTORY.md changes from this branch
will land as a planning-only PR off main after Lu's #132 + #133 stack
merges, per the planning-pr convention (known merge conflicts on planning
docs + frontier-definitions migration triggers separate-PR recommendation).

This keeps PR #134 code-only and conflict-free with Lu's stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kostandinang
Contributor

Code review

Found 2 issues:

  1. Temp workspace directory leaked in runProcessBackedProbe. The finally block only calls spawned.endStdin() and never removes the mkdtempSync(...) workspace at workspaceCwd. Tests manually rmSync(result.workspaceCwd), but production callers (e.g. packaged-smoke.ts) don't — every probe run leaves a /tmp/brunch-probe-workspace-* directory behind, accumulating without bound across CI/smoke runs.

      turnBudget,
    }: ProcessBackedProbeOptions): Promise<ProbeRunResult> {
      const workspaceCwd = mkdtempSync(join(tmpdir(), 'brunch-probe-workspace-'));
      const spawned = spawnProcess({ cwd: workspaceCwd, command, args, env });
      const transport = createProcessJsonlTransport(spawned);
      try {
        const result = await runScriptedProbe({
          transport,
          scenario,
          scriptedAnswers,
          responsePolicy,
          simulatedUserEvents,
          turnBudget,
        });
        result.workspaceCwd = workspaceCwd;
        if (preserveWorkspaceState) {
          result.preservedWorkspaceStatePath = copyWorkspaceState({ workspaceCwd, outputDir });
        }
        writeProbeArtifacts(outputDir, result);
        return result;
      } finally {
        spawned.endStdin();
      }
    }

  2. chat.ensureReady capability is registered with authority: 'runtime_replay', but the slice "Generate agent chat readiness" in this PR changes the handler to invoke streamInterviewer (a live LLM call). runtime_replay is documented as "writes replay/status artifacts tied to an existing durable unit", i.e. deterministic and replay-safe. An LLM call is neither. Downstream adapters or audits that key off authority to decide whether a call is safe to retry/replay will draw the wrong conclusion. Either reclassify (likely commit_truth) or document why runtime_replay still holds.

      },
      {
        id: 'chat.ensureReady',
        authority: 'runtime_replay',
        summary: 'Ensure an explicit chat has an answerable generated frontier.',
        inputSchema: 'chat.ensureReady.input.v1',
        outputSchema: 'chat.ensureReady.output.v1',
        handler: null,
      },
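
A mismatch like the one in issue 2 could in principle be caught mechanically by a registry check that compares each capability's declared authority with what its handler does. The sketch below is hypothetical: the `CapabilityEntry` shape, the `callsLiveModel` flag, and `findAuthorityMismatches` are illustrative names that do not appear in this PR. Only the `runtime_replay` and `commit_truth` labels and the `chat.ensureReady` id are taken from the review above.

```typescript
// Hypothetical registry shape for illustration; not the PR's actual types.
type Authority = 'runtime_replay' | 'commit_truth';

interface CapabilityEntry {
  id: string;
  authority: Authority;
  // Assumed flag: true when the handler performs a live, non-deterministic LLM call.
  callsLiveModel: boolean;
}

// Returns the ids of capabilities that declare replay-safe authority
// while their handler is marked as calling a live model.
function findAuthorityMismatches(registry: CapabilityEntry[]): string[] {
  return registry
    .filter((entry) => entry.authority === 'runtime_replay' && entry.callsLiveModel)
    .map((entry) => entry.id);
}

// Example data: the second entry is invented purely as a contrast case.
const exampleRegistry: CapabilityEntry[] = [
  { id: 'chat.ensureReady', authority: 'runtime_replay', callsLiveModel: true },
  { id: 'example.replayOnly', authority: 'runtime_replay', callsLiveModel: false },
];

const mismatches = findAuthorityMismatches(exampleRegistry);
console.log(mismatches);
```

Whether such a flag is declared by hand or derived some other way is a design choice the review leaves open; the point is only that the authority label and the handler behaviour are checked against each other.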

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@lunelson
Contributor Author

augment review

@augmentcode

augmentcode Bot commented May 14, 2026

🤖 Augment PR Summary

Summary: This PR adds agent-facing CLI capability substrate and probe harness for the FE-705 frontier, plus reconciles surrounding design/planning docs and ln-* skill workflows.

Code changes:

  • Added server-side agent capability primitives: JSONL session protocol (agent-jsonl.ts), capability dispatcher (capabilities.ts), and expanded capability registry with executable contracts for spec/chat/turn operations
  • Added scripts/agent-probes/ with process-backed probe runner, fixture-candidate validation, packaged smoke helpers, and model-backed LLM-as-user policy seams
  • Updated CLI (cli.ts) to support brunch agent subcommand for JSONL stdin/stdout sessions
  • Extended build/lint/test tooling to include scripts/ directory

Documentation changes:

  • Added new agent skill docs: d3k, ln-diagnose, ln-prototype
  • Standardized frontier-item vs slice vocabulary across all ln-* skills and planning docs
  • Restructured memory/PLAN.md into conflict-resistant shape with Sequencing/Frontier Definitions sections
  • Consolidated design docs: archived broad synthesis, added strategy/runtime-cluster docs, refreshed design index
  • Moved dev-workflow evolution docs under docs/design/ln-skills/

Technical Notes: The JSONL capability adapter drives the real Brunch interview flow through Brunch-owned contracts; the probe runner exercises this surface only through a JSONL client, maintaining the import boundary that probe code must not import DB/product handlers directly.
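
For readers unfamiliar with JSONL framing, a client on the other side of a stdin/stdout session generally has to buffer incoming chunks and emit one parsed message per newline-terminated line. The following is a minimal generic sketch of that framing, assuming one JSON value per line; `createJsonlDecoder` and the example payloads are hypothetical and say nothing about the actual message schema in `agent-jsonl.ts`.

```typescript
// Generic JSONL framing sketch; names and payloads are illustrative only.
function createJsonlDecoder(onMessage: (message: unknown) => void): (chunk: string) => void {
  let buffer = '';
  return (chunk: string): void => {
    buffer += chunk;
    let newline = buffer.indexOf('\n');
    while (newline !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      // Skip blank lines; parse each complete line as one JSON value.
      if (line.length > 0) onMessage(JSON.parse(line));
      newline = buffer.indexOf('\n');
    }
  };
}

// Usage: the second message arrives split across two chunks.
const received: unknown[] = [];
const feed = createJsonlDecoder((message) => received.push(message));
feed('{"id":1}\n{"id"');
feed(':2}\n');
console.log(JSON.stringify(received));
```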

🤖 Was this summary useful? React with 👍 or 👎


@augmentcode augmentcode Bot left a comment

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

throw new CapabilityDispatchError(`Specification ${chat.specification_id} not found`, 'handler_failed');
}

const currentPhase = state.workflow.phases.grounding.status === 'closed' ? 'design' : 'grounding';

@augmentcode augmentcode Bot May 14, 2026

This currentPhase fallback only considers grounding vs design, but the workflow has four phases (grounding, design, requirements, criteria). If grounding is closed and the spec is in requirements or criteria, the idle-no-frontier phase will still report design. Consider reusing the existing getCurrentWorkflowPhase from src/shared/phase-close.ts or iterating workflowPhaseOrder to find the first unclosed phase.

Severity: medium
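
The second option in this suggestion, iterating the phase order to find the first unclosed phase, can be sketched as follows. This is an illustration under assumptions, not the existing `getCurrentWorkflowPhase` helper: the `phaseOrder` constant, the `firstUnclosedPhase` name, the `'open'` status value, and the fallback to the last phase when everything is closed are all hypothetical. Only the four phase names and the `'closed'` status come from the comment and snippet above.

```typescript
type WorkflowPhase = 'grounding' | 'design' | 'requirements' | 'criteria';

// Assumed ordering, following the order the phases are listed in the review comment.
const phaseOrder: WorkflowPhase[] = ['grounding', 'design', 'requirements', 'criteria'];

// Returns the first phase whose status is not 'closed'.
// Assumption: when every phase is closed, report the last phase.
function firstUnclosedPhase(phases: Record<WorkflowPhase, { status: string }>): WorkflowPhase {
  for (const phase of phaseOrder) {
    if (phases[phase].status !== 'closed') return phase;
  }
  return phaseOrder[phaseOrder.length - 1];
}

// Example: grounding and design are closed, so the current phase is requirements.
// 'open' is a placeholder for any non-closed status.
const currentPhaseExample = firstUnclosedPhase({
  grounding: { status: 'closed' },
  design: { status: 'closed' },
  requirements: { status: 'open' },
  criteria: { status: 'open' },
});
console.log(currentPhaseExample);
```

With the two-way ternary quoted above, the same input would report `design`.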

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

simulatedUserEvents,
turnBudget,
}: ProcessBackedProbeOptions): Promise<ProbeRunResult> {
const workspaceCwd = mkdtempSync(join(tmpdir(), 'brunch-probe-workspace-'));

@augmentcode augmentcode Bot May 14, 2026

The temp workspace directory created by mkdtempSync is never cleaned up in runProcessBackedProbe. The finally block only calls spawned.endStdin() but never removes workspaceCwd. Tests manually rmSync the returned result.workspaceCwd, but production callers like packaged-smoke.ts do not, leaking a /tmp/brunch-probe-workspace-* directory on every probe run.

Severity: medium

Other Locations
  • scripts/agent-probes/packaged-smoke.ts:36
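
One possible shape for the fix is to wrap the temp directory's lifetime in a helper that removes it in `finally` unless the caller opts out. This is a minimal sketch, not the PR's code: `withTempWorkspace` and its `keep` parameter are hypothetical names, and the sketch uses only `mkdtempSync`, `rmSync`, `existsSync`, and `writeFileSync` from `node:fs`.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: creates a temp workspace, runs the callback in it,
// and removes the directory afterwards unless `keep` is true.
async function withTempWorkspace<T>(
  prefix: string,
  run: (cwd: string) => Promise<T>,
  keep = false,
): Promise<T> {
  const cwd = mkdtempSync(join(tmpdir(), prefix));
  try {
    return await run(cwd);
  } finally {
    // Runs on success and on failure, so a thrown error does not leak the directory.
    if (!keep) rmSync(cwd, { recursive: true, force: true });
  }
}

// Usage: remember the path so it can be checked after the run completes.
let seenWorkspace = '';
const workspaceRun = withTempWorkspace('brunch-probe-workspace-', async (cwd) => {
  seenWorkspace = cwd;
  writeFileSync(join(cwd, 'probe.log'), 'ok');
  return existsSync(cwd);
});
```

Because the existing function returns `result.workspaceCwd` and tests remove it themselves, a change along these lines would also need to decide whether callers still receive that path, for example by cleaning up only when the workspace is not being preserved.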

if (!turn) {
throw new CapabilityDispatchError(`Turn ${input.turnId} not found`, 'handler_failed');
}
if (turn.chat_id !== chat.id || turn.specification_id !== chat.specification_id) {

@augmentcode augmentcode Bot May 14, 2026

turn.chat_id may be null for pre-multi-chat turns (the column is nullable per the schema). When turn.chat_id is null, the condition turn.chat_id !== chat.id is always true (since null !== number), causing the guard to reject legitimate turns that belong to the spec but were created before chat association was backfilled. Consider also checking turn.specification_id === chat.specification_id as a sufficient ownership proof when chat_id is null.

Severity: medium
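
A null-tolerant version of the guard might look like the sketch below. The `TurnRow` and `ChatRow` interfaces and the `turnBelongsToChat` name are hypothetical stand-ins for the real row types; only the `chat_id`, `specification_id`, and `id` field names come from the snippet above. The sketch assumes that a matching specification is sufficient proof of ownership when `chat_id` is null, which may be too permissive if a specification can hold several chats.

```typescript
// Hypothetical row shapes for illustration; only the field names come from the PR snippet.
interface TurnRow {
  id: number;
  chat_id: number | null; // nullable for turns created before chat association
  specification_id: number;
}

interface ChatRow {
  id: number;
  specification_id: number;
}

// A turn belongs to the chat when the specification matches and the chat id
// either matches or was never set (legacy turn).
function turnBelongsToChat(turn: TurnRow, chat: ChatRow): boolean {
  if (turn.specification_id !== chat.specification_id) return false;
  return turn.chat_id === null || turn.chat_id === chat.id;
}

const exampleChat: ChatRow = { id: 7, specification_id: 3 };
const legacyTurn: TurnRow = { id: 1, chat_id: null, specification_id: 3 };
const foreignTurn: TurnRow = { id: 2, chat_id: 8, specification_id: 3 };

console.log(turnBelongsToChat(legacyTurn, exampleChat));
console.log(turnBelongsToChat(foreignTurn, exampleChat));
```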

2 participants