fix(workflows): skip markdown code blocks in $nodeId.output validation#1478
Conversation
The DAG-structure validator scans `node.when`, `node.prompt`, and `loop.prompt` strings for `$nodeId.output` references. Prompt bodies in builder-style workflows embed fenced and inline code as documentation for the LLM (e.g. `archon-workflow-builder` shows how to author a script node), and those literal `$<other-node>.output` mentions were being treated as real cross-node references. Result: `archon-workflow-builder` (a bundled default) failed to load, and `bun run cli workflow run archon-workflow-builder ...` reported "references unknown node '$other-node.output'". Strip triple-backtick fenced blocks and single-backtick inline code from prompt and loop.prompt before scanning. `when:` clauses are JS-like expressions and never carry markdown code, so they pass through unchanged. Real cross-node refs in prose continue to validate. Also wraps one bare `$nodeId.output` mention in `archon-workflow-builder.yaml` Rules section in inline backticks so it reads as documentation alongside the surrounding `$nodeId.output` mentions that already use this style. Closes coleam00#1413
📝 WalkthroughWalkthroughThis PR addresses false-positive validation errors when loading Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/workflows/src/loader.test.ts (1)
1863-1890: Add one more test: inline backticks insideloop.prompt.You covered fenced blocks in
loop.prompt; adding inline-backtick coverage there would complete the matrix and guard future regressions.Suggested test addition
+ it('should ignore $nodeId.output inside inline backtick code in loop.prompt', async () => { + const workflowDir = join(testDir, '.archon', 'workflows'); + await mkdir(workflowDir, { recursive: true }); + + await writeFile( + join(workflowDir, 'loop-inline.yaml'), + ` +name: loop-inline +description: Loop with inline code mention +nodes: + - id: my-loop + loop: + prompt: | + Use \`$other-node.output\` as an example placeholder. + until: DONE + max_iterations: 3 +` + ); + + const result = await discoverWorkflows(testDir, { loadDefaults: false }); + expect(result.errors).toHaveLength(0); + expect(result.workflows).toHaveLength(1); + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/loader.test.ts` around lines 1863 - 1890, Add a new test case mirroring the fenced-code case but for inline backticks: create an it(...) titled "should ignore $nodeId.output inside inline backticks in loop.prompt" that writes a workflow YAML (e.g., loop-inline-backtick.yaml) into the same workflowDir with a loop.prompt containing inline-backtick usage like ` $other-node.output ` and then calls discoverWorkflows(testDir, { loadDefaults: false }) asserting result.errors.length === 0 and result.workflows.length === 1; follow the same setup/teardown pattern as the existing fenced-code test so the loader logic (discoverWorkflows and loop.prompt handling) is exercised for inline backticks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/workflows/src/loader.test.ts`:
- Around line 1863-1890: Add a new test case mirroring the fenced-code case but
for inline backticks: create an it(...) titled "should ignore $nodeId.output
inside inline backticks in loop.prompt" that writes a workflow YAML (e.g.,
loop-inline-backtick.yaml) into the same workflowDir with a loop.prompt
containing inline-backtick usage like ` $other-node.output ` and then calls
discoverWorkflows(testDir, { loadDefaults: false }) asserting
result.errors.length === 0 and result.workflows.length === 1; follow the same
setup/teardown pattern as the existing fenced-code test so the loader logic
(discoverWorkflows and loop.prompt handling) is exercised for inline backticks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 72a089a5-c99c-456b-bd13-f194622a9924
📒 Files selected for processing (4)
.archon/workflows/defaults/archon-workflow-builder.yamlpackages/workflows/src/defaults/bundled-defaults.generated.tspackages/workflows/src/loader.test.tspackages/workflows/src/loader.ts
Review SummaryVerdict: ready-to-merge Your fix for the workflow DAG validator false-positive bug is solid. The Blocking issuesNone. Suggested fixesNone. Minor / nice-to-have
Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review. |
Summary
validateDagStructurescansnode.when,node.prompt, andloop.promptfor$nodeId.outputreferences using a regex that matches anywhere in the string. Prompt bodies in builder-style workflows (notably the bundledarchon-workflow-builder) embed fenced and inline code as documentation for the LLM, and those literal$<other-node>.outputmentions get false-matched as real cross-node references.archon-workflow-builderfails to load ondevHEAD.bun run cli workflow listreportsdag_structure_invalid: Node 'generate-yaml' references unknown node '$other-node.output', andbun run cli workflow run archon-workflow-builder ...fails opaquely with "failed to load". Any user workflow whoseprompt:body documents$<name>.outputsyntax inside fenced or inline code hits the same false positive.```triple-backtick fenced blocks and`single-backtick inline code from prompt andloop.promptstrings. Also wraps one bare$nodeId.outputmention inarchon-workflow-builder.yaml's Rules section in inline backticks so it reads as documentation, matching the style of the surrounding$nodeId.outputmentions on the same lines. Regeneratesbundled-defaults.generated.ts.when:clauses still scan unchanged — they are JS-like expressions and never carry markdown code. Real cross-node$nodeId.outputreferences in prose outside any code marker still validate (the new testshould still reject unknown $nodeId.output refs outside codecovers this). No runtime behavior change. No public API change.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
Label Snapshot
risk: lowsize: Sworkflowsworkflows:loaderChange Metadata
bugworkflowsLinked Issue
Validation Evidence (required)
Stash-bisect proof on
devHEAD7d067738:Before fix (
git stashapplied):After fix (
git stash pop):Workflow count goes from 32 to 33 (the previously-rejected
archon-workflow-buildernow loads), errorCount goes from 2 to 0.Security Impact (required)
Compatibility / Migration
Human Verification (required)
bun run cli workflow liston a clean checkout: before fix shows 2 errors and 32 workflows; after fix shows 0 errors and 33 workflows.loop.prompt.$nodeId.outputreferences that DO appear in prose (outside any code) still get validated — confirmed by the existingbad-when-refandbad-prompt-reftests, plus the new mixed-ref test.when:clauses are unmodified — the testshould reject a workflow where when: references an unknown node outputstill passes.bash-unknown-reftest still passes.$<id>.outputoccurrence inarchon-workflow-builder.yamland confirmed all the others are either real refs ($scan-codebase,$extract-intent) or already inside fenced/inline code that the new validator stripping handles.bun run cli workflow run archon-workflow-builder ...(requires a live AI call).Side Effects / Blast Radius (required)
packages/workflows/src/loader.tsonly. Bundled defaults regenerated to pick up the one-line yaml hardening.$nodeId.outputreference in backticks (single or fenced) would no longer be validated at load time. This is a deliberate trade-off: backtick-wrapping signals "render this literally to the LLM," and a real reference rendered literally would never resolve at runtime anyway, so missing it at load time only delays the failure to runtime. Users who want load-time validation simply remove the backticks or use the angle-bracket convention$<nodeId>.outputoutside fences.bun run check:bundledand existing workflows tests catch regressions.Rollback Plan (required)
git revert <commit-sha>. No state to clean up. Thearchon-workflow-builderfailure recurs immediately.bun run cli workflow listreports the originaldag_structure_invaliderror.Risks and Mitigations
`$missing-node.output`where the author meant a real ref but wrote it as documentation).prompt:text containing an unbalanced single backtick (e.g. one stray`) that causes the inline regex to consume more than intended./`[^`\n]*`/grequires a closing backtick on the same line; an unbalanced backtick simply won't match and passes through unchanged.Notes on relationship to #1402
PR #1402 (open) addresses the same root failure at a different layer — escaping the placeholders inside the YAML using the existing angle-bracket convention. Its body explicitly notes the validator-hardening direction as the suggested follow-up: "have the validator skip scanning inside triple-backtick code fences or String.raw template literals." This PR is that follow-up. The two PRs are complementary: #1402 hardens the bundled YAML, this PR hardens the validator so any future workflow author can put fenced/inline doc examples in a prompt body without breaking discovery. They can land in either order.
Summary by CodeRabbit
Bug Fixes
Tests