
fix(workflows): skip markdown code blocks in $nodeId.output validation #1478

Merged
Wirasm merged 1 commit into coleam00:dev from
truffle-dev:fix/workflow-validator-skip-fenced-code-1413
Apr 29, 2026

Conversation

@truffle-dev
Contributor

@truffle-dev truffle-dev commented Apr 29, 2026

Summary

  • Problem: validateDagStructure scans node.when, node.prompt, and loop.prompt for $nodeId.output references using a regex that matches anywhere in the string. Prompt bodies in builder-style workflows (notably the bundled archon-workflow-builder) embed fenced and inline code as documentation for the LLM, and those literal $<other-node>.output mentions get false-matched as real cross-node references.
  • Why it matters: archon-workflow-builder fails to load on dev HEAD. bun run cli workflow list reports dag_structure_invalid: Node 'generate-yaml' references unknown node '$other-node.output', and bun run cli workflow run archon-workflow-builder ... fails opaquely with "failed to load". Any user workflow whose prompt: body documents $<name>.output syntax inside fenced or inline code hits the same false positive.
  • What changed: Before scanning, strip ``` triple-backtick fenced blocks and ` single-backtick inline code from prompt and loop.prompt strings. Also wraps one bare $nodeId.output mention in archon-workflow-builder.yaml's Rules section in inline backticks so it reads as documentation, matching the style of the surrounding $nodeId.output mentions on the same lines. Regenerates bundled-defaults.generated.ts.
  • What did NOT change (scope boundary): when: clauses are still scanned unchanged, since they are JS-like expressions and never carry markdown code. Real cross-node $nodeId.output references in prose outside any code marker still validate (the new test "should still reject unknown $nodeId.output refs outside code" covers this). No runtime behavior change. No public API change.
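
To make the "What changed" item concrete, here is a minimal TypeScript sketch of the strip-then-scan idea. It is not the code in loader.ts: the fenced-block pattern, the reference pattern, and the helper findUnknownRefs are assumptions for illustration. Only the inline pattern is quoted from this PR, and stripMarkdownCode is named after the helper the PR adds.

```typescript
// Sketch only. FENCED_CODE, OUTPUT_REF and findUnknownRefs are assumptions,
// not the loader's actual definitions. INLINE_CODE is the pattern quoted in this PR.
const FENCED_CODE = /`{3}[\s\S]*?`{3}/g; // assumed: shortest span between two fence markers
const INLINE_CODE = /`[^`\n]*`/g; // quoted in the Risks and Mitigations section
const OUTPUT_REF_SOURCE = '\\$([A-Za-z_][\\w-]*)\\.output'; // assumed reference shape

function stripMarkdownCode(source: string): string {
  // Fenced blocks are removed first, then inline code in the remaining prose.
  return source.replace(FENCED_CODE, '').replace(INLINE_CODE, '');
}

function findUnknownRefs(prompt: string, knownIds: ReadonlySet<string>): string[] {
  const unknown: string[] = [];
  // A fresh RegExp per call avoids sharing lastIndex state between calls.
  const pattern = new RegExp(OUTPUT_REF_SOURCE, 'g');
  const stripped = stripMarkdownCode(prompt);
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(stripped)) !== null) {
    if (!knownIds.has(match[1])) unknown.push(match[1]);
  }
  return unknown;
}

// The fence marker is built with repeat() so this snippet embeds cleanly in markdown.
const FENCE = '`'.repeat(3);
const knownIds = new Set(['scan-codebase']);
const prompt = [
  'Use $scan-codebase.output as context.', // real ref, known node
  FENCE + 'yaml',
  'prompt: "Summarize $other-node.output"', // documentation inside a fence
  FENCE,
  'Reference upstream nodes as `$nodeId.output`.', // documentation in inline code
  'Then read $missing.output.', // real ref in prose, unknown node
].join('\n');

console.log(findUnknownRefs(prompt, knownIds)); // only "missing" is reported
```

Under these assumptions, the fenced and inline mentions no longer reach the reference scan, while the bare $missing.output in prose is still reported.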

UX Journey

Before

Operator                  Archon CLI                       Workflow Loader
────────                  ──────────                       ───────────────
bun run cli workflow run
archon-workflow-builder ──▶ load .archon/workflows/        validateDagStructure
                            defaults/.../*.yaml ─────────▶ regex /\$<id>.output/
                                                           false-matches inside
                                                           fenced ```yaml block:
                                                           rejects file
                            error("failed to load") ◀───── return "references
                                                           unknown node
sees opaque error ◀──────── propagate                       '$other-node.output'"

After

Operator                  Archon CLI                       Workflow Loader
────────                  ──────────                       ───────────────
bun run cli workflow run
archon-workflow-builder ──▶ load .archon/workflows/        validateDagStructure
                            defaults/.../*.yaml ─────────▶ [strip fenced + inline
                                                            code from prompt
                                                            sources before scan]
                                                           regex /\$<id>.output/
                                                           matches only real refs
                            executes workflow ◀──────────  return null (valid)
sees workflow start ◀────── propagate

Architecture Diagram

Before

┌─────────────────────────────────────────────┐
│ packages/workflows/src/loader.ts            │
│                                             │
│  validateDagStructure(nodes)                │
│    └── for each node:                       │
│         scan node.when ─────┐               │
│         scan node.prompt ───┼─▶ regex       │
│         scan loop.prompt ───┘   /\$<id>     │
│                                  .output/g  │
│                                  matches    │
│                                  literal    │
│                                  fenced     │
│                                  doc text   │
│                                  ─▶ FALSE   │
│                                    POSITIVE │
└─────────────────────────────────────────────┘

After

┌─────────────────────────────────────────────┐
│ packages/workflows/src/loader.ts            │
│                                             │
│  validateDagStructure(nodes)                │
│    └── stripMarkdownCode(s) [+]             │
│    └── for each node:                       │
│         scan node.when ────────────────┐    │
│         scan stripMarkdownCode(prompt)─┼─▶  │
│              [~]                       │    │
│         scan stripMarkdownCode(        │    │
│              loop.prompt) [~] ─────────┘    │
│                                regex matches│
│                                only real    │
│                                refs (fenced │
│                                + inline doc │
│                                stripped)    │
└─────────────────────────────────────────────┘

Connection inventory:

| From | To | Status | Notes |
| --- | --- | --- | --- |
| validateDagStructure | regex outputRefPattern | modified | now operates on stripped prompt sources |
| validateDagStructure | stripMarkdownCode helper | new | inline arrow function defined within validator scope |

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: workflows
  • Module: workflows:loader

Change Metadata

  • Change type: bug
  • Primary scope: workflows

Linked Issue

Validation Evidence (required)

bun run check:bundled    # bundled-defaults.generated.ts up to date (36 commands, 20 workflows)
bun run type-check       # all 10 packages pass
bun run lint             # eslint clean (no errors, no warnings)
bun run format:check     # prettier clean
bun test packages/workflows/src/loader.test.ts   # 117 pass / 0 fail (was 113 before; +4 new tests)
bun --filter '@archon/workflows' test            # 41+18+6+7 across 4 suites, 0 fail

Stash-bisect proof on dev HEAD 7d067738:

Before fix (changes stashed via git stash):

{"level":40,"module":"workflow.loader","filename":"archon-workflow-builder.yaml","structureError":"Node 'generate-yaml' references unknown node '$other-node.output'","msg":"dag_structure_invalid"}
{"level":40,"module":"workflow.discovery","errorCount":1,"errors":[{"filename":"archon-workflow-builder.yaml","error":"Node 'generate-yaml' references unknown node '$other-node.output'","errorType":"validation_error"}],"msg":"app_default_workflow_errors"}
{"level":30,"module":"workflow.discovery","count":32,"errorCount":2,"msg":"workflows_discovery_completed"}

After fix (changes restored via git stash pop):

{"level":30,"module":"workflow.discovery","count":33,"errorCount":0,"msg":"workflows_discovery_completed"}

Workflow count goes from 32 to 33 (the previously rejected archon-workflow-builder now loads), and errorCount goes from 2 to 0.

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Database migration needed? No

Human Verification (required)

  • Verified scenarios:
    • bun run cli workflow list on a clean checkout: before fix shows 2 errors and 32 workflows; after fix shows 0 errors and 33 workflows.
    • 4 new unit tests covering: (a) fenced code in prompt, (b) inline-backtick code in prompt, (c) mixed real-ref-in-prose + fenced-doc still rejects the real bad ref, (d) fenced code in loop.prompt.
  • Edge cases checked:
    • Real $nodeId.output references that DO appear in prose (outside any code) still get validated — confirmed by the existing bad-when-ref and bad-prompt-ref tests, plus the new mixed-ref test.
    • when: clauses are unmodified: the test "should reject a workflow where when: references an unknown node output" still passes.
    • Bash node refs are still excluded from load-time validation — existing bash-unknown-ref test still passes.
    • The yaml hardening on line 181 was the minimum needed: I grepped every $<id>.output occurrence in archon-workflow-builder.yaml and confirmed all the others are either real refs ($scan-codebase, $extract-intent) or already inside fenced/inline code that the new validator stripping handles.
  • What was not verified:
    • End-to-end bun run cli workflow run archon-workflow-builder ... (requires a live AI call).

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: packages/workflows/src/loader.ts only. Bundled defaults regenerated to pick up the one-line yaml hardening.
  • Potential unintended effects: A prompt body whose author intentionally wraps a real $nodeId.output reference in backticks (single or fenced) would no longer be validated at load time. This is a deliberate trade-off: backtick-wrapping signals "render this literally to the LLM," and a real reference rendered literally would never resolve at runtime anyway, so missing it at load time only delays the failure to runtime. Users who want load-time validation simply remove the backticks or use the angle-bracket convention $<nodeId>.output outside fences.
  • Guardrails/monitoring: bun run check:bundled and existing workflows tests catch regressions.

Rollback Plan (required)

  • Fast rollback: git revert <commit-sha>. No state to clean up. The archon-workflow-builder failure recurs immediately.
  • Feature flags: none.
  • Observable failure symptoms: bun run cli workflow list reports the original dag_structure_invalid error.

Risks and Mitigations

  • Risk: Stripping inline backticks could mask a typo'd real ref that was wrapped in backticks for emphasis (e.g. `$missing-node.output` where the author meant a real ref but wrote it as documentation).
    • Mitigation: This is a documentation-style choice the author made; the runtime executor will still surface the missing reference at execution time. The validator's job is to catch structural problems at load time, and intentional doc-style backticks fall outside that scope.
  • Risk: A workflow could have prompt: text containing an unbalanced single backtick (e.g. one stray `) that causes the inline regex to consume more than intended.
    • Mitigation: The inline regex /`[^`\n]*`/g requires a closing backtick on the same line; an unbalanced backtick simply won't match and passes through unchanged.
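
A small sketch of the unbalanced-backtick mitigation, using the inline pattern quoted above. The helper name stripInlineCode and the sample strings are hypothetical and exist only for this illustration.

```typescript
// The pattern is the one quoted in the mitigation above; the rest is illustrative.
const INLINE_CODE = /`[^`\n]*`/g;

function stripInlineCode(text: string): string {
  return text.replace(INLINE_CODE, '');
}

const balanced = 'Write `$other-node.output` literally.';
const unbalanced = 'A stray ` here, then $real-node.output stays visible.';
const acrossLines = 'Opens ` on one line\nand closes ` on the next.';

// A balanced pair on one line is removed together with its content.
console.log(stripInlineCode(balanced)); // "Write  literally."
// A single stray backtick has no closing partner, so nothing is removed.
console.log(stripInlineCode(unbalanced) === unbalanced); // true
// The pattern excludes newlines, so a pair split across lines is left alone.
console.log(stripInlineCode(acrossLines) === acrossLines); // true
```

One limit worth noting: two stray backticks on the same line would still pair up and hide the text between them, so the mitigation covers a single stray backtick per line.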

Notes on relationship to #1402

PR #1402 (open) addresses the same root failure at a different layer — escaping the placeholders inside the YAML using the existing angle-bracket convention. Its body explicitly notes the validator-hardening direction as the suggested follow-up: "have the validator skip scanning inside triple-backtick code fences or String.raw template literals." This PR is that follow-up. The two PRs are complementary: #1402 hardens the bundled YAML, this PR hardens the validator so any future workflow author can put fenced/inline doc examples in a prompt body without breaking discovery. They can land in either order.

Summary by CodeRabbit

  • Bug Fixes

    • Workflow validation now correctly ignores output references when they appear within code blocks and backticks, preventing false validation errors for legitimate code examples in prompt configurations.
  • Tests

    • Added test coverage for output-reference validation behavior in workflow prompts.

The DAG-structure validator scans `node.when`, `node.prompt`, and
`loop.prompt` strings for `$nodeId.output` references. Prompt bodies
in builder-style workflows embed fenced and inline code as
documentation for the LLM (e.g. `archon-workflow-builder` shows how to
author a script node), and those literal `$<other-node>.output`
mentions were being treated as real cross-node references. Result:
`archon-workflow-builder` (a bundled default) failed to load, and
`bun run cli workflow run archon-workflow-builder ...` reported
"references unknown node '$other-node.output'".

Strip triple-backtick fenced blocks and single-backtick inline code
from prompt and loop.prompt before scanning. `when:` clauses are
JS-like expressions and never carry markdown code, so they pass through
unchanged. Real cross-node refs in prose continue to validate.

Also wraps one bare `$nodeId.output` mention in
`archon-workflow-builder.yaml` Rules section in inline backticks so it
reads as documentation alongside the surrounding `$nodeId.output`
mentions that already use this style.

Closes coleam00#1413
@coderabbitai

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

This PR addresses false-positive validation errors when loading archon-workflow-builder by modifying the workflow validator to skip fenced code blocks and inline backtick code when scanning prompts for $nodeId.output references. Updates include validator logic, tests, and bundled workflow prompt content.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Validator Logic: packages/workflows/src/loader.ts | Added preprocessing step that strips triple-backtick fenced blocks and single-backtick inline code from prompt and loop.prompt strings before scanning for $nodeId.output references. Validation for when: expressions unchanged. |
| Validator Test Coverage: packages/workflows/src/loader.test.ts | Added positive tests confirming $nodeId.output inside fenced code blocks and inline backticks are not flagged as unknown references. Added negative test ensuring validation remains strict for real prose usage outside code markers. |
| Bundled Defaults & YAML Prompt: packages/workflows/src/defaults/bundled-defaults.generated.ts, .archon/workflows/defaults/archon-workflow-builder.yaml | Updated archon-workflow-builder prompt text to render $nodeId.output in backticks within documentation examples. Bundled defaults regenerated to reflect workflow prompt changes. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 The validator hops with newfound grace,
Backticks hide code in their safe place,
No more false alarms in fenced-block prose—
The builder loads where documentation grows!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: fixing false-positive validation errors caused by markdown code blocks in prompt strings when scanning for $nodeId.output references. |
| Description check | ✅ Passed | The PR description comprehensively follows the template structure, including Problem/Why/What Changed/Scope boundaries, detailed UX journeys and architecture diagrams, full validation evidence with test results, security impact assessment, compatibility confirmation, human verification details, side effects analysis, and rollback procedures. |
| Linked Issues check | ✅ Passed | The PR directly addresses the core objective from #1413: implementing validator logic to skip markdown code blocks (both fenced and inline) when scanning prompt and loop.prompt for $nodeId.output references, allowing archon-workflow-builder to load correctly. All code changes align with the suggested fix direction, including the helper function to strip backtick-wrapped code before regex scanning. |
| Out of Scope Changes check | ✅ Passed | All changes are tightly scoped to the stated objective: validator preprocessing in loader.ts, test coverage for the new behavior, and one documentation-style backtick addition in archon-workflow-builder.yaml with corresponding bundled-defaults regeneration. No unrelated features, refactorings, or subsystem modifications introduced. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
packages/workflows/src/loader.test.ts (1)

1863-1890: Add one more test: inline backticks inside loop.prompt.

You covered fenced blocks in loop.prompt; adding inline-backtick coverage there would complete the matrix and guard future regressions.

Suggested test addition
+    it('should ignore $nodeId.output inside inline backtick code in loop.prompt', async () => {
+      const workflowDir = join(testDir, '.archon', 'workflows');
+      await mkdir(workflowDir, { recursive: true });
+
+      await writeFile(
+        join(workflowDir, 'loop-inline.yaml'),
+        `
+name: loop-inline
+description: Loop with inline code mention
+nodes:
+  - id: my-loop
+    loop:
+      prompt: |
+        Use \`$other-node.output\` as an example placeholder.
+      until: DONE
+      max_iterations: 3
+`
+      );
+
+      const result = await discoverWorkflows(testDir, { loadDefaults: false });
+      expect(result.errors).toHaveLength(0);
+      expect(result.workflows).toHaveLength(1);
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/loader.test.ts` around lines 1863 - 1890, Add a new
test case mirroring the fenced-code case but for inline backticks: create an
it(...) titled "should ignore $nodeId.output inside inline backticks in
loop.prompt" that writes a workflow YAML (e.g., loop-inline-backtick.yaml) into
the same workflowDir with a loop.prompt containing inline-backtick usage like `
$other-node.output ` and then calls discoverWorkflows(testDir, { loadDefaults:
false }) asserting result.errors.length === 0 and result.workflows.length === 1;
follow the same setup/teardown pattern as the existing fenced-code test so the
loader logic (discoverWorkflows and loop.prompt handling) is exercised for
inline backticks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 72a089a5-c99c-456b-bd13-f194622a9924

📥 Commits

Reviewing files that changed from the base of the PR and between 7d06773 and b126ad8.

📒 Files selected for processing (4)
  • .archon/workflows/defaults/archon-workflow-builder.yaml
  • packages/workflows/src/defaults/bundled-defaults.generated.ts
  • packages/workflows/src/loader.test.ts
  • packages/workflows/src/loader.ts

@Wirasm
Collaborator

Wirasm commented Apr 29, 2026

Review Summary

Verdict: ready-to-merge

Your fix for the workflow DAG validator false-positive bug is solid. The stripMarkdownCode approach correctly strips fenced code blocks first, then inline code from remaining prose — this ordering is right because fenced blocks are the outer container. The 4 new unit tests provide good coverage, and the implementation follows CLAUDE.md conventions throughout.
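
The ordering point can be illustrated with a short sketch. The fenced pattern below is an assumption (the PR only quotes the inline one), and fencedFirst and inlineFirst are hypothetical names used to compare the two orders; neither is claimed to match the code in loader.ts.

```typescript
const FENCED_CODE = /`{3}[\s\S]*?`{3}/g; // assumed fenced-block pattern
const INLINE_CODE = /`[^`\n]*`/g; // inline pattern quoted in the PR description

// Order described in the review: fenced blocks first, then inline code.
const fencedFirst = (s: string): string =>
  s.replace(FENCED_CODE, '').replace(INLINE_CODE, '');

// Reversed order, shown only for comparison.
const inlineFirst = (s: string): string =>
  s.replace(INLINE_CODE, '').replace(FENCED_CODE, '');

// The fence marker is built with repeat() so this snippet embeds cleanly in markdown.
const FENCE = '`'.repeat(3);
const prompt = [FENCE + 'yaml', 'prompt: "Use $other-node.output"', FENCE].join('\n');

console.log(fencedFirst(prompt).includes('$other-node.output')); // false
console.log(inlineFirst(prompt).includes('$other-node.output')); // true
```

With the reversed order, the inline pattern consumes two of the three backticks in each fence marker as an empty inline span. The fenced pattern then finds no intact marker, and the documentation text survives to be scanned.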

Blocking issues

None.

Suggested fixes

None.

Minor / nice-to-have

  • packages/workflows/src/loader.ts:157: The backtick-stripping in fenced blocks is intentional (content is already stripped of its code-fence context), but consider adding a comment to document this design decision for future readers:

    // Note: fenced block content may lose inner backticks; this is intentional
    // since the content is already stripped of its code-fence context.

    This is purely optional — the current behavior is correct.

  • packages/workflows/src/loader.ts:155: stripMarkdownCode is private and well-covered by integration tests. If it grows in the future, consider moving it to utils/ with an export for unit-test isolation. No action needed now.

Compliments

  • Clean, focused scope — 4 files, one clear purpose
  • Thorough test coverage with 4 new unit tests
  • Exemplary PR template with architecture diagrams, validation evidence, and rollback plan
  • Proper TypeScript conventions: explicit return types, no any, correct type narrowing

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review.

@Wirasm Wirasm merged commit 7e4ea40 into coleam00:dev Apr 29, 2026
1 check passed
@Wirasm Wirasm mentioned this pull request Apr 29, 2026
@truffle-dev truffle-dev deleted the fix/workflow-validator-skip-fenced-code-1413 branch April 30, 2026 22:24

2 participants