feat(workflows): add archon-doc-agent bundled workflow #1287
seanrobertwright wants to merge 3 commits into coleam00:dev
Adds a documentation-audit workflow that scopes files (diff vs $BASE_BRANCH, any git ref, or full repo), runs a heuristic scan for TODO/FIXME markers and export-surface candidates, classifies findings via Claude, edits source files to add missing doc comments and fix stale ones, scaffolds Markdown docs under /docs, and writes a run report with items flagged for human review.
Designed to be safe-by-default: uncertain findings are skipped rather than auto-fixed, scope is honored per fix-source rules, and docs changes are left in the worktree diff for human review before commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Adds a new Archon workflow
Sequence Diagram(s)
sequenceDiagram
participant User as User / Trigger
participant Runner as CI Workflow Runner
participant Git as Git Repository
participant Heuristic as Heuristic Scanner
participant AI as AI Audit Agent
participant Artifacts as Artifacts (artifacts/*)
participant Docs as Docs (docs/)
User->>Runner: invoke workflow (diff | ref | all)
Runner->>Git: compute scope (diff or full)
Git-->>Runner: artifacts/scope.txt
Runner->>Heuristic: scan scoped files
Heuristic-->>Artifacts: todos.txt, exports.txt
Runner->>AI: run AI-only audit (reads scope + heuristics)
AI-->>Artifacts: findings.json
Runner->>AI: doc-only fix step (reads findings, scope)
AI-->>Artifacts: fix-source-log.md (applied/skipped)
Runner->>Docs: generate/update docs/*.md (idempotent)
Docs-->>Artifacts: updated docs files
Runner->>Git: run git diff --stat and aggregate
Runner-->>Artifacts: report.md
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.archon/workflows/defaults/archon-doc-agent.yaml:
- Around line 257-293: Update the workflow to use the executor-provided
$DOCS_DIR variable instead of the hardcoded "docs/" path wherever the steps
reference creating, checking, or reporting files (notably the step "Check
whether `docs/` exists" and the list of files `docs/index.md`,
`docs/architecture.md`, `docs/api.md` and subsequent reporting lines around the
block that appears also at 316-340); change all literal "docs/" occurrences to
"$DOCS_DIR/" (or use "${DOCS_DIR}" where needed) and update the output/reporting
text that prints "docs/" to reference $DOCS_DIR so file creation, existence
checks, and git diff use the configured docs directory.
- Around line 132-177: The audit node (id: audit) currently includes Write in
denied_tools which prevents creation of $ARTIFACTS_DIR/findings.json needed by
downstream nodes (fix_source, generate_docs, report); remove Write from the
denied_tools array for the audit node so the agent can write the findings file,
or alternatively replace the broad denied_tools restriction with a path-scoped
policy that only disallows edits to source files while allowing writes to
$ARTIFACTS_DIR/* (ensure the audit node prompt still forbids modifying source
files verbally).
- Around line 24-35: Replace the unsafe raw interpolation ARG="$ARGUMENTS" in
the scope node with a safe here-document capture using a single-quoted delimiter
so the substituted $ARGUMENTS is treated as literal text and cannot inject shell
syntax; update the assignment that sets ARG (the ARG variable inside the scope
node) to read from a single-quoted heredoc (e.g., using a <<'DELIM' / DELIM
pattern) so the executor's substitution becomes the heredoc body rather than
executable shell code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7f715d74-ce70-41f6-a38a-3e91bf257f49
📒 Files selected for processing (1)
.archon/workflows/defaults/archon-doc-agent.yaml
- id: audit
  depends_on: [heuristic_scan]
  context: fresh
  denied_tools: [Write, Edit]
  prompt: |
    You are a documentation auditor. Your job is to classify documentation
    problems and write a structured findings report. You MUST NOT modify
    source files in this step — analysis only.

    ## Inputs

    Scope summary: $scope.output
    Heuristic scan summary: $heuristic_scan.output

    Full inputs on disk:
    - $ARTIFACTS_DIR/scope.txt — list of files in audit scope
    - $ARTIFACTS_DIR/todos.txt — TODO/FIXME/XXX/HACK markers
    - $ARTIFACTS_DIR/exports.txt — candidate exported symbols per file

    ## Method

    1. Read `$ARTIFACTS_DIR/scope.txt`. If empty, write an empty findings file
       (see Output) and stop.
    2. For each file in scope (cap at ~25 files for a single pass — prioritize
       files with the most heuristic hits):
       a. Read the file.
       b. Identify **undocumented exports**: public/exported symbols that lack
          an adjacent doc comment in the language's native style (JSDoc for
          JS/TS, docstring for Python, doc comment for Go, `///` for Rust,
          etc.). Ignore trivial re-exports and type aliases that are self-
          documenting (e.g., `export type UserId = string`).
       c. Identify **stale comments**: comments whose content contradicts or
          no longer matches the adjacent code. Classify each as:
          - `stale` : comment describes behavior the code no longer has
          - `outdated_example` : example in a comment references removed/renamed API
          - `contradictory` : comment asserts something the code contradicts
          - `orphan_ref` : comment references a symbol that isn't in the file
          - `todo` : a TODO/FIXME worth surfacing (don't auto-fix)
       d. When uncertain, mark the finding with `"uncertain": true` and
          explain briefly — the human will review these.
    3. Do NOT flag internal (non-exported) helpers for missing docs.
    4. Do NOT invent fixes for things you haven't verified.

    ## Output

    Write a single JSON file to `$ARTIFACTS_DIR/findings.json` with this shape:
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check for workflow nodes that deny Write while instructing the agent to write artifact files.
rg -n -C6 'denied_tools:\s*\[[^]]*Write|Write.*denied_tools|findings\.json|ARTIFACTS_DIR/.+\.json' --glob '*.yaml' --glob '*.yml' .archon
Repository: coleam00/Archon
Length of output: 28942
Remove Write from denied_tools in the audit node.
The audit node denies Write but explicitly instructs the agent to write $ARTIFACTS_DIR/findings.json. This prevents artifact creation while downstream nodes (fix_source, generate_docs, report) depend on the findings file. Allowing Write is necessary for the workflow to function; if you need to restrict edits to source files only, use path-scoped tool policies instead.
🛠️ Proposed fix
  - id: audit
    depends_on: [heuristic_scan]
    context: fresh
-   denied_tools: [Write, Edit]
+   denied_tools: [Edit]
Pull request overview
Adds a new bundled default workflow (archon-doc-agent) intended to audit stale comments and undocumented exported symbols, apply in-scope doc fixes, scaffold/update /docs, and emit a run report for human review.
Changes:
- Introduces a new bundled workflow YAML with a 6-node DAG (scope → heuristic scan → AI audit → source doc fixes → docs generation → reporting).
- Adds scope selection modes (diff vs $BASE_BRANCH, arbitrary ref, or full repo via all) and writes intermediate artifacts (scope.txt, todos.txt, exports.txt, findings.json, logs, report).
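The heuristic scan step that produces todos.txt could look roughly like the sketch below. This is not the workflow's actual script: the function name `scan_todos`, the demo file layout, and the use of a temporary directory in place of $ARTIFACTS_DIR are all assumptions made for illustration. Only the marker set (TODO/FIXME/XXX/HACK) and the scope.txt and todos.txt file names come from the PR.

```shell
#!/bin/sh
# Hypothetical sketch of a TODO/FIXME heuristic scan; not the workflow's real script.
set -eu

# scan_todos SCOPE_FILE OUT_FILE
# Reads one path per line from SCOPE_FILE and writes "path:line:text" hits to OUT_FILE.
scan_todos() {
  todo_scope_file="$1"
  todo_out_file="$2"
  : > "$todo_out_file"
  while IFS= read -r todo_path; do
    # Skip blank lines and paths that no longer exist (e.g. files deleted in the diff).
    [ -n "$todo_path" ] && [ -f "$todo_path" ] || continue
    # -w matches whole words only; grep exits 1 when nothing matches, which is not an error here.
    grep -nHwE 'TODO|FIXME|XXX|HACK' "$todo_path" >> "$todo_out_file" || true
  done < "$todo_scope_file"
}

# Demo in a throwaway directory (assumed layout; the real workflow writes under $ARTIFACTS_DIR).
demo_dir="$(mktemp -d)"
printf '// TODO: document this\nexport const a = 1;\n' > "$demo_dir/a.ts"
printf 'export const b = 2;\n' > "$demo_dir/b.ts"
printf '%s\n' "$demo_dir/a.ts" "$demo_dir/b.ts" "$demo_dir/missing.ts" > "$demo_dir/scope.txt"
scan_todos "$demo_dir/scope.txt" "$demo_dir/todos.txt"
cat "$demo_dir/todos.txt"
```

Keeping the scan limited to the paths listed in scope.txt is what lets the later AI steps prioritize files by hit count without reading the whole repository.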
- id: audit
  depends_on: [heuristic_scan]
  context: fresh
  denied_tools: [Write, Edit]
  prompt: |
    You are a documentation auditor. Your job is to classify documentation
    problems and write a structured findings report. You MUST NOT modify
    source files in this step — analysis only.
The audit node is configured with denied_tools: [Write, Edit], but the prompt requires writing $ARTIFACTS_DIR/findings.json. With Write denied, the agent can’t reliably create that file (and downstream fix_source depends on it). Also, this setup does not actually enforce read-only analysis because Bash remains available and can still modify the repo. Consider switching to a restrictive allowlist like allowed_tools: [Read, Write] (and explicitly disallow Edit/Bash), and clarify that the only permitted write is to $ARTIFACTS_DIR/findings.json.
fix(workflows): harden archon-doc-agent scope node against shell injection
The scope bash node interpolated $ARGUMENTS directly with ARG="$ARGUMENTS". Archon's substituteWorkflowVariables (dag-executor.ts:290) replaces $ARGUMENTS via plain regex replacement with no shell escaping; only $nodeId.output is auto-quoted via shellQuote. Crafted user input containing an unescaped double quote followed by shell metacharacters could break out of the ARG= assignment and execute arbitrary commands. This is a real concern when $ARGUMENTS flows from a platform adapter (Slack/Telegram/GitHub comment) rather than the CLI.
Capture $ARGUMENTS via a single-quoted heredoc so the substituted value is treated as literal text. Verified with an injection payload that the literal string is preserved and no shell command runs.
Addresses PR review feedback on coleam00#1287.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
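A minimal sketch of the single-quoted heredoc capture follows. It simulates the script text after the executor has already pasted the user's input in place of $ARGUMENTS, using the injection payload quoted in the PR description. The delimiter name `ARCHON_ARGS_EOF` and the `$(cat <<'…')` form are assumptions; the workflow's actual capture may be written differently.

```shell
#!/bin/sh
# Simulated result of textual substitution: in the real workflow the executor pastes
# the user's input where the payload line appears. The delimiter name is hypothetical.
set -eu

ARG=$(cat <<'ARCHON_ARGS_EOF'
"; echo INJECTED; echo "
ARCHON_ARGS_EOF
)

# The payload is stored verbatim; nothing inside it was executed.
printf 'captured: %s\n' "$ARG"
```

Because the delimiter is quoted, the shell performs no parameter expansion, command substitution, or quote processing on the heredoc body. One general limitation of heredocs still applies: a body line consisting solely of the delimiter ends the capture early, so an unusual delimiter is a sensible choice if the input can ever span multiple lines.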
fix(workflows): honor $DOCS_DIR config in archon-doc-agent
The generate_docs and report node prompts hardcoded `docs/` paths. $DOCS_DIR is a first-class Archon variable (substituted in dag-executor.ts via resolvedDocsDir) configured via `docs.path` in .archon/config.yaml; hardcoding `docs/` broke users who'd chosen a different documentation directory.
Replaced literal `docs/` with `$DOCS_DIR` in generate_docs scaffolding, report `git diff --stat` invocation, and the review instructions. README.md / CHANGELOG.md references at repo root are unchanged, since those are not configured via docs.path.
Addresses PR review feedback on coleam00#1287.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review responses — addressed in commits ✅ Finding 1 ($DOCS_DIR) — valid, fixed in
hey @seanrobertwright could you please explain the difference between this and the "docs-impact-agent"? i see no reason to have two agents for this use case, unless there is a clear gap?
Rasmus, I will confirm when I get home, but my installation did not show such a workflow. I will follow up on this.
Rasmus, after discussion in the hangout today: the "docs-impact-agent" is not a workflow, but it was explained that it is used to develop Archon. My proposal was to have a document generator / updater for ANY project / repo the user is working on. Unless that agent can be invoked directly? I might need more clarification on this.
the doc impact agent is a generic doc generation command that can be inserted at any node as a generic doc updater for any project |
Summary
- archon-doc-agent that scopes files (git diff / any ref / full repo), runs a heuristic scan, classifies findings via Claude, edits source to add missing doc comments and fix stale ones, scaffolds Markdown docs under /docs, and writes a run report with uncertain items flagged for human review.
- .archon/workflows/defaults/.
Updates since first review
Two follow-up commits address review findings on the original PR. See this comment for the full disposition (including one finding declined with evidence).
- eb1af823: fix(workflows): harden archon-doc-agent scope node against shell injection. The scope bash node originally interpolated $ARGUMENTS directly with ARG="$ARGUMENTS". Archon's substituteWorkflowVariables (dag-executor.ts:290) does plain regex replacement with no shell escaping; only $nodeId.output is auto-quoted. Crafted input containing an unescaped double quote followed by shell metacharacters could break out of the ARG= assignment and execute arbitrary commands, a real concern when $ARGUMENTS flows from a platform adapter (Slack/Telegram/GitHub comment). Capture now uses a single-quoted heredoc so the substituted value is treated as literal text. Verified with a live injection test: payload "; echo INJECTED; echo " survives as literal and no command runs.
- f8b02bec: fix(workflows): honor $DOCS_DIR config in archon-doc-agent. The generate_docs and report node prompts hardcoded docs/ paths. $DOCS_DIR is a first-class Archon variable (substituted in dag-executor.ts via resolvedDocsDir), configurable via docs.path in .archon/config.yaml. Hardcoding docs/ broke users who had chosen a different documentation directory. Replaced literal docs/ with $DOCS_DIR across scaffolding, git diff --stat invocation, and the review instructions. README.md / CHANGELOG.md references at the repo root are unchanged, since those are not configured via docs.path.
- archon validate workflows archon-doc-agent passes after both commits.
UX Journey
Before
After
User-facing addition:
archon workflow run archon-doc-agent [ref|all].
Architecture Diagram
Before
After
No new modules, no new module-to-module edges.
Connection inventory:
.archon/workflows/defaults/archon-doc-agent.yaml
Label Snapshot
- risk: low
- size: S
- workflows
- workflows:defaults
Change Metadata
- feature
- workflows
Linked Issue
(No linked issue — new workflow contribution.)
Validation Evidence (required)
- bun run validate intentionally skipped: this PR adds a single YAML data file under .archon/workflows/defaults/. No TypeScript, no imports, no tests reference it. Type-check / lint / format / unit tests have no code paths to exercise. The workflow-schema validator above is the applicable check.
Security Impact (required)
- denied_tools: [Write, Edit] to enforce read-only analysis.
- docs/ and (when fixes are applied) to source files in the execution directory. Same access profile as archon-architect and archon-refactor-safely.
Compatibility / Migration
Human Verification (required)
What was personally validated beyond CI:
- archon workflow run archon-doc-agent --branch docs/audit-smoke-test "HEAD~5" against coleam00/Archon itself.
- scope.txt; no out-of-scope edits.
- packages/providers/src/community/pi/provider.ts that falsely claimed "v1 capabilities are all false"; updated to accurately list supported/unsupported caps.
- startServer documenting its process.exit(1) side effect on fatal init failure.
- ClaudeProviderDefaults missing interface-level JSDoc while all fields are individually documented; correctly flagged rather than edited.
- $ARTIFACTS_DIR/report.md with the sections documented in the workflow (Summary, Source edits, Docs changes, Flagged for human review, How to review).
- all mode uses git ls-files; filter strips node_modules/dist/build/__pycache__/etc.
- all-mode scan against this full repo (bounded smoke only); non-TS/Python/Go/Rust language coverage (heuristic export-scan currently handles JS/TS/Py/Go/Rust; other languages fall back to AI-only audit).
Side Effects / Blast Radius (required)
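The all-mode filter mentioned above can be sketched as a path filter over the output of git ls-files. The function name `filter_scope` is hypothetical, and the directory list only mirrors the examples named in the PR description (the "etc." there means the workflow's real list is likely longer). A fixed list of paths stands in for git ls-files so the example runs outside a repository.

```shell
#!/bin/sh
# Hypothetical sketch of the all-mode path filter; the directory list mirrors the
# examples in the PR description and the workflow's real list may differ.
set -eu

# filter_scope: reads paths on stdin and drops any path that has an excluded
# directory as one of its components.
filter_scope() {
  # grep -v exits 1 when every line is filtered out; treat that as success.
  grep -vE '(^|/)(node_modules|dist|build|__pycache__)/' || true
}

# In a repository the input would come from `git ls-files`; a fixed list keeps
# the demo self-contained.
kept=$(printf '%s\n' \
  'src/index.ts' \
  'node_modules/lib/index.js' \
  'packages/core/dist/out.js' \
  'tools/__pycache__/x.pyc' \
  'docs/build-notes.md' | filter_scope)
printf '%s\n' "$kept"
```

Anchoring each name between `(^|/)` and a trailing `/` keeps files such as docs/build-notes.md in scope, since only whole directory components are excluded.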
- archon-doc-agent workflow discoverable to all Archon users after next release. Does not change behavior of any existing workflow.
- archon-architect, archon-refactor-safely). The workflow defaults to worktree isolation via --branch; fix_source does not commit, so changes are always reviewable via git diff before acceptance.
- denied_tools: [Write, Edit]; fix_source is instructed to respect the scope file and skip uncertain findings; report surfaces skipped items.
Rollback Plan (required)
- git revert <merge-sha> or delete the file .archon/workflows/defaults/archon-doc-agent.yaml. No state, no migration, no cache to clear.
- defaults.loadDefaultWorkflows: false in .archon/config.yaml.
- archon workflow list would stop showing archon-doc-agent.
Risks and Mitigations
- /docs/*.md on first run.
- generate_docs node instructs the AI to only claim what it can verify in code and to write TODO: describe X stubs for anything it is uncertain about, which the report surfaces for human review.
- fix_source is instructed to skip any finding marked uncertain: true; all edits land in the worktree diff for human review before commit; no auto-commit.
- $ARTIFACTS_DIR/scope.txt; fix_source instruction forbids editing files outside it. Smoke test confirmed no scope drift on the verification run.
🤖 Generated with Claude Code
Summary by CodeRabbit