feat(tools): #1246 #1247 #1248 — coding-agent feedback surface (research, tool guidance, ruff diagnostics)#1261

Open
ohdearquant wants to merge 2 commits into main from show/lionagi-sweep/coding-tools

@ohdearquant

Coding-harness feedback-surface slice from the lionagi-sweep show — the SWE-bench engagement thesis (harness > model). Critic APPROVE (CRIT:0 MAJ:0 MIN:3), claims verified against source; 63 tests pass; pre-commit ruff green.

Shipped

Deferred (drafted in the research doc)

Closes #1246
Closes #1248
Refs #1247 (first slice shipped; AST remainder drafted)

🤖 Generated with Claude Code

ohdearquant and others added 2 commits June 3, 2026 14:30
…nostics

Give the coding agent an IDE-grade feedback loop across reader/editor/bash
and a new static-analysis check tool.

- #1246: research OSS harnesses (OpenCode, mini-swe-agent) — agent loop,
  tool set, permission model, persistence. docs/research/coding-harnesses-2026-06.md
  with pinned commit SHAs, file:line citations, and a prioritized fold-in
  list. Drafts the remaining AST surface for the #1247 follow-up.
- #1248: tune reader/editor/bash guidance — error messages now explain why
  a failure happened and how to recover (failed edit → line-prefix/whitespace
  diagnostics + re-read; bash → cwd= over `cd &&`, PATH, timeout, truncation).
  Tests assert the recovery text and that recovery paths work.
- #1247: AST/static-analysis feedback tool — code_check shells out to ruff
  (optional dep, shutil.which guard, never raises) returning structured
  file:line:col diagnostics. Composes with the editor (edit → check). Test
  exercises a known-bad snippet end to end.
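As a rough illustration of the #1247 item above, here is a minimal sketch of a check tool that guards on `shutil.which`, shells out to ruff, and never raises. It is not the code in this PR: the `Diagnostic` class, `parse_ruff_json`, the return-dict shape, and the `timeout` parameter are all hypothetical, and it assumes ruff's JSON output (`ruff check --output-format json`) with `filename`, `code`, `message`, and `location.row` / `location.column` fields.

```python
import json
import shutil
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """One structured finding (hypothetical shape, not the PR's type)."""
    path: str
    line: int
    col: int
    code: str
    message: str

    def __str__(self) -> str:
        return f"{self.path}:{self.line}:{self.col}: {self.code} {self.message}"


def parse_ruff_json(payload: str) -> list[Diagnostic]:
    """Parse ruff's JSON output; return [] on anything unexpected."""
    try:
        items = json.loads(payload or "[]")
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    found = []
    for item in items:
        if not isinstance(item, dict):
            continue
        loc = item.get("location") or {}
        found.append(
            Diagnostic(
                path=str(item.get("filename", "")),
                line=int(loc.get("row", 0)),
                col=int(loc.get("column", 0)),
                code=str(item.get("code") or ""),
                message=str(item.get("message", "")),
            )
        )
    return found


def code_check(path: str, timeout: float = 30.0) -> dict:
    """Run ruff on `path`; report problems in the result instead of raising."""
    ruff = shutil.which("ruff")  # optional dependency: degrade gracefully
    if ruff is None:
        return {"ok": False, "diagnostics": [],
                "error": "ruff not found on PATH; install ruff to enable checks"}
    try:
        proc = subprocess.run(
            [ruff, "check", "--output-format", "json", path],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return {"ok": False, "diagnostics": [], "error": f"ruff failed to run: {exc}"}
    return {"ok": True, "diagnostics": parse_ruff_json(proc.stdout), "error": None}
```

In an edit → check flow, the agent would call `code_check` on the file it just edited and read `str(d)` for each diagnostic to get `file:line:col` locations.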

Tests: uv run pytest tests/tools/test_check.py tests/tools/test_guidance.py
tests/tools/test_reader.py — 63 passed. Critic verdict: APPROVE (CRIT:0 MAJ:0 MIN:3).

Refs #1246 #1247 #1248

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Sweep results: research tool, tool guidance, ruff diagnostics modules
plus codebase-wide UP038/formatting fixes applied by the sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>