DSPX 3382 mlkem scenarios#463
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements a multi-instance test harness refactor, enabling concurrent local platform and KAS instances on disjoint port ranges, alongside Claude agent skills for scenario and feature design. It also adds support for pure ML-KEM key wrapping (mlkem:768 and mlkem:1024) and allows installing platform binaries at specific git refs using git worktrees. The review feedback highlights opportunities to improve robustness, such as handling missing git executables or network errors during remote tag queries, verifying worktree paths exist before setting them as working directories, removing redundant git fetch operations, and consistently specifying encoding="utf-8" on file read/write operations to avoid platform-dependent decoding issues.
| from git import Git | ||
|
|
||
| results: list[dict[str, Any]] = [] | ||
| raw = Git().ls_remote(SDK_GIT_URLS["platform"], tags=True) |
There was a problem hiding this comment.
If git is not installed on the system, or if there is a network outage, Git().ls_remote will raise git.exc.GitCommandNotFound or git.exc.GitCommandError. These exceptions are not caught in cli_versions.py (which only catches RegistryUnreachableError), leading to an unhandled traceback and crash. We should wrap the Git().ls_remote call in a try...except block and raise RegistryUnreachableError.
| raw = Git().ls_remote(SDK_GIT_URLS["platform"], tags=True) | |
| from git.exc import GitCommandError, GitCommandNotFound | |
| try: | |
| raw = Git().ls_remote(SDK_GIT_URLS["platform"], tags=True) | |
| except GitCommandNotFound as e: | |
| raise RegistryUnreachableError(f"git executable not found: {e}") from e | |
| except GitCommandError as e: | |
| raise RegistryUnreachableError(f"failed to query remote git platform tags: {e}") from e |
| worktree = binary.parent | ||
| version_file = binary.parent / ".version" | ||
| if version_file.exists(): | ||
| for line in version_file.read_text().splitlines(): | ||
| if line.startswith("worktree="): | ||
| worktree = Path(line.split("=", 1)[1].strip()) | ||
| break |
There was a problem hiding this comment.
If the repository has been moved to another location on disk, the absolute path recorded in .version will point to a non-existent directory. Setting cwd=worktree in subprocess.run or ProcessManager.start will then crash with a FileNotFoundError. We should check if the candidate worktree path exists before using it, falling back to binary.parent if it doesn't. Also, read_text() should specify encoding="utf-8".
| worktree = binary.parent | |
| version_file = binary.parent / ".version" | |
| if version_file.exists(): | |
| for line in version_file.read_text().splitlines(): | |
| if line.startswith("worktree="): | |
| worktree = Path(line.split("=", 1)[1].strip()) | |
| break | |
| worktree = binary.parent | |
| version_file = binary.parent / ".version" | |
| if version_file.exists(): | |
| for line in version_file.read_text(encoding="utf-8").splitlines(): | |
| if line.startswith("worktree="): | |
| candidate = Path(line.split("=", 1)[1].strip()) | |
| if candidate.exists(): | |
| worktree = candidate | |
| break |
| worktree = binary.parent # safe fallback | ||
| version_file = binary.parent / ".version" | ||
| if version_file.exists(): | ||
| for line in version_file.read_text().splitlines(): | ||
| if line.startswith("worktree="): | ||
| worktree = Path(line.split("=", 1)[1].strip()) | ||
| break |
There was a problem hiding this comment.
If the repository has been moved to another location on disk, the absolute path recorded in .version will point to a non-existent directory, causing a crash when setting cwd=worktree. We should check if the candidate worktree path exists before using it, falling back to binary.parent if it doesn't. Also, read_text() should specify encoding="utf-8".
| worktree = binary.parent # safe fallback | |
| version_file = binary.parent / ".version" | |
| if version_file.exists(): | |
| for line in version_file.read_text().splitlines(): | |
| if line.startswith("worktree="): | |
| worktree = Path(line.split("=", 1)[1].strip()) | |
| break | |
| worktree = binary.parent # safe fallback | |
| version_file = binary.parent / ".version" | |
| if version_file.exists(): | |
| for line in version_file.read_text(encoding="utf-8").splitlines(): | |
| if line.startswith("worktree="): | |
| candidate = Path(line.split("=", 1)[1].strip()) | |
| if candidate.exists(): | |
| worktree = candidate | |
| break |
| # Worktrees from a bare clone have no `origin` remote, so reset | ||
| # to the just-fetched ref in the bare repo instead of `git pull`. | ||
| _run(["git", "-C", str(worktree_path), "fetch", "origin", branch]) | ||
| _run(["git", "-C", str(worktree_path), "reset", "--hard", "FETCH_HEAD"]) |
There was a problem hiding this comment.
The comment states that worktrees from a bare clone have no origin remote, but the code immediately runs git fetch origin. Since the bare repository has already fetched the updates (at line 72), the ref is already updated in the shared database. Running git fetch again inside the worktree is redundant and slow. We can simplify this by resetting directly to the branch.
| # Worktrees from a bare clone have no `origin` remote, so reset | |
| # to the just-fetched ref in the bare repo instead of `git pull`. | |
| _run(["git", "-C", str(worktree_path), "fetch", "origin", branch]) | |
| _run(["git", "-C", str(worktree_path), "reset", "--hard", "FETCH_HEAD"]) | |
| # Worktrees from a bare clone share the bare repo's database, so we | |
| # reset to the bare repo's just-fetched ref instead of fetching again. | |
| _run(["git", "-C", str(worktree_path), "reset", "--hard", branch]) |
| _run(["git", f"--git-dir={bare_repo_path}", "fetch", "origin", ref, "--tags"]) | ||
| _run(["git", "-C", str(worktree_path), "checkout", "--force", "FETCH_HEAD"]) |
There was a problem hiding this comment.
The bare repository has already fetched the updates (at line 140 or 145). Running git fetch again is redundant and slow. We can simplify this by checking out the ref directly.
| _run(["git", f"--git-dir={bare_repo_path}", "fetch", "origin", ref, "--tags"]) | |
| _run(["git", "-C", str(worktree_path), "checkout", "--force", "FETCH_HEAD"]) | |
| _run(["git", "-C", str(worktree_path), "checkout", "--force", ref]) |
| from ruamel.yaml import YAML | ||
|
|
||
| y = YAML(typ="safe") | ||
| raw = y.load(scenario_path.read_text()) |
There was a problem hiding this comment.
| print(f" Warning: {msg} (will retry at runtime)") | ||
| raise InstallError(msg) | ||
| dist_dir.mkdir(parents=True, exist_ok=True) | ||
| (dist_dir / ".version").write_text(f"{module}@{tag}\n") |
There was a problem hiding this comment.
| def _record_version(dist_dir: Path, ref: str, worktree: Path) -> None: | ||
| """Write a `.version` metadata file alongside the binary.""" | ||
| sha = _git_rev_parse(worktree, "HEAD") | ||
| (dist_dir / ".version").write_text(f"ref={ref}\nsha={sha}\nworktree={worktree}\n") |
There was a problem hiding this comment.
write_text() is called without specifying an encoding, which can lead to platform-dependent encoding errors. It is safer to specify encoding="utf-8".
| (dist_dir / ".version").write_text(f"ref={ref}\nsha={sha}\nworktree={worktree}\n") | |
| (dist_dir / ".version").write_text(f"ref={ref}\nsha={sha}\nworktree={worktree}\n", encoding="utf-8") |
| subprocess.CalledProcessError, | ||
| OSError, | ||
| ) as e: | ||
| out.write_text(json.dumps(_snapshot(status="partial"), indent=2) + "\n") |
There was a problem hiding this comment.
write_text() is called without specifying an encoding, which can lead to platform-dependent encoding errors. It is safer to specify encoding="utf-8".
| out.write_text(json.dumps(_snapshot(status="partial"), indent=2) + "\n") | |
| out.write_text(json.dumps(_snapshot(status="partial"), indent=2) + "\n", encoding="utf-8") |
| typer.echo(f" Wrote partial manifest to {out}", err=True) | ||
| raise typer.Exit(1) | ||
|
|
||
| out.write_text(json.dumps(_snapshot(), indent=2) + "\n") |
There was a problem hiding this comment.
26cafa1 to
962c03a
Compare
2970620 to
1949128
Compare
X-Test Failure Report |
1 similar comment
X-Test Failure Report |
3601aa4 to
b12abe7
Compare
d787998 to
ad4b4a3
Compare
X-Test Failure Report |
b12abe7 to
7ec8c8c
Compare
ad4b4a3 to
256b593
Compare
X-Test Failure Report |
1 similar comment
X-Test Failure Report |
7ec8c8c to
8be00f9
Compare
256b593 to
5e2c74b
Compare
X-Test Failure Report |
1 similar comment
X-Test Failure Report |
8be00f9 to
02b100c
Compare
X-Test Failure Report✅ java@v0.15.0-main |
02b100c to
9a28f3c
Compare
Adds two new pytest CLI options so xtest can be driven by a scenarios.yaml
and run against a specific otdf-local instance.
--scenario PATH When set, defaults --sdks-encrypt, --sdks-decrypt,
and --containers from the scenario's `sdks` and
`suite` blocks. Options explicitly passed on the
CLI always override.
--instance NAME Propagated to OTDF_LOCAL_INSTANCE_NAME so child
`otdf-local` invocations within the test see the
same instance the scenario expects.
If otdf-sdk-mgr is not installed (minimal pytest environments), the
--scenario flag silently no-ops via an ImportError guard. The flag
shape is invariant either way so CI configs don't fork.
This is the consumer side of the PR 3 / scenario-driven flow: the
authoritative entry point remains `otdf-local scenario run <path>`,
which sets these flags for you; this PR lets pytest accept them
directly when running scenario-aware sessions outside the wrapper.
Refs: https://virtru.atlassian.net/browse/DSPX-3302
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds five Claude Code skills under tests/.claude/skills/ that together
turn a Jira bug ticket into a running reproduction, plus a downstream-
installable plugin manifest under .claude/plugin/.
Why
---
The end-to-end goal of DSPX-3302 is to make bug reproduction approachable
for QA, downstream-product engineers, and CI. PRs 1-4 build the plumbing
(shared schema, platform installer, multi-instance otdf-local, xtest
conftest hooks). This PR is the user-facing surface: a Claude can pull
context from Jira, draft an xtest/scenarios/<jira-key>.yaml (and, when
needed, an xtest/bug_<jira_key>_test.py), bring the environment up at
the right version pins, run the scenario's pytest selection, and tear
down.
Skills
------
scenario-from-bug-report
Pulls the Jira issue and its comments via `acli jira workitem view
--fields '*all' --json` and `acli jira workitem comment list`,
extracts version pins / KAS topology / container type / feature
flags, then writes xtest/scenarios/<jira-key-lowercased>.yaml
validated against otdf_sdk_mgr.schema.Scenario. Drafts a new
xtest/bug_<id>_test.py only when no existing pytest covers the
case; never silently lands assertions.
scenario-up
Runs `otdf-sdk-mgr install scenario`, then `otdf-local instance
init --from-scenario`, then `otdf-local --instance <name> up`, and
polls status until healthy. Surfaces logs rather than retrying
blindly when something stays unhealthy.
scenario-run
Invokes `otdf-local scenario run <path>` and classifies the
result: "bug reproduced" / "not reproduced" / "unrelated failure".
Cites the evidence line and points at per-service logs.
scenario-tear-down
Stops the instance and optionally removes the directory after
explicit user confirmation.
instance-status
Lists known instances, their port bases, health, and flags port
collisions.
Jira-safety
-----------
Permissions in both .claude/settings.json and the plugin manifest
allow only read+comment via acli jira: workitem view, workitem search,
workitem comment list, workitem comment create, plus a handful of
read-only project/board/sprint queries. edit, delete, transition,
assign, archive, link create, watcher add are all denied. The
plugin.json carries a permission_notes block explaining the policy.
Plugin manifest
---------------
.claude/plugin/plugin.json declares the skill names, runtime
requirements (uv, go, git, docker, acli), and the canonical permission
allowlist, so downstream first/third-party integrators can install
this plugin into their own Claude Code setups.
Refs: https://virtru.atlassian.net/browse/DSPX-3302
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…m-ticket (DSPX-3302) Headless dogfooding (run-1 on DSPX-2719) showed the bug-only framing was too narrow — the common workflow is writing tests for new features first (TDD), not reproducing version-pinned bugs. - Rename and rewrite the skill to branch on Jira Issue Type. Bug follows the old expected/actual flow; Story/Task uses ref pins (`main`, feature branch, PR SHA via `gh pr view --json headRefOid`) for forward-looking regression gates; Spike bails out rather than fabricating. Mandates `acli workitem comment list` and steers away from cli.sh greps (both were run-1 gaps). - New `scenario-matrix` sibling skill: write N scenario files from a base × N refs (PRs/branches/releases). Schema/installer support was already there via `PlatformPin.source.ref` and `install_platform_source(ref)` — no other changes needed. - `scenario-run` output classification generalized from "bug reproduced / not reproduced" to "expected / unexpected outcome", with explicit branches for bug-repro vs TDD interpretations. - `scenario-up` description and `plugin.json` (description, skills array, requirements) updated to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For features (or bugs) that touch more than one OpenTDF repo — platform
plus the Go / Java / JS SDKs — feature-design captures the work as a
single spec at xtest/features/<name>.yaml plus the tests-side artifacts
that land first (feature_type entry in tdfs.py, scenario, draft test).
The model matches the team's existing pattern: tests-side artifacts
merge first, dormant under a `supports("<feature>")` gate, and each
per-repo PR activates the gate by adding `supports <feature>` to its
cli.sh. PRs land async, in any order; no cross-PR lockstep needed.
- `feature-design` SKILL: propose-then-iterate authoring from a Jira
ticket (or free-form description). Drafts a complete spec on the
first pass, asks one composite redirect question, then writes the
spec + patches tdfs.py + invokes scenario-from-ticket internally
to produce the dormant scenario and draft test. Bails on Spike or
unclear tickets rather than fabricating.
- `xtest/features/{README,CLAUDE}.md`: progressive-disclosure docs —
human-facing README and agent-facing CLAUDE.md.
- `xtest/README.md` gains a brief "Test artifact directories" section
pointing at scenarios/ and features/.
- `settings.json` + `plugin.json`: Write(xtest/features/**) allowlist,
feature-design added to plugin skills array.
The complementary feature-orchestrate skill (fanning out per-repo
subagents to draft impl PRs in each touched repo) is a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…PX-3302) Headless dogfooding (runs 1 and 2 of scenario-from-ticket on DSPX-2719) surfaced two real gaps: - The `Skill` tool was denied on both runs because the allowlist didn't cover it, so the body of SKILL.md wasn't injected on invocation; the agent had to manually `Read` the skill file ~25 turns in, wasting time and biasing exploration toward grepping unrelated files first. Add `Skill(*)` to settings.json and per-skill `Skill(<name>)` entries to plugin.json (the latter enumerates exactly what downstream installs get, since they shouldn't inherit a wildcard). - `acli jira workitem comment list` requires `--key <KEY>` (the subcommand differs from `view`, which takes the key positionally). Both scenario-from-ticket and feature-design had the wrong form; corrected, with a one-line note about the asymmetry so the next agent doesn't paraphrase. Verified via run-3 on DSPX-2719: 41 turns / 5m16s / $1.07 (vs run-1's 48 turns / 6m44s / $1.27). Skill tool returned success on first call, both acli commands ran cleanly, the Story/Task branch produced `source.ref: main` pins correctly (no more incorrectly defaulting to `dist: lts`), and the agent's `actual:` field correctly enumerated all three test-infrastructure prerequisites including a `with_ecdsa_binding` parameter that run-1's scenario missed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…emas (DSPX-3302)
Headless runs of scenario-from-ticket kept trying `python3 -c "from
otdf_sdk_mgr.schema import Scenario; ..."` to introspect Pydantic model
shape while authoring scenarios. That form isn't in the plugin's Bash
allowlist (deliberately — it's arbitrary code execution), so the agent
fell back to Reading schema.py source. Static, committed JSON Schemas
give the same information declaratively without needing a python verb
in the allowlist at all.
- `otdf-sdk-mgr schema dump [--out-dir]`: writes
`xtest/schema/{scenario,instance}.schema.json` from
`Model.model_json_schema()`, sorted-keys + trailing newline so output
is byte-stable. Add new models to `SCHEMAS` in cli_schema.py and they
get picked up automatically.
- `xtest/schema/` is committed with the generated files plus brief
README/CLAUDE.md (progressive-disclosure, mirroring xtest/features/).
- `test_schema_sync.py` parametrizes over `SCHEMAS` and fails if any
committed file drifts from the live model — the safety net for
"someone edited a Pydantic model without regenerating."
- `scenario-from-ticket` SKILL.md Step 5 now points at
`xtest/schema/scenario.schema.json` as the canonical field list.
- `xtest/README.md` lists the new directory alongside `scenarios/` and
`features/`.
No allowlist changes needed — `Bash(uv run otdf-sdk-mgr *)` already
covers the dump subcommand, and `Read` is unrestricted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… only Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Rewrite all seven scenario/instance skills to follow plugin-dev's skill-development conventions (third-person trigger descriptions, imperative body voice, progressive disclosure into references/ and scripts/), and close the friction items surfaced during the pure-mlkem session: - instance-status: cross-worktree port + docker-compose probe so sibling-worktree services don't go undetected. - scenario-up: bootstrap-pr-worktree.sh pre-flight for missing kas-*.pem / keys / opentdf.yaml on fresh PR worktrees, plus a partial-install guard against silent empty SDK arrays in installed.json. Documents OTDFCTL_HEADS and PLATFORM_VERSION workarounds. - scenario-run: source-build pytest fallback when installed.json is empty, and a new "assertion-stricter-than-implementation" classifier bucket for aspirational expectations. - scenario-from-ticket: auto-pin tickets with linked GitHub PRs to source.ref:<headRefOid>; YAML templates extracted to references/yaml-templates.md. - scenario-tear-down: shared-docker probe across worktrees. - scenario-matrix: dedup workaround note pending DSPX-3417. - feature-design: lightly retouched, cross-link to scenario-doctor. - New scenario-doctor skill: diff running-vs-intended state via scripts/diff-running-vs-intended.sh; verbose recipes in references/probe-recipes.md. Inline links to DSPX-3415..3419 mark each documented workaround as temporary so it can be removed when the corresponding CLI fix lands. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…orrect schema - scenario-up: drop Step 2.5 bootstrap script (instance init now self-provisions keys + opentdf.yaml per 74492a4); add an `otdf-local env` sanity check before handing off to scenario-run. - scenario-run: replace stale `suite.select` references with `suite.targets` (list); update Step 1b source-build fallback template to unpack targets positionally. - scenario-doctor: Step 2 now invokes the read-only verifier against the instance dir instead of the platform worktree. - scenario-from-ticket: one stale `suite.select` mention fixed. - Move bootstrap-pr-worktree.sh → scenario-doctor/scripts/check-instance-seed.sh and rewrite as a read-only verifier (no cp/rmdir side effects). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Ensures consistency with other otdf-local commands (up, down, status, logs) that already show the --instance parameter. While scenario run has a default that reads from the YAML, explicitly including --instance makes examples self-documenting and prevents confusion. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Systemic fix for the workflow where `otdf-sdk-mgr install` builds binaries but `otdf-local` doesn't use them, falling back to stale `go run` from source. **Phase 1: otdf-sdk-mgr auto-converts source → dist** - After successful platform build, `install scenario` now updates instance.yaml - Converts `platform.source.ref` → `platform.dist: <slug>` - Uses dump_instance() to persist the change - User sees: "Updating instance to use platform dist: refs--heads--..." **Phase 2: otdf-local warns on source mode** - Before launching platform, checks if `instance.platform.source` is set - Emits structured warning with instance name, ref, and fix command - Helps users diagnose why changes aren't appearing **Phase 3: scenario-doctor detects source mode** - diff-running-vs-intended.sh now checks for platform.source before diff table - Emits actionable warning: "run otdf-sdk-mgr install scenario <path>" - Fixed bash compatibility issue (requires bash 5+ for associative arrays) This ensures the "install → use" workflow works correctly without manual intervention. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Format bash code block in scenario-tear-down SKILL.md - Format cross-worktree-probe.sh (case statement indentation, pipe operators) - Format diff-running-vs-intended.sh (compound commands, line continuations) - All changes follow .editorconfig: 2-space indent, bash variant, function braces on same line - Passes shellcheck validation Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
This reverts commit be25ebb.
This reverts commit 1e90caf.
This reverts commit 336ea1b.
keep_padding = true caused bizarre column-alignment when shfmt expanded
inline { ... } blocks, aligning body content to the column of the opening
brace rather than using normal indentation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scenario YAML files can now configure KAS preview settings via the features dict, avoiding manual edits to generated opentdf.yaml files that don't persist through service restarts. - Documents KasPin.features field with open-ended description for forward compatibility with new platform preview settings - Updates pure-mlkem scenario to enable hybrid_tdf_enabled for ML-KEM - Adds FEATURES.md guide with examples and precedence rules - Regenerates JSON schemas from Pydantic models Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
cd203bb to
3c72360
Compare
9a28f3c to
0409534
Compare
|
X-Test Failure Report |
3c72360 to
ede44c7
Compare



Stack (
a60d3302):Generated by
wgo stack. Edit text above or below this block, not inside it.