Skip to content

feat(xtest): Adds dpop verification tests#485

Draft
dmihalcik-virtru wants to merge 22 commits into
DSPX-3302-06-feature-orchestratefrom
DSPX-3397-kc26-dpop
Draft

feat(xtest): Adds dpop verification tests#485
dmihalcik-virtru wants to merge 22 commits into
DSPX-3302-06-feature-orchestratefrom
DSPX-3397-kc26-dpop

Conversation

@dmihalcik-virtru

@dmihalcik-virtru dmihalcik-virtru commented Jun 8, 2026

Copy link
Copy Markdown
Member

Tests-side scaffolding for DSPX-3397 — Keycloak v26 upgrade and comprehensive RFC 9449 DPoP support across the platform and all SDKs.

DPoP (Demonstrating Proof-of-Possession) binds OAuth access tokens to a per-client asymmetric key so a stolen bearer token can't be replayed. KC v26.4.0 promotes DPoP from preview to fully supported. This effort upgrades the OpenTDF stack to match, removes the AB-2235 workaround that disables DPoP on the Email Gateway client, and proves the new posture end-to-end.

What this PR contains

  • xtest/features/DSPX-3397.yaml — feature spec consumed by feature-orchestrate (5 cells, no proto changes)
  • xtest/scenarios/DSPX-3397.yaml — runtime scenario (KC26 instance + services.kas.dpop.require_nonce: true)
  • xtest/test_dpop.py — dormant tests: happy-path roundtrip, server-issued nonce retry (RFC 9449 §8), and four direct-HTTP negative skeletons (Bearer-on-DPoP, tampered htu, replayed jti, tampered/expired nonce)
  • xtest/tdfs.py — registers dpop in feature_type
  • (Pending in this branch) bump otdf-local and CI to default to Keycloak 26; flesh out the negative-test direct-HTTP plumbing

Tests land dormant — they stay skipped until each per-repo SDK PR adds a supports dpop case to its CLI shim, at which point the corresponding lane activates.

Sibling PRs (this feature spans four repos)

More detail

  • Specification & motivation: spec/DSPX-3397.md (on this branch)
  • Reference material: spec/DSPX-3397-refs/{keycloak-v26-release-notes.md, rfc9449-dpop-spec.md}
  • Acceptance criteria, edge cases (clock skew, URI normalization, gRPC), and the DPoP proof JWT contracts live in the spec.

🤖 Generated with Claude Code

Stack (a60d3302):

Generated by wgo stack. Edit text above or below this block, not inside it.

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 796d28f8-adbd-4d0a-b72a-df894ad99afa

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch DSPX-3397-kc26-dpop

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@dmihalcik-virtru dmihalcik-virtru changed the base branch from main to DSPX-3302-06-feature-orchestrate June 8, 2026 15:05

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a multi-instance test harness refactor and lays the groundwork for Keycloak v26 and DPoP support, introducing several Claude Code skills, named test environment instances, and draft integration tests. The review feedback highlights critical robustness and cross-platform issues, including Windows compatibility failures with os.getuid(), potential orchestrator crashes from unhandled subprocess exceptions when calling the claude CLI or go install, and unhandled git exceptions during remote tag resolution.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

I am having trouble creating individual review comments. Click here to see my feedback.

otdf-local/src/otdf_local/utils/keys.py (272-283)

high

Calling os.getuid() and os.getgid() directly will raise an AttributeError on Windows environments since these functions are Unix-specific. To ensure cross-platform compatibility, conditionally apply the --user argument only when these attributes are available on the os module.

    user_args = ["--user", f"{os.getuid()}:{os.getgid()}"] if hasattr(os, "getuid") else []
    result = subprocess.run(
        [
            "docker",
            "run",
            "--rm",
            "-v",
            f"{key_dir.resolve()}:/keys",
            "--entrypoint",
            "keytool",
            *user_args,
            "keycloak/keycloak:25.0",

otdf-sdk-mgr/src/otdf_sdk_mgr/cli_orchestrate.py (380-390)

high

If the claude executable is missing from the system's PATH or fails to start, subprocess.run will raise a FileNotFoundError or OSError. Since this is executed within a ThreadPoolExecutor, any unhandled exception will propagate when fut.result() is called, crashing the entire orchestrator loop. Wrap the execution in a try-except block to catch these exceptions and return a failed CellResult gracefully.

    try:
        with transcript.open("w", encoding="utf-8") as out:
            completed = subprocess.run(
                cmd,
                cwd=wt,
                stdout=out,
                stderr=subprocess.STDOUT,
                timeout=timeout_s,
            )
    except subprocess.TimeoutExpired:
        return CellResult(cell, wt, transcript, False, None, f"timed out after {timeout_s}s")
    except FileNotFoundError as e:
        return CellResult(cell, wt, transcript, False, None, f"claude CLI not found: {e}")
    except OSError as e:
        return CellResult(cell, wt, transcript, False, None, f"failed to start subagent: {e}")

otdf-sdk-mgr/src/otdf_sdk_mgr/registry.py (215)

medium

Git().ls_remote can raise a git.exc.GitCommandError or other git-related exceptions if the remote is unreachable or git is missing. These exceptions do not inherit from RegistryUnreachableError and will bypass the except RegistryUnreachableError block in cli_versions.py, resulting in unhandled tracebacks. Wrap the call to catch GitError and raise RegistryUnreachableError to maintain consistent error handling.

    from git.exc import GitError
    try:
        raw = Git().ls_remote(SDK_GIT_URLS["platform"], tags=True)
    except GitError as e:
        raise RegistryUnreachableError(f"failed to query platform remote tags: {e}") from e

otdf-sdk-mgr/src/otdf_sdk_mgr/checkout.py (93-94)

medium

Running git fetch origin <branch> inside a worktree added from a bare clone is redundant because the bare clone was already fetched on line 72. Additionally, if the worktree remote configuration is different or missing, this extra fetch can fail. It is cleaner and safer to reset directly to the fetched branch ref, matching the pattern used in platform_installer.py.

            _run(["git", "-C", str(worktree_path), "reset", "--hard", branch])

otdf-sdk-mgr/src/otdf_sdk_mgr/installers.py (49-60)

medium

subprocess.run can raise a FileNotFoundError if the go executable is not installed on the system. This is not caught by except InstallError in cli_install.py, leading to raw tracebacks. Wrap the execution in a try-except block to catch FileNotFoundError and OSError and raise InstallError instead.

    try:
        result = subprocess.run(
            ["go", "install", f"{module}@{tag}"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError as e:
        raise InstallError(f"go executable not found: {e}") from e
    except OSError as e:
        raise InstallError(f"failed to run go install: {e}") from e
    if result.returncode != 0:
        msg = f"go install failed for {module}@{tag}: {result.stderr.strip()}"
        if module == GO_MODULE_PATH_PLATFORM:
            raise InstallError(
                f"{msg}\nThe platform module path {module}@{tag} may not be published yet."
            )
        raise InstallError(msg)

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

X-Test Failure Report

Comment thread spec/DSPX-3397.md

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use DSPX-* file naming, make it descriptive

dmihalcik-virtru and others added 8 commits June 10, 2026 11:20
Upgrades the DSPX-3397 specification to detail a generic requirement
to update to Keycloak 26 and implement comprehensive DPoP support
across OpenTDF Java SDK, Web SDK, Platform services, and integration
tests.

Includes references summarizing:
- Keycloak 26 DPoP features and configuration settings
- RFC 9449 technical specifications (headers, claims, and flows)
Signed-off-by: Dave Mihalcik <dmihalcik@virtru.com>
Adds the tests-side artifacts for the Keycloak 26 + DPoP rollout:

- xtest/features/DSPX-3397.yaml: feature spec with five cells (tests,
  platform-service, platform-go-sdk, java-sdk, web-sdk). No platform-proto
  (DPoP is HTTP-header only) and no depends_on edges; each cell can land
  its draft PR independently and activate the dormant tests by adding a
  `supports dpop` case to its cli.sh.
- xtest/scenarios/DSPX-3397.yaml: standalone-KAS scenario selecting
  test_dpop.py against a KC26-enforced realm.
- xtest/tdfs.py: register "dpop" in feature_type (alphabetical, between
  connectrpc and ecwrap).
- xtest/test_dpop.py: draft tests gated on pfs/sdk supports("dpop"):
  happy-path roundtrip, server-issued DPoP-Nonce retry (RFC 9449 §8), and
  four direct-HTTP negative skeletons (Bearer-on-DPoP, tampered htu,
  replayed jti, tampered/expired nonce) to flesh out alongside the
  platform-service PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ccepts colons

The orchestrator's ruamel-based loader rejects unquoted scalars that look
like nested mappings (e.g. `Authorization: Bearer <token>`,
`token_type=DPoP`, `services.kas.dpop.require_nonce`). Re-quote all
todo entries so the spec round-trips through orchestrate run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First orchestrate dispatch had platform-service and platform-go-sdk
subagents stop at clarifying questions instead of committing. Embed the
decisions directly in the cell todos so a `--force --only` re-dispatch
picks them up:

- platform-service: use the standard RFC 9449 challenge-then-accept
  pattern for nonces (first request without nonce gets 401+DPoP-Nonce,
  client retries with nonce, succeeds). Strict-from-start would break
  interop with every standard DPoP client.
- platform-go-sdk: implement the resource-side proof generation as an
  http.RoundTripper (idiomatic Go, composes into any *http.Client).
  Extend the existing TokenAddingInterceptor only for symmetry.

Both todos now also say "make decisions and proceed; do not stop to
clarify" so subagents commit instead of asking.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Dave Mihalcik <dmihalcik@virtru.com>
Review against the platform middleware (PR #3582 normalizeURL) and the Go
SDK transport (PR #3581 normalizeURI) showed the test was building `htu`
as a path-only string, while both the server and the production client
build it as the full <scheme>://<host><path> URL. Consequence: the
replayed-jti and tampered-nonce tests never reached their replay/tamper
assertions — every proof 401'd on htu mismatch at the first request.

This commit:

- Replaces every `htu=_rewrap_htu()` callsite with `htu=rewrap_call.url`
  (already the full normalized URL); drops the misleading helper.
- Changes the tampered-htu test to use a well-formed full URL with the
  wrong path so it exercises a real tamper, not a malformed value.
- Strengthens `_assert_unauthorized` to also assert the
  `WWW-Authenticate: DPoP` challenge header, so a misconfigured realm
  returning 401 for unrelated reasons doesn't silently "pass" the test.
- Adds a fresh-proof-same-jti sub-case to test_dpop_rejects_replayed_jti
  to exercise the stronger RFC 9449 §11.1 attack vector (server must
  remember jti values across requests, not just dedupe identical bytes).
- Renames test_dpop_rejects_tampered_or_expired_nonce →
  test_dpop_rejects_tampered_nonce (expiry case deferred for now).
- Switches _get_dpop_access_token to `pytest.skip(...)` when KC hands
  back a Bearer token, so misconfigured local envs produce a clear skip
  instead of an opaque AssertionError dumping the token response.
- Drops the redundant _env helper (os.getenv already does the same).
- Replaces hand-rolled _key_access_object/_policy_binding with
  KeyAccessObject.model_dump(exclude_none=True) — Pydantic field names
  already match Connect-RPC's JSON shape for kas.proto's KeyAccess.
- Moves time_now() above DPoPKey for top-to-bottom readability.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
dmihalcik-virtru and others added 13 commits June 10, 2026 11:20
Without this, test_dpop.py is collected nowhere in CI and the dormant
DPoP tests never even attempt to skip-or-run as each SDK lands its
`supports dpop` case. Add it alongside test_tdfs.py / test_policytypes.py
in both the all-SDK and focused-SDK invocations — the existing
sdk.skip_if_unsupported("dpop") gate keeps it inert on platforms/SDKs
that haven't shipped DPoP yet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
test_dpop.py uses the attribute_single_kas_grant fixture, which depends
on kas_entry_alpha (alpha KAS at port 8181). Alpha isn't started until
the additional-KAS block that runs before the "Run attribute based
configuration tests" step, so test_dpop.py in the standard-xtests step
would either fail to wire up the fixture or produce TDFs the test can't
roundtrip.

Move it from lines 531/542 (standard xtests) to line 646 (attribute
step) alongside test_abac.py / test_pqc.py — both of those also require
the additional KAS instances.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…main

Scenario was running against `main` everywhere — useless for end-to-end
verification of the DSPX-3397 work, since main has no DPoP support yet
and the scenario would skip the whole test_dpop.py suite.

- Platform + alpha KAS now pin to DSPX-3397-platform-service so the
  middleware under test is actually present.
- Each SDK is listed twice in encrypt and decrypt: once at `main`
  (compatibility — verifies the DPoP-validating platform still serves
  pre-DPoP clients; main pairs skip test_dpop.py via the supports gate
  but run the rest of the attribute step) and once at its DSPX-3397
  branch (happy path).
- Cross-SDK pairs (encrypt with one, decrypt with another) fall out of
  the encrypt/decrypt matrix.
- Refreshed `actual:` to point at the five open PRs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…gates

PlatformFeatureSet.skip_if_unsupported currently always pytest.skip()s when a
feature isn't in the detected set. For features whose gate code (e.g. semver
detection) isn't wired up yet, that hides real client failures: every parametrization
silently skips, including ones whose SDK doesn't implement the feature at all.

Add --require-features (and matching XTEST_REQUIRE_FEATURES env var, and
suite.require_features in scenarios.yaml) that routes listed-but-missing
features through pytest.fail() instead. Use it during TDD on partially-landed
features to surface "client doesn't implement X" as a red signal, then flip back
to skip once the gate detection lands.

Plumbing:
- xtest/tdfs.py: new require_features set on PlatformFeatureSet; new
  is_feature_type TypeGuard; skip_if_unsupported routes via require_features.
- xtest/conftest.py: --require-features option (reuses is_type_or_list_of_types
  validator) + scenario.suite.require_features fallback + env-var bridge.
- otdf-sdk-mgr/schema.py: additive list[str] field on Suite.
- otdf-local/cli_scenario.py: _build_pytest_args forwards suite.require_features
  as --require-features.

Verified end-to-end against the running DSPX-3397 instance: baseline shows
SKIPPED; --require-features dpop and XTEST_REQUIRE_FEATURES=dpop both flip to
FAILED; --require-features dpopp is rejected at argparse; suite.require_features
in scenarios.yaml propagates correctly through otdf-local scenario run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… lane

Two platform feature gates are now probed via /.well-known/opentdf-configuration
instead of guessed by semver — branch builds report stale versions so the
well-known endpoint is the only honest source:
- dpop: presence of dpop_supported_alg_values
- dpop_nonce_challenge: dpop_nonce_required == true

cli.sh for go/java/js grow a `dpop | dpop_nonce_challenge` case that probes
`help encrypt` for the --dpop flag. Same probe covers both gates; nonce-mode
support reuses the same plumbing.

test_dpop.py's CLIENTID default flips to opentdf-dpop (the DPoP-bound
Keycloak client provisioned alongside the Bearer-only opentdf client).
test_dpop_server_issued_nonce_retry additionally gates on
dpop_nonce_challenge so it skips when only the base DPoP middleware is on.

Scenario DSPX-3397 now declares both features as required so the lane
fails (red bar) rather than skipping while per-repo PRs finish landing.

Also includes a one-line port of the otdf-local platform.py logger fix
(already on DSPX-3382-mlkem-scenarios; the warning block referenced a
missing self.logger attribute and broke `otdf-local up` for any
source-pinned instance).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ution bugs

Four targeted fixes that let `install scenario` actually refresh and rebuild
every artifact in a scenario that uses git-ref pins (branches, PR heads, SHAs).
Together they close the source-build path so the SDK matrix lights up the same
way the platform pin already did.

1. cli_scenario.py — pick install_release vs install_source per entry.
   The SDK loop unconditionally called `install_release(sdk, version)` and
   silently dropped any version that didn't resolve to a published artifact
   (the DSPX-3417 gap). Now mutable refs route through a new `install_source`
   helper that wraps `cmd_tip` and returns the dist path. Released semver
   versions stay on the existing download path. `entry.source` field is
   preserved as metadata for the go-from-platform vs go-from-otdfctl-repo
   disambiguation inside cmd_tip — not used as the routing switch since
   `is_mutable_ref` is the better signal.

2. cli_scenario.py — don't clobber Scenario YAMLs.
   The `source → dist` substitution dumped the inner Instance back over the
   path argument. For a Scenario YAML that wiped sdks/suite/expected/actual
   on every install. Gated the dump on `scenario is None` (only Instance-only
   YAMLs); installed.json already records the dist path for downstream
   tooling.

3. platform_installer.py — `_resolve_platform_ref` left branch names alone.
   `DSPX-3397-platform-service` was being treated as a release version and
   prefixed with `service/v`, producing `service/vDSPX-3397-platform-service`
   which git can't resolve. Only apply the infix when the input parses as
   semver (DSPX-3418 fix); plain branch names now pass through unchanged.

4. checkout.py — FETCH_HEAD across bare/worktree boundary.
   `checkout_go_from_platform` fetched into the bare repo then ran
   `git -C <worktree> checkout --force FETCH_HEAD`, but FETCH_HEAD lives in
   the bare's git dir and isn't visible from the worktree. Mirrors the
   pattern checkout_sdk_branch already uses for non-go SDKs: reset --hard
   to the named ref instead.

Verified end-to-end: `install scenario --skip-scripts
xtest/scenarios/DSPX-3397.yaml` (which mixes go from platform monorepo,
java/js from their own repos, all on branches) now builds 1 platform pin +
6 SDK pins and writes a complete installed.json — previously stopped at
the platform pin with empty SDK arrays.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…phaned worktrees

A Ctrl-C'd `git clone --bare` leaves a directory with `config`/`HEAD`/`objects`
but no `refs/`, and git rejects it as "not a git repository" forever after.
Probe with `rev-parse --is-bare-repository` before trusting an existing bare
clone; rm-rf and re-clone on failure.

Re-cloning the bare wipes per-worktree admin dirs at `<bare>/worktrees/<slug>/`,
which orphans any sibling worktree directories (their `.git` file points to a
missing admin location). `_ensure_worktree` / `_drop_orphaned_worktree` now
probe with `rev-parse --git-dir`; an orphan is rmtree'd and re-added rather
than failing on `git -C <worktree> reset --hard`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ig time

Platform and KAS binaries both run with `cwd=<platform-worktree>`, but
both opentdf-dev.yaml and opentdf-kas-mode.yaml templates use relative
key paths (`kas-private.pem`, `./keys/kas-private.pem`) intended to
resolve under `instances/<id>/keys/`. The previous workaround was a
manual yq edit per instance — brittle, and broken across worktrees
because absolute paths from one tree leaked into the next.

`rewrite_crypto_keys_to_absolute` resolves each entry under
`server.cryptoProvider.standard.keys` against the instance keys dir and
drops entries whose backing file doesn't exist (e.g. PQC keys not
generated in the bootstrap bundle). Called from
`_provision_instance_dir` (platform config) and `_generate_config`
(per-KAS config).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
attribute_single_kas_grant returns a bare Attribute, not the
(Attribute, list[str]) tuple the test annotated and unpacked. The
existing usages only need attr.value_fqns, so collapse the type
annotation and drop the tuple destructure across all 6 test methods.

scenarios/DSPX-3397.yaml: targets are passed verbatim to pytest with
cwd=xtest/, so write 'test_dpop.py' rather than 'xtest/test_dpop.py'
which gets joined to 'xtest/xtest/test_dpop.py' and fails collection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…its imports

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…upported

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-06-feature-orchestrate branch from fb4bb87 to 13424da Compare June 10, 2026 15:20
@github-actions

Copy link
Copy Markdown

X-Test Failure Report

@sonarqubecloud

Copy link
Copy Markdown

@github-actions

Copy link
Copy Markdown

X-Test Failure Report

Replaces the wishful test_dpop_rejects_bearer_scheme_on_dpop_token (which
expected 401 with no DPoP proof, a different scenario) with one that
exercises the actual SDK-drift case: a DPoP-bound token presented under
Bearer scheme with a valid DPoP proof attached.

Current platform behavior is to accept the request (200) and emit a WARN
log per RFC 9449 §7.1. The test asserts both, plus a compliant DPoP-scheme
control. A TODO references DSPX-3573 for flipping to hard rejection once
all SDKs are compliant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants