
feat(xtest): otdf-local multi-instance refactor#452

Open
dmihalcik-virtru wants to merge 17 commits into main from DSPX-3302-03-multi-instance

Conversation

@dmihalcik-virtru

@dmihalcik-virtru dmihalcik-virtru commented May 15, 2026

Member

Summary

Refactors otdf-local from a single-instance CLI to a multi-instance harness. Each named instance under tests/instances/<name>/ owns its own opentdf.yaml, keys, KAS configs, and port range, and references platform binaries managed by otdf-sdk-mgr (PR #451).

Settings: gains instance_name, instance_dir, and instances_root. Per-instance paths activate when instance.yaml exists; legacy behavior is preserved without it.

Ports: KAS ports are computed as instance.ports.base plus a per-KAS offset from the KAS_OFFSETS table, so two instances on different bases can coexist.

Services: PlatformService and KASService use the pinned xtest/platform/dist/<dist>/service binary when an instance is loaded; otherwise the legacy go run ./service path runs unchanged. KAS features (ec_tdf_enabled, etc.) come from instance.yaml.

New CLI surface:

  • Top-level --instance NAME
  • otdf-local instance init <name> [--from-scenario PATH] [--ports-base N] [--platform DIST] — scaffolds directory, auto-generates keys and opentdf.yaml with a fresh root key
  • otdf-local instance ls [--json], otdf-local instance rm <name> -y
  • otdf-local scenario run <path> — translates scenario suite block to pytest args

Other:

  • otdf-local/pyproject.toml declares otdf-sdk-mgr as a uv workspace dependency
  • .gitignore covers /instances/, xtest/scenarios/*.installed.json, .claude/tmp/
  • 5 new unit tests in test_multi_instance.py

Test plan

  • cd otdf-local && uv run pytest tests/ -m 'not integration' — 27 passing
  • uv run otdf-local instance init demo --from-scenario <path> — directory layout correct
  • uv run otdf-local instance ls --json — enumerates instance
  • uv run pyright — 0 errors

Jira: https://virtru.atlassian.net/browse/DSPX-3302

🤖 Generated with Claude Code

Stack (a60d3302):

Generated by wgo stack. Edit text above or below this block, not inside it.

Summary by CodeRabbit

  • New Features

    • Multi-instance CLI: init, list, and remove instances; select active instance via --instance.
    • Run test scenarios against a chosen instance with a new scenario run command.
  • Improvements

    • Per-instance ports, configs and environment are used for services and Docker commands.
    • Platform and KAS honor instance-pinned binaries/configs.
    • Automated local TLS and JKS truststore generation for testing.
  • Tests

    • New tests for scenario→pytest arg translation and multi-instance behavior.
  • Chores

    • Added .gitignore rules and project dependency configuration.

@coderabbitai

coderabbitai Bot commented May 15, 2026


Walkthrough

Implements multi-instance support: Settings load per-instance manifests to derive ports and paths; CLI adds instance and scenario subcommands; services (platform, KAS, docker) and key utilities become instance-aware; tests cover scenario→pytest arg translation and multi-instance smoke checks.

Changes

Multi-instance test harness

| Layer / File(s) | Summary |
|---|---|
| **Dependency and gitignore configuration**<br>otdf-local/pyproject.toml, .gitignore | Adds otdf-sdk-mgr to runtime dependencies with local source mapping and ignores /instances/, xtest/scenarios/*.installed.json, and .claude/tmp/ in VCS. |
| **Port arithmetic with base offset**<br>otdf-local/src/otdf_local/config/ports.py | Adds KAS_OFFSETS and platform_port_for; get_kas_port computes KAS ports as base + offset and all_kas_names() derives names from KAS_OFFSETS. |
| **Settings refactor for multi-instance awareness**<br>otdf-local/src/otdf_local/config/settings.py | Settings optionally loads instance.yaml, exposes tests_root, instances_root, instance_dir, instance_yaml, makes platform_dir nullable, adds has_instance(), load_instance(), get_platform_port(), resolve_binary_worktree(), and routes per-instance logs_dir/keys_dir/config_dir/kas paths; ensures instance kas dir creation. |
| **Root CLI wiring and instance option**<br>otdf-local/src/otdf_local/cli.py | Lazily registers instance and scenario subapps, adds --instance to export selection and clear settings cache, updates platform health/readiness to use instance-aware ports, and guards PLATFORM_DIR/SCHEMA_FILE exports. |
| **Instance management CLI (init, ls, rm)**<br>otdf-local/src/otdf_local/cli_instance.py | New instance Typer app with init (from scenario or minimal), _provision_instance_dir (ensure keys, copy template, inject root_key), _validate_port_uniqueness, ls (JSON/human output), and rm (confirm+remove). Validates instance names and creates kas/, keys/, logs/. |
| **Scenario execution CLI**<br>otdf-local/src/otdf_local/cli_scenario.py | New scenario Typer app with _build_pytest_args and run to load scenarios.yaml, resolve/export instance, build pytest argv (targets, containers, -k/-m, SDK tokens, extra args) and invoke uv run pytest in xtest_root. |
| **Docker service instance environment**<br>otdf-local/src/otdf_local/services/docker.py | Adds _compose_env() copying os.environ and injecting per-instance KEYS_DIR; start, stop, and get_container_status pass this env into subprocess.run. |
| **KAS service instance pinning and config**<br>otdf-local/src/otdf_local/services/kas.py | KASService derives ports from Settings.get_kas_port, prefers per-instance pin mode == "key_management" for KM detection, resolves pinned binary/worktree for config generation, merges per-instance kas_pin.features, runs pinned binaries with proper cwd; KASManager restricts managed names to instance manifest and exposes get_instance_names(). |
| **Platform service instance pinning and config**<br>otdf-local/src/otdf_local/services/platform.py | PlatformService derives port from instance base, resolves instance worktree for config/template selection, patches existing per-instance config with set_nested, installs golden keys into instance key dir, and runs pinned binary with instance cwd or falls back to legacy go run. |
| **Key and certificate generation for TLS and truststore**<br>otdf-local/src/otdf_local/utils/keys.py | Adds generate_localhost_cert() and generate_ca_jks() (PKCS#12 export + keytool in Docker), extends ensure_keys_exist() to require TLS and ca.jks, and writes absolute golden key paths with 0o600 permissions in setup_golden_keys(). |
| **Scenario-to-pytest argument translation tests**<br>otdf-local/tests/test_cli_scenario.py | Unit tests for _build_pytest_args covering empty/multiple targets, containers flag, -k/-m/extra args, and SDK encrypt/decrypt token formatting. |
| **Multi-instance infrastructure smoke tests**<br>otdf-local/tests/test_multi_instance.py | Smoke tests for Ports offset arithmetic, Settings behavior with and without persisted instance.yaml, per-instance logs_dir/keys_dir path resolution, and platform_port_for() behavior. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • opentdf/tests#450: Multi-instance CLI consumes shared otdf_sdk_mgr.schema models for loading Instance/Scenario YAML and translating scenario SDK pins.
  • opentdf/tests#451: Adds scenario install artifacts (*.installed.json) that this PR now ignores and wires via otdf-sdk-mgr.
  • opentdf/tests#427: Related changes to key generation and Docker/keytool handling affecting per-instance key handling.

Suggested reviewers

  • sujankota
  • pflynn-virtru

Poem

🐰 I hopped through instances, one by one,
Ports tucked in burrows, configs all spun.
Keys and certs snug in a tiny den,
Scenarios run, then I hop off again.
🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.58% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(xtest): otdf-local multi-instance refactor' accurately and concisely summarizes the main change: refactoring otdf-local from a single-instance to a multi-instance harness. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements a multi-instance refactor for the otdf-local CLI, enabling the management of isolated test environments. It introduces new instance and scenario subcommands, updates the configuration system to be instance-aware, and integrates with otdf-sdk-mgr for binary management. Service launchers for KAS and the platform now support per-instance port offsets and directory structures. Review feedback highlights a potential TypeError in KAS feature handling and suggests a more direct approach for updating Pydantic model metadata.

Comment thread otdf-local/src/otdf_local/services/kas.py Outdated
Comment thread otdf-local/src/otdf_local/cli_instance.py Outdated

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements a multi-instance architecture for the otdf-local CLI, allowing for the management and execution of isolated test environments. Key updates include new subcommands for instance and scenario handling, offset-based port allocation, and instance-specific directory structures for logs and configurations. Feedback from the review suggests several improvements: adding a null check for KAS features to avoid runtime errors, using Pydantic's model_copy for cleaner metadata updates, adopting shlex.join for safer command display, and adding missing type hints to enhance code maintainability.
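
The null check suggested for KAS features can be sketched as follows. This is a minimal illustration, not the code in kas.py: `merge_kas_features` is a hypothetical helper name, and the only assumption is that the per-instance features value may be `None` when the manifest omits it.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def merge_kas_features(
    defaults: Mapping[str, Any],
    pinned: Mapping[str, Any] | None,
) -> dict[str, Any]:
    """Overlay per-instance features on defaults, tolerating a missing block."""
    merged = dict(defaults)
    # `pinned or {}` avoids the TypeError that dict.update(None) would raise.
    merged.update(pinned or {})
    return merged
```

With this shape, a manifest that sets no features leaves the defaults untouched, while one that sets ec_tdf_enabled overrides only that key.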

Comment thread otdf-local/src/otdf_local/services/kas.py Outdated
Comment thread otdf-local/src/otdf_local/cli_instance.py Outdated
Comment thread otdf-local/src/otdf_local/cli_scenario.py Outdated
Comment thread otdf-local/src/otdf_local/config/settings.py Outdated

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a multi-instance test harness capability, allowing for the management and execution of isolated OpenTDF environments with distinct configurations, port ranges, and platform versions. Key additions include new CLI subcommands for instance management (init, ls, rm) and scenario execution, an instance-aware settings system, and integration with otdf-sdk-mgr to resolve versioned binaries. Feedback identifies a critical issue where the up command still relies on static port constants, which will break health checks for non-default instances. Additionally, improvements were suggested regarding safer dictionary handling for KAS features and more idiomatic use of Pydantic's model_copy.

Comment thread otdf-local/src/otdf_local/cli.py
Comment thread otdf-local/src/otdf_local/cli_instance.py Outdated
Comment thread otdf-local/src/otdf_local/services/kas.py Outdated
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from c6a7895 to ebc0c15 Compare May 15, 2026 16:35
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from c69afd6 to a8ef24a Compare May 15, 2026 16:36
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from ebc0c15 to 14e5c1e Compare May 15, 2026 16:57
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from a8ef24a to 78b2ca6 Compare May 15, 2026 16:58
dmihalcik-virtru added a commit that referenced this pull request May 21, 2026
#450)

## Summary

First PR in a five-part stack that introduces a multi-instance test
harness and a Claude plugin for OpenTDF bug reproduction. This PR adds
*only* the shared Pydantic schema in `otdf-sdk-mgr` — no consumers yet.

- Adds `otdf_sdk_mgr.schema` with v2 models: `Scenario`, `Instance`,
`PlatformPin`, `KasPin`, `SdkPin`, `ScenarioSdks`, `Suite`, etc.
- `ScenarioSdks.encrypt` / `.decrypt` mirror xtest's existing
`--sdks-encrypt` / `--sdks-decrypt` convention so a→b-only scenarios are
first-class.
- `python -m otdf_sdk_mgr.schema validate <path>` validates either a
Scenario or an Instance file based on its `kind:`.
- Adds `pydantic` + `ruamel.yaml` to `otdf-sdk-mgr/pyproject.toml`.
- 6 unit tests covering round-trips, pin invariants, and unknown-field
rejection.

## Stack

1. [**This PR**](#450) — Shared
schema
2. [Platform installer + `install
scenario`](#451) in `otdf-sdk-mgr`
(builds on this)
3. `otdf-local` [multi-instance
refactor](#452) + new CLI
subcommands
4. `xtest/conftest.py`
[integration](#453) (`--scenario`,
`--instance`)
5. [Claude plugin](#454)
(`.claude/skills/`, settings, plugin manifest)
6. #455

## Test plan

- [x] `cd otdf-sdk-mgr && uv run pytest tests/test_schema.py` — all 6
pass
- [x] `uv run python -m otdf_sdk_mgr.schema validate <path>` accepts a
valid scenarios.yaml and rejects unknown fields

Jira: https://virtru.atlassian.net/browse/DSPX-3302

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added schema validation for OpenTDF Scenario and Instance YAML
configurations with a new CLI command.
* Introduced strict validation with cross-field constraints for SDK and
platform configurations.

* **Documentation**
  * Updated supported container formats from `nano` to `ztdf-ecwrap`.

* **Dependencies**
* Updated core package dependencies to support enhanced validation
capabilities.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/opentdf/tests/pull/450?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from 78b2ca6 to e196e43 Compare May 21, 2026 15:38
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from 14e5c1e to 9993b12 Compare May 21, 2026 15:38
@github-actions

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from ec1f655 to 13b5c96 Compare May 22, 2026 01:46
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch 2 times, most recently from 5b1c928 to a1bcecc Compare May 22, 2026 13:50
@github-actions

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from e7d13f5 to 6832d58 Compare May 28, 2026 12:46
@github-actions

X-Test Failure Report

Comment thread otdf-local/src/otdf_local/utils/keys.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
otdf-local/src/otdf_local/cli_instance.py (2)

55-82: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

--force still leaves stale bootstrap assets in place.

Line 65 only bypasses the existence check. Both init paths still flow into _provision_instance_dir() without a force signal, and that helper returns immediately once opentdf.yaml exists while also calling ensure_keys_exist() in non-force mode. Re-running instance init --force can therefore overwrite instance.yaml but keep the old config and key bundle.

Suggested fix
-        _init_from_scenario(name, from_scenario, instance_dir)
+        _init_from_scenario(name, from_scenario, instance_dir, force=force)
@@
-        _init_minimal(name, instance_dir, ports_base, platform_dist)
+        _init_minimal(name, instance_dir, ports_base, platform_dist, force=force)
def _provision_instance_dir(instance_dir: Path, instance: Instance, *, force: bool = False) -> None:
    keys_dir = instance_dir / "keys"
    keys_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    ensure_keys_exist(keys_dir, force=force)

    config_path = instance_dir / "opentdf.yaml"
    if config_path.exists() and not force:
        return

    ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otdf-local/src/otdf_local/cli_instance.py` around lines 55 - 82, The init
command's --force currently only skips the existence check but does not pass
force into provisioning, so stale keys/config remain; update
_provision_instance_dir to accept a force: bool parameter and use it when
calling ensure_keys_exist(keys_dir, force=force) and when deciding whether to
return early on existing opentdf.yaml (i.e., if config_path.exists() and not
force: return), and then update callers _init_from_scenario(...) and
_init_minimal(...) (and the code in cli_instance.init that invokes them) to
forward the force flag so re-running instance init --force fully overwrites keys
and config.

36-85: ⚠️ Potential issue | 🟠 Major

Add the required otdf-local lint/typecheck results (Ruff is OK; pyright output is still missing)

otdf-local passes:

  • ruff check . (“All checks passed!”)
  • ruff format --check . (“30 files already formatted”)

pyright -p . could not be run in the current environment (pyright: command not found), so attach the actual pyright output (or run it via the project’s intended command/tooling so the repo rule is satisfied).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otdf-local/src/otdf_local/cli_instance.py` around lines 36 - 85, Run the
project's type checker (pyright) for the otdf-local package and attach the
actual pyright output to the PR: execute the repository's intended pyright
invocation (e.g., pyright -p . or the project's configured script) from the repo
root, save the full stdout/stderr results, and add those results to the PR (or a
CI artifact) so the linter/typecheck requirement is satisfied; mention in the PR
comment that checks were run alongside the existing ruff results and reference
the init function and helper functions (_init_from_scenario, _init_minimal,
_validate_port_uniqueness) if any type errors point to them.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@otdf-local/src/otdf_local/utils/keys.py`:
- Around line 305-332: The bootstrap key-existence checks and per-key
regeneration guards must treat the Keycloak CA cert as part of the bundle: add a
Path variable for "keycloak-ca.pem" (e.g. ca_cert = key_dir / "keycloak-ca.pem")
and include ca_cert.exists() in the big early-exit condition alongside
rsa_private, rsa_cert, ec_private, ec_cert, localhost_key, localhost_cert, and
ca_jks; also update the per-resource guards so generate_localhost_cert(...) is
run if force is True or localhost_key/localhost_cert OR ca_cert are missing, and
run generate_ca_jks(...) if force is True or ca_jks or ca_cert are missing.
Reference generate_localhost_cert, generate_ca_jks and the existing
rsa/ec/localhost/ca_jks Path variables when making the changes.
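
The guard logic described in this comment could look roughly like the sketch below. It is not the code in keys.py: `missing_files` and `regeneration_plan` are hypothetical helpers, and apart from keycloak-ca.pem and ca.jks the file names are placeholders for the RSA, EC, and localhost key/cert pairs.

```python
from __future__ import annotations

from pathlib import Path

# keycloak-ca.pem and ca.jks are named in this PR; the other file names are
# placeholders chosen for illustration only.
CA_CERT = "keycloak-ca.pem"
CA_JKS = "ca.jks"
LOCALHOST_FILES = ("localhost.key", "localhost.crt")
KAS_FILES = ("rsa-private.pem", "rsa-cert.pem", "ec-private.pem", "ec-cert.pem")
BUNDLE_FILES = (*KAS_FILES, *LOCALHOST_FILES, CA_CERT, CA_JKS)


def missing_files(key_dir: Path) -> set[str]:
    """Return the bundle files that are absent from key_dir."""
    return {name for name in BUNDLE_FILES if not (key_dir / name).exists()}


def regeneration_plan(key_dir: Path, *, force: bool = False) -> dict[str, bool]:
    """Decide which generation steps must run for a (possibly partial) bundle."""
    missing = missing_files(key_dir)
    return {
        # Early exit is only safe when every file, CA cert included, exists.
        "skip_all": not force and not missing,
        # A missing CA cert invalidates both the TLS pair and the truststore.
        "localhost_cert": force or bool(missing & {*LOCALHOST_FILES, CA_CERT}),
        "ca_jks": force or bool(missing & {CA_JKS, CA_CERT}),
    }
```

Treating the CA cert as a dependency of both steps means a half-initialized directory is repaired instead of being mistaken for a complete bundle.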

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 462d694e-3055-4abc-9370-5af355563c17

📥 Commits

Reviewing files that changed from the base of the PR and between 1d05dbe and 200f430.

📒 Files selected for processing (5)
  • otdf-local/src/otdf_local/cli.py
  • otdf-local/src/otdf_local/cli_instance.py
  • otdf-local/src/otdf_local/cli_scenario.py
  • otdf-local/src/otdf_local/services/kas.py
  • otdf-local/src/otdf_local/utils/keys.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • otdf-local/src/otdf_local/cli.py

Comment thread otdf-local/src/otdf_local/utils/keys.py
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from 200f430 to 3ba16e3 Compare June 9, 2026 17:22
@github-actions

X-Test Failure Report

dmihalcik-virtru and others added 17 commits June 10, 2026 14:37
Refactors otdf-local from a single-instance CLI (one platform checkout,
fixed ports, hardcoded six KAS instances) into a multi-instance harness
where each named instance under tests/instances/<name>/ owns its own
opentdf.yaml, keys, KAS configs, and port range.

Why
---

A single bug report often describes a *combination* — platform v0.9.0
with Java SDK 0.7.8 and a KAS at a pre-release. Today a developer has
to hand-edit configs and re-checkout the platform to reproduce. After
this change:

  otdf-local instance init java-078 --from-scenario .../scenario.yaml
  otdf-local --instance java-078 up

brings up exactly the topology the scenario describes, using platform
binaries that otdf-sdk-mgr already provisioned (each instance, and each
KAS within an instance, can reference a different pinned version). Two
instances on disjoint ports.base can coexist on a developer laptop.

What changes
------------

otdf-local now depends on otdf-sdk-mgr via a uv path source so both
tools share the canonical Scenario/Instance schema.

Settings (otdf_local.config.settings):
  - New instance_name (env-overridable via OTDF_LOCAL_INSTANCE_NAME),
    instance_dir, instances_root, instance_yaml properties.
  - platform_dir becomes optional; legacy sibling-discovery only kicks
    in when no per-instance configuration is present.
  - platform_binary_for(dist) resolves to the otdf-sdk-mgr-managed
    xtest/platform/dist/<dist>/service binary.
  - keys_dir, logs_dir, config_dir, platform_config, and
    get_kas_config_path switch to per-instance paths whenever
    instance.yaml exists; legacy behavior is preserved otherwise.
  - load_instance() reads the per-instance manifest via the shared
    Pydantic model.
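
As a rough illustration of the path switching described in the list above, the sketch below resolves keys_dir and logs_dir per instance only once instance.yaml is on disk. `SettingsSketch` is a hypothetical stand-in, not the real Settings class, and `legacy_root` is an assumed name for wherever the legacy layout keeps these directories.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SettingsSketch:
    """Hypothetical stand-in for Settings; property names follow the commit message."""

    instances_root: Path
    legacy_root: Path  # assumption: base directory used when no instance is active
    instance_name: str | None = None

    @property
    def instance_dir(self) -> Path | None:
        if self.instance_name is None:
            return None
        return self.instances_root / self.instance_name

    def has_instance(self) -> bool:
        # Per-instance paths only activate once instance.yaml exists on disk.
        d = self.instance_dir
        return d is not None and (d / "instance.yaml").is_file()

    def _base(self) -> Path:
        d = self.instance_dir
        return d if d is not None and self.has_instance() else self.legacy_root

    @property
    def keys_dir(self) -> Path:
        return self._base() / "keys"

    @property
    def logs_dir(self) -> Path:
        return self._base() / "logs"
```

Checking for the manifest file, not just the instance name, keeps a mistyped --instance value from silently redirecting keys and logs to a directory that was never initialized.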

Ports (otdf_local.config.ports):
  - KAS_OFFSETS exposes the offset table (alpha=+101, beta=+202, ...,
    km2=+606) so multiple instances on different bases get disjoint
    port ranges. The legacy 8080-based constants are preserved as
    defaults.
  - get_kas_port(name, base=...) computes the port relative to base.
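
The port arithmetic above can be sketched as below. Only the three offsets the commit message spells out are included; the elided entries are deliberately left out. `port_ranges_overlap` is a hypothetical helper, and treating the platform port as equal to the base is an assumption made for the overlap check.

```python
# Sketch only: alpha, beta, and km2 are the offsets named in the commit
# message; the real KAS_OFFSETS table has further entries elided there.
KAS_OFFSETS: dict[str, int] = {
    "alpha": 101,
    "beta": 202,
    "km2": 606,
}

DEFAULT_BASE = 8080  # legacy base preserved as the default


def get_kas_port(name: str, base: int = DEFAULT_BASE) -> int:
    """Return the port for a named KAS relative to the instance's base port."""
    try:
        return base + KAS_OFFSETS[name]
    except KeyError:
        raise ValueError(f"unknown KAS name: {name!r}") from None


def port_ranges_overlap(base_a: int, base_b: int) -> bool:
    """Hypothetical check: would two bases hand out the same port twice?"""
    # Assumption: the platform itself listens on the base port.
    ports_a = {base_a} | {get_kas_port(n, base_a) for n in KAS_OFFSETS}
    ports_b = {base_b} | {get_kas_port(n, base_b) for n in KAS_OFFSETS}
    return bool(ports_a & ports_b)
```

Under these assumptions a second instance on base 9080 gets a fully disjoint set of ports, whereas a base of 8181 would collide with the default instance's alpha KAS.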

Services (otdf_local.services.platform / .kas):
  - PlatformService.start() and KASService.start() use the pinned dist
    binary at xtest/platform/dist/<dist>/service when an instance is
    loaded, with cwd set to the recorded worktree so the binary finds
    its embedded resources. Legacy `go run ./service` path runs
    unchanged when no instance is active.
  - KASService.is_key_management defers to the manifest's `mode` field
    instead of the legacy name-based heuristic; per-KAS features (e.g.
    ec_tdf_enabled) pass through to opentdf.yaml.
  - KASManager constructs only the KAS instances listed in
    instance.yaml's kas: map. start_standard / start_km filter on
    is_key_management so subset topologies still work.

utils.keys.setup_golden_keys:
  - Writes key files into the target directory (per-instance keys_dir
    or legacy platform_dir) and uses absolute paths in the generated
    keys_config so the binary finds them regardless of cwd.

CLI:
  - New top-level --instance option threads through every command via
    OTDF_LOCAL_INSTANCE_NAME.
  - New `instance` subcommand group: init [--from-scenario PATH],
    ls --json, rm.
  - New `scenario` subcommand: `run <path>` translates the scenario's
    suite block into `pytest --sdks-encrypt ... --sdks-decrypt ...
    --containers ...` under xtest/ with OTDF_LOCAL_INSTANCE_NAME set.

Tests (otdf-local/tests/test_multi_instance.py):
  - Port arithmetic at default and alternate bases.
  - Settings round-trip with and without an instance.yaml.
  - platform_binary_for resolves under the otdf-sdk-mgr-managed
    xtest/platform/ tree.

.gitignore additions:
  - tests/instances/ (per-instance config and logs)
  - xtest/scenarios/*.installed.json (provisioning records)
  - .claude/tmp/

Backward compatibility:
  - `otdf-local up` with no --instance flag keeps working against a
    sibling platform/ checkout.

Refs: https://virtru.atlassian.net/browse/DSPX-3302

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, `otdf-local instance init` only wrote `instance.yaml`
and empty subdirs. Anyone running a fresh instance had to manually copy
keys from another worktree, run `init-temp-keys.sh` by hand, and copy
`opentdf-dev.yaml` into the instance dir before `up` would succeed —
otherwise Keycloak crash-looped on a missing `truststore.jks`, and
pytest failed with `OT_ROOT_KEY environment variable is not set`.

Changes:
- utils/keys.py: add `generate_localhost_cert()` and `generate_ca_jks()`
  to produce the Keycloak TLS pair + JKS truststore (matches the
  platform's `init-temp-keys.sh`). `generate_ca_jks()` runs `keytool`
  inside the `keycloak/keycloak:25.0` image so a local JDK isn't
  required. `ensure_keys_exist()` now generates the full bootstrap
  bundle, idempotently.
- cli_instance.py: `_init_from_scenario` and `_init_minimal` call a new
  `_provision_instance_dir()` helper that runs `ensure_keys_exist()` and
  copies the platform's `opentdf-dev.yaml` (or `opentdf-example.yaml`)
  into the instance dir, overriding `services.kas.root_key` with a
  freshly generated value so every instance owns its own root key.
- services/platform.py: `_generate_config()` preserves an existing
  per-instance `opentdf.yaml`, only patching logger + golden-key fields
  in place, so the init-time `root_key` survives restarts.
- services/docker.py: docker-compose subprocesses are now run with
  `KEYS_DIR=<instance>/keys` so the compose file's `${KEYS_DIR:-./keys}`
  mounts resolve to the per-instance bundle.
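
The docker-compose environment change can be pictured with the sketch below. It is an approximation of the idea, not the code in docker.py: `compose_env` and its `base_env` parameter are hypothetical, added so the behavior can be exercised without touching the real process environment.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path


def compose_env(
    keys_dir: Path | None,
    base_env: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Copy the environment and point KEYS_DIR at the per-instance bundle."""
    env = dict(os.environ if base_env is None else base_env)
    if keys_dir is not None:
        # The compose file's ${KEYS_DIR:-./keys} mounts pick this value up.
        env["KEYS_DIR"] = str(keys_dir)
    return env
```

The returned mapping would then be passed as `env=` to the `subprocess.run` call that invokes docker compose. Copying the environment, instead of mutating `os.environ`, keeps one instance's KEYS_DIR from leaking into later commands in the same process.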

Users can now run:

  otdf-local instance init <name> --from-scenario path/to/scenario.yaml
  otdf-local --instance <name> up
  eval $(otdf-local --instance <name> env)
  cd xtest && uv run pytest ...

with no manual key-copying, no editing of `opentdf.yaml`, and no
shell-script fallback. Verified end-to-end against `pure-mlkem.yaml`
(PR opentdf/platform#3537): all 9 services come up healthy on the first
try and `env` exports `OT_ROOT_KEY`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…chema

`_build_pytest_args` read `suite.select` and treated `suite.containers`
as a string, but the Pydantic Suite model exposes `targets: list[str]`
and `containers: list[ContainerKind]`. Any user invoking
`otdf-local scenario run` hit AttributeError. Also wires `suite.kexpr`
through as `-k`; it was silently dropped.

Adds unit tests covering empty/multi targets, container join, kexpr,
markers + extra args, and SDK token forwarding.
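
A simplified picture of the corrected translation is sketched below. `SuiteSketch` is a hypothetical stand-in for the Pydantic Suite model; `targets`, `containers`, and `kexpr` come from this commit message, while `markers`, `extra_args`, and the space-joined value format are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SuiteSketch:
    """Hypothetical stand-in for the Pydantic Suite model."""

    targets: list[str] = field(default_factory=list)
    containers: list[str] = field(default_factory=list)
    kexpr: str | None = None
    markers: str | None = None  # hypothetical field name for the -m expression
    extra_args: list[str] = field(default_factory=list)  # hypothetical field name


def build_pytest_args(
    suite: SuiteSketch,
    sdks_encrypt: list[str],
    sdks_decrypt: list[str],
) -> list[str]:
    """Translate a suite block into pytest arguments (without the pytest prefix)."""
    args: list[str] = list(suite.targets)
    # Assumption: multi-valued options are passed as one space-joined value.
    if sdks_encrypt:
        args += ["--sdks-encrypt", " ".join(sdks_encrypt)]
    if sdks_decrypt:
        args += ["--sdks-decrypt", " ".join(sdks_decrypt)]
    if suite.containers:
        args += ["--containers", " ".join(suite.containers)]
    if suite.kexpr:
        args += ["-k", suite.kexpr]
    if suite.markers:
        args += ["-m", suite.markers]
    args += suite.extra_args
    return args
```

Building a list of arguments, one element per argv entry, avoids any shell quoting concerns when the list is handed to `subprocess.run`.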

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…leanup

- `up` command now uses `settings.get_platform_port()` and iterates
  `kas_manager._instances` with `settings.get_kas_port()` for health checks
  so non-default instances with a different `ports.base` work correctly
- Add `Settings.get_platform_port()` alongside the existing `get_kas_port()`
- Simplify metadata name update: `instance.metadata.name = name` (frozen=False)
- Use `shlex.join(cmd)` for display in cli_scenario.py
- Add `"Instance | None"` return type to `load_instance` via TYPE_CHECKING
- Drop unused `Path` import in cli.py, stale `os` import in test file
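
The reason for preferring `shlex.join` when displaying a command is easy to show in isolation. The argument values below are made up for the example; only the two standard-library calls are real.

```python
import shlex

# Illustrative argv; the -k expression and container list contain spaces.
cmd = ["uv", "run", "pytest", "-k", "rsa and not legacy", "--containers", "ztdf ztdf-ecwrap"]

naive = " ".join(cmd)   # loses the argument boundaries
safe = shlex.join(cmd)  # quotes arguments so the line can be pasted into a shell

print(naive)
print(safe)
```

Splitting the `shlex.join` output with `shlex.split` yields the original list again, which is not true of the plain space-joined string.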

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guard platform_dir None-access in env command; replace non-existent
PlatformPin.image attribute with "unknown" fallback in ls command.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- cli_scenario: set OTDF_LOCAL_INSTANCE_NAME + clear settings cache before
  get_settings() so scenario-driven instance name is picked up
- cli_instance: add _validate_instance_name() to guard against path traversal
  in init/rm; add --force flag to init to prevent silent overwrite
- kas: add get_instance_names() public method; replace _instances access in cli
- keys: generate_ca_jks() now imports cert only (keytool -importcert) so ca.jks
  is a proper truststore; ensure_keys_exist() guards include cert files alongside
  private keys to catch partial-init broken state
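
One way to guard instance names against path traversal is sketched below. This is not the actual `_validate_instance_name()`: the allowed character set and both function names are assumptions chosen for illustration.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumption: names start with an alphanumeric and contain no separators.
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def validate_instance_name(name: str) -> str:
    """Reject names that are empty or could escape the instances directory."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid instance name: {name!r}")
    return name


def instance_dir_for(instances_root: Path, name: str) -> Path:
    """Resolve the instance directory and confirm it sits directly under the root."""
    root = instances_root.resolve()
    candidate = (root / validate_instance_name(name)).resolve()
    if candidate.parent != root:
        raise ValueError(f"instance path escapes {root}: {candidate}")
    return candidate
```

The regular expression already rejects `..` and slashes; the resolved-parent check is a second layer that also covers symlinked entries, which matters most for `rm`.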

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reverts the keytool -importcert change from the previous commit.
The PKCS12 + importkeystore approach mirrors init-temp-keys.sh in the
platform repo exactly (lines 65-90); Keycloak requires this form of
ca.jks and the cert-only truststore broke it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from 6d83353 to b441b38 Compare June 10, 2026 18:37
@github-actions

X-Test Failure Report

@sonarqubecloud

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
otdf-local/src/otdf_local/config/ports.py (1)

1-60: ⚠️ Potential issue | 🔴 Critical

Run the required Python quality gates for otdf-local (pyright is missing)
ruff check and ruff format --check passed for otdf-local/, but pyright otdf-local did not run because pyright is not found (/bin/bash: pyright: command not found). Ensure pyright is installed/available and rerun the quality gates before committing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otdf-local/src/otdf_local/config/ports.py` around lines 1 - 60, CI failed
because the Pyright type checker is not installed/runnable, so update the repo
so `pyright` is available and the quality gate runs; install Pyright as a
project/tooling dependency (e.g., add to repo dev dependencies or install via
npm/yarn in the CI image) or ensure the CI runner has Pyright on PATH, then
re-run the type checks (verify it covers otdf_local.config.ports.Ports and its
methods like get_kas_port, platform_port_for, all_kas_names, standard_kas_names,
km_kas_names, is_km_kas) and fix any type errors reported.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@otdf-local/src/otdf_local/config/ports.py`:
- Around line 30-35: The get_kas_port function lets any integer base be passed
so base + offset can fall outside valid TCP port range; update get_kas_port (and
use KAS_OFFSETS) to validate that base is an int within 1..65535 and that
computed_port = base + offset is also within 1..65535 and raise a ValueError
with a clear message showing the invalid base or computed_port when out of
range; perform these checks before returning the port so callers fail fast with
an informative error.

In `@otdf-local/src/otdf_local/services/kas.py`:
- Around line 60-68: KASService._instance_paths currently only handles
KasPin.dist and returns None for source-pinned KAS; update _instance_paths to
also check KasPin.source and resolve it the same way platform/source pins are
handled in cli_instance.py: if pin.dist use
self.settings.resolve_binary_worktree(pin.dist), else if pin.source resolve the
pinned paths via the same resolver used for platform/source pins (the code path
in cli_instance.py), returning the resolved (binary, worktree) tuple instead of
None. Ensure you reference KasPin.source and KasPin.dist inside
KASService._instance_paths and call the appropriate settings resolver
consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ee88be39-2115-41a7-9e50-e11592cdec3f

📥 Commits

Reviewing files that changed from the base of the PR and between 200f430 and b441b38.

⛔ Files ignored due to path filters (1)
  • otdf-local/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • .gitignore
  • otdf-local/pyproject.toml
  • otdf-local/src/otdf_local/cli.py
  • otdf-local/src/otdf_local/cli_instance.py
  • otdf-local/src/otdf_local/cli_scenario.py
  • otdf-local/src/otdf_local/config/ports.py
  • otdf-local/src/otdf_local/config/settings.py
  • otdf-local/src/otdf_local/services/docker.py
  • otdf-local/src/otdf_local/services/kas.py
  • otdf-local/src/otdf_local/services/platform.py
  • otdf-local/src/otdf_local/utils/keys.py
  • otdf-local/tests/test_cli_scenario.py
  • otdf-local/tests/test_multi_instance.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (8)
  • otdf-local/src/otdf_local/services/docker.py
  • otdf-local/src/otdf_local/cli_scenario.py
  • otdf-local/src/otdf_local/services/platform.py
  • otdf-local/tests/test_cli_scenario.py
  • otdf-local/pyproject.toml
  • otdf-local/src/otdf_local/cli.py
  • otdf-local/src/otdf_local/utils/keys.py
  • otdf-local/src/otdf_local/config/settings.py

Comment on lines +30 to +35
 def get_kas_port(cls, name: str, *, base: int = 8080) -> int:
     offset = cls.KAS_OFFSETS.get(name)
     if offset is None:
         raise ValueError(f"Unknown KAS instance: {name}")
-    return getattr(cls, attr)
+    return base + offset

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate base and computed KAS ports are within valid TCP range.

get_kas_port accepts any integer base, so base + offset can become <1 or >65535, which pushes failure to later service startup instead of failing fast here.

Proposed fix
 @classmethod
 def get_kas_port(cls, name: str, *, base: int = 8080) -> int:
+    if not (1 <= base <= 65535):
+        raise ValueError(f"Invalid base port: {base}")
     offset = cls.KAS_OFFSETS.get(name)
     if offset is None:
         raise ValueError(f"Unknown KAS instance: {name}")
-    return base + offset
+    port = base + offset
+    if not (1 <= port <= 65535):
+        raise ValueError(f"Computed port out of range for {name}: {port}")
+    return port
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otdf-local/src/otdf_local/config/ports.py` around lines 30 - 35, The
get_kas_port function lets any integer base be passed so base + offset can fall
outside valid TCP port range; update get_kas_port (and use KAS_OFFSETS) to
validate that base is an int within 1..65535 and that computed_port = base +
offset is also within 1..65535 and raise a ValueError with a clear message
showing the invalid base or computed_port when out of range; perform these
checks before returning the port so callers fail fast with an informative error.
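
For reference, a self-contained version of the suggested range check might look like the sketch below. The `Ports` class here is a stand-in, and the offsets in `KAS_OFFSETS` are invented for illustration; the real table in ports.py is not shown in this conversation.

```python
class Ports:
    """Stand-in for otdf_local.config.ports.Ports, reduced to one method."""

    # Hypothetical offsets; the real KAS_OFFSETS values may differ.
    KAS_OFFSETS = {"alpha": 101, "beta": 202}

    @classmethod
    def get_kas_port(cls, name: str, *, base: int = 8080) -> int:
        # Reject an unusable base before doing any arithmetic with it.
        if not (1 <= base <= 65535):
            raise ValueError(f"Invalid base port: {base}")
        offset = cls.KAS_OFFSETS.get(name)
        if offset is None:
            raise ValueError(f"Unknown KAS instance: {name}")
        port = base + offset
        # A valid base can still overflow the TCP range once the offset is added.
        if not (1 <= port <= 65535):
            raise ValueError(f"Computed port out of range for {name}: {port}")
        return port
```

Under these assumed offsets, `Ports.get_kas_port("alpha")` yields 8181, while `Ports.get_kas_port("beta", base=65500)` raises at lookup time instead of failing later when the service tries to bind.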

Comment on lines +60 to +68
def _instance_paths(self) -> tuple[Path, Path] | None:
    """Return (binary, worktree) for an instance-pinned KAS, or None."""
    instance = self.settings.load_instance()
    if instance is None:
        return None
    pin = instance.kas.get(self._kas_name)
    if pin is None or pin.dist is None:
        return None
    return self.settings.resolve_binary_worktree(pin.dist)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check KasPin schema definition to see if it supports source field
ast-grep --pattern $'class KasPin($_):
  $$$
'
# Also check if there's a source field in KasPin or its parents
rg -n "class KasPin" -A 30

Repository: opentdf/tests

Length of output: 4541


Handle KasPin.source in KASService._instance_paths
KasPin defines both dist and source and enforces that exactly one of them is set. But otdf-local/src/otdf_local/services/kas.py (lines 60-68) only resolves when pin.dist is present, returning None when pin.dist is None, so source-pinned KAS instances can’t resolve their pinned binary/worktree. Update _instance_paths to also resolve via pin.source (similar to how cli_instance.py handles platform pins).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otdf-local/src/otdf_local/services/kas.py` around lines 60 - 68,
KASService._instance_paths currently only handles KasPin.dist and returns None
for source-pinned KAS; update _instance_paths to also check KasPin.source and
resolve it the same way platform/source pins are handled in cli_instance.py: if
pin.dist use self.settings.resolve_binary_worktree(pin.dist), else if pin.source
resolve the pinned paths via the same resolver used for platform/source pins
(the code path in cli_instance.py), returning the resolved (binary, worktree)
tuple instead of None. Ensure you reference KasPin.source and KasPin.dist inside
KASService._instance_paths and call the appropriate settings resolver
consistently.
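
One possible shape for the dist-or-source dispatch is sketched below. Everything here is a simplified stand-in: the `KasPin` dataclass only mimics the "exactly one of dist/source" rule described above, both fields are assumed to be plain strings, and the two resolver callables are hypothetical placeholders for whatever `Settings` and cli_instance.py actually provide.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional, Tuple

Paths = Tuple[Path, Path]  # (binary, worktree)


@dataclass(frozen=True)
class KasPin:
    """Stand-in schema: exactly one of dist or source must be set."""

    dist: Optional[str] = None
    source: Optional[str] = None

    def __post_init__(self) -> None:
        if (self.dist is None) == (self.source is None):
            raise ValueError("exactly one of dist or source must be set")


def instance_paths(
    pin: Optional[KasPin],
    resolve_dist: Callable[[str], Paths],    # hypothetical dist resolver
    resolve_source: Callable[[str], Paths],  # hypothetical source resolver
) -> Optional[Paths]:
    """Return (binary, worktree) for a pinned KAS, or None when unpinned."""
    if pin is None:
        return None
    if pin.dist is not None:
        return resolve_dist(pin.dist)
    # KasPin guarantees source is set whenever dist is not.
    assert pin.source is not None
    return resolve_source(pin.source)
```

Passing the resolvers in as callables keeps the sketch independent of the real `Settings` API; in the actual service they would presumably be methods on `self.settings`.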
