Bump harnesses verifiers minimum #1552

Open: xeophon wants to merge 8 commits into main from harnesses-verifiers-dev158
xeophon (Member) commented Jun 5, 2026

Replacement for closed #1545.

Stacked on #1544 by branch ancestry, while still targeting main: harnesses-verifiers-dev158 is one commit on top of harnesses-0.1.2-main (50bf7c27).

This bumps the standalone harnesses package dependency from verifiers>=0.1.15.dev11 to verifiers>=0.1.15.dev158, so harnesses==0.1.2 only installs with a verifiers build that includes the Harness.load_program_config hook used by the versioned harness config changes.

Addresses #1544 (comment).

Verification:

  • UV_FROZEN=1 uv build --out-dir /tmp/harnesses-verifiers-dev158-restacked packages/harnesses
  • Confirmed wheel metadata contains Requires-Dist: verifiers>=0.1.15.dev158
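The metadata check above can be reproduced with the standard library alone. The following is a minimal sketch, not the command the author ran: the helper name `wheel_requires_dist` is hypothetical, and it assumes the built wheel follows the usual layout with a single `*.dist-info/METADATA` file.

```python
import email
import zipfile


def wheel_requires_dist(wheel_path):
    """Return the Requires-Dist entries from a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel stores its core metadata at <name>-<version>.dist-info/METADATA.
        metadata_names = [
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        ]
        if len(metadata_names) != 1:
            raise ValueError(f"expected one METADATA file, found {metadata_names}")
        raw = wheel.read(metadata_names[0]).decode("utf-8")
    # Core metadata uses RFC 822-style headers, so the email parser can read it.
    message = email.message_from_string(raw)
    return message.get_all("Requires-Dist") or []
```

Calling it on the wheel produced in the `--out-dir` directory should return a list that includes `verifiers>=0.1.15.dev158` if the bump took effect.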

Note

Medium Risk
Breaking config renames (package/release/rlm_* fields) affect eval TOML and Python callers; dependency floor ties releases to verifiers dev158+.

Overview
Raises harnesses’ minimum verifiers dependency to >=0.1.15.dev158 and bumps environment packages to harnesses>=0.1.2, so harnesses==0.1.2 only installs against a verifiers build that includes Harness.load_program_config.

Harness.__init__ now resolves programs via load_program_config instead of calling config.program.resolve() directly; command harnesses (OpenCode, Pi, MiniSWEAgent, Terminus2) override it to pass a top-level version into program.resolve(version=...). Per-program package / release fields are removed in favor of HarnessConfig.version; docs show [eval.harness] with id + version.
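The hook pattern described above can be sketched roughly as follows. This is an illustration only, not the verifiers source: the classes are simplified stand-ins, the `load_program_config` signature, the `install_script` key, and the `CommandHarness` name are assumptions, and only the names `Harness`, `HarnessConfig.version`, `load_program_config`, and `program.resolve(version=...)` come from the summary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProgramConfig:
    """Stand-in for a per-program config (hypothetical fields)."""
    name: str = "opencode"

    def resolve(self, version: Optional[str] = None) -> dict:
        # Hypothetical resolution: build an install command for the version.
        target = version or "latest"
        return {"name": self.name, "install_script": f"install {self.name}@{target}"}


@dataclass
class HarnessConfig:
    program: ProgramConfig = field(default_factory=ProgramConfig)
    version: Optional[str] = None  # top-level version, replacing per-program fields


class Harness:
    def __init__(self, config: HarnessConfig):
        self.config = config
        # The base class goes through the hook instead of calling
        # config.program.resolve() directly.
        self.program_config = self.load_program_config(config)

    def load_program_config(self, config: HarnessConfig) -> dict:
        # Default behavior: resolve without any version information.
        return config.program.resolve()


class CommandHarness(Harness):
    """Stand-in for a command harness such as OpenCode or Pi."""

    def load_program_config(self, config: HarnessConfig) -> dict:
        # Override: pass the harness-level version into program resolution.
        return config.program.resolve(version=config.version)
```

With these stand-ins, `CommandHarness(HarnessConfig(version="1.2.3")).program_config` carries the versioned install script, while the base `Harness` ignores the version. Keeping the override in one method lets each harness decide how the version is used without touching `__init__`.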

RLMProgramConfig and rlm_swe_v1 rename rlm_* knobs (rlm_tools → tools, rlm_repo_url → repo_url, etc.). Tests assert harness.program_config for resolved install scripts.

Reviewed by Cursor Bugbot for commit 2dcd264.

Note

Move version configuration to harness-level in verifiers and harnesses packages

  • Moves version/release fields from program config dataclasses (MiniSWEAgentProgramConfig, OpenCodeProgramConfig, PiProgramConfig, Terminus2ProgramConfig) to their parent harness config dataclasses as a top-level version field.
  • Adds load_program_config as an extension point on the base Harness class so subclasses can inject the harness-level version during program resolution.
  • Renames rlm_*-prefixed fields in RLMProgramConfig and RlmSweProgramConfig (e.g. rlm_tools → tools, rlm_exec_timeout → exec_timeout) and updates utility functions in rlm_utils.py to match.
  • Bumps the verifiers minimum from >=0.1.15.dev11 to >=0.1.15.dev158 and harnesses minimum from >=0.1.1 to >=0.1.2 across environment packages.
  • Risk: Breaking config change — existing configs using program.release, program.package, program.harbor_package, or rlm_tools/rlm_exec_timeout field names will no longer be valid.
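For callers holding old program-level settings as plain dicts, a migration pass along these lines may help. It is a sketch under stated assumptions: the helper `migrate_program_config` is hypothetical, the rename table covers only the three renames named in this PR (the summaries say "e.g." and "etc.", so the real list is longer), and it lumps RLM renames and removed command-harness fields into one function for brevity. It does not guess how an old `release` or `package` value maps onto the new `version`.

```python
from typing import Dict, List, Tuple

# Renames named in this PR; the list is known to be partial.
KNOWN_RENAMES = {
    "rlm_tools": "tools",
    "rlm_repo_url": "repo_url",
    "rlm_exec_timeout": "exec_timeout",
}
# Per-program fields removed in favor of a harness-level `version`.
REMOVED_PROGRAM_FIELDS = {"release", "package", "harbor_package"}


def migrate_program_config(program: Dict) -> Tuple[Dict, List[str]]:
    """Return a migrated copy of `program` plus human-readable notes."""
    migrated = {}
    notes = []
    for key, value in program.items():
        if key in REMOVED_PROGRAM_FIELDS:
            # Dropped rather than translated: the mapping to `version` is not
            # specified here, so it is left to the caller.
            notes.append(f"dropped '{key}': set the harness-level 'version' instead")
        elif key in KNOWN_RENAMES:
            new_key = KNOWN_RENAMES[key]
            if new_key in program:
                raise ValueError(f"both '{key}' and '{new_key}' are set")
            migrated[new_key] = value
            notes.append(f"renamed '{key}' to '{new_key}'")
        else:
            migrated[key] = value
    return migrated, notes
```

Running it over an existing config and reviewing the notes gives a quick inventory of what needs manual attention before upgrading to harnesses>=0.1.2.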

Macroscope summarized 2dcd264.

Comment thread verifiers/envs/experimental/composable/harnesses/opencode.py Outdated
macroscopeapp (Bot) commented Jun 5, 2026

Approvability

Verdict: Needs human review

This PR restructures the configuration API across multiple harness implementations, moving version specification from ProgramConfig to HarnessConfig and introducing a new load_program_config method pattern. The cross-cutting nature of these API changes to shared infrastructure warrants human review.

xeophon force-pushed the harnesses-verifiers-dev158 branch from 224ebdc to 2dcd264 on June 5, 2026 10:34