Add together-sandbox skill #16

Open
necoline wants to merge 3 commits into main from add-together-sandbox-skill

necoline commented May 26, 2026

Summary

  • Adds a new together-sandbox skill (singular) for the together-sandbox Python SDK, which provides isolated sandbox environments for RL training, SFT data generation, and coding agent rollouts
  • Distinct from the existing together-sandboxes (plural) skill, which covers the Code Interpreter API for managed Python notebook execution
  • Grounded in the GRPO RL workload profile: snapshot creation, batch sandbox fan-out, multi-turn exec, reward file collection, and lifecycle cleanup
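The workload profile in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDK's API: `StubSandbox`, `start_from_snapshot`, `run_rollout`, `fan_out`, and the `/tmp/reward.txt` path are hypothetical stand-ins so the example runs without credentials. Only the overall shape (one snapshot, many sandboxes, multi-turn exec, reward file read, cleanup) comes from the summary above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class StubSandbox:
    """Hypothetical stand-in for a sandbox handle; not the real SDK class."""
    sandbox_id: str
    files: dict = field(default_factory=dict)
    closed: bool = False

    def exec(self, command: str) -> dict:
        # Pretend the command ran and wrote a reward file as a side effect.
        self.files["/tmp/reward.txt"] = "1.0" if "pass" in command else "0.0"
        return {"exit_code": 0, "output": f"ran: {command}"}

    def read_file(self, path: str) -> str:
        return self.files[path]

    def shutdown(self) -> None:
        self.closed = True


def start_from_snapshot(snapshot_id: str, index: int) -> StubSandbox:
    """Hypothetical factory: one sandbox per rollout, all from one snapshot."""
    return StubSandbox(sandbox_id=f"{snapshot_id}-{index}")


def run_rollout(sandbox: StubSandbox, turns: list[str]) -> float:
    """Run each turn in order, read the reward file, and always clean up."""
    try:
        for command in turns:
            result = sandbox.exec(command)
            if result["exit_code"] != 0:
                return 0.0  # treat a failed turn as zero reward
        return float(sandbox.read_file("/tmp/reward.txt"))
    finally:
        sandbox.shutdown()  # lifecycle cleanup runs even if a turn raises


def fan_out(snapshot_id: str, rollouts: list[list[str]], max_workers: int = 4) -> list[float]:
    """Create one sandbox per rollout and execute the rollouts in parallel."""
    sandboxes = [start_from_snapshot(snapshot_id, i) for i in range(len(rollouts))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so rewards line up with rollouts.
        return list(pool.map(run_rollout, sandboxes, rollouts))
```

Threads are used here only because the stub is synchronous; whether the real SDK is sync or async is not stated in this summary.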

Files added

| File | Purpose |
| --- | --- |
| skills/together-sandbox/SKILL.md | Skill definition with routing, workflow, and high-signal rules |
| skills/together-sandbox/agents/openai.yaml | UI metadata for OpenAI/Codex surfaces |
| skills/together-sandbox/references/api-reference.md | Full SDK reference (snapshots, sandboxes, exec, files, lifecycle) |
| skills/together-sandbox/references/rl-patterns.md | GRPO training patterns (golden image, batch fan-out, rollouts, reward collection) |
| skills/together-sandbox/scripts/sandbox_lifecycle.py | End-to-end lifecycle demo (create, exec, files, shutdown) |
| skills/together-sandbox/scripts/parallel_fanout.py | Batch creation and parallel execution demo |
| quality/trigger-evals/together-sandbox.json | 6 trigger eval cases (3 positive, 3 negative) |

Test plan

  • python3 scripts/quick_validate.py skills/together-sandbox passes
  • python3 scripts/quality_check.py passes
  • Scripts run successfully against live Sandbox API (blocked on API key provisioning)
  • Run ./scripts/publish.sh to regenerate AGENTS.md and README.md (left for maintainer)

🤖 Generated with Claude Code

necoline and others added 2 commits May 26, 2026 17:30
New skill covering the together-sandbox Python SDK for isolated container
environments used in RL training, SFT data generation, and coding agent
rollouts. Distinct from together-sandboxes (Code Interpreter API).

Includes:
- SKILL.md with routing, workflow, and high-signal rules
- references/api-reference.md (full SDK surface)
- references/rl-patterns.md (GRPO training patterns)
- scripts/sandbox_lifecycle.py (create, exec, files, shutdown)
- scripts/parallel_fanout.py (batch creation for RL)
- agents/openai.yaml
- trigger-evals (3 positive, 3 negative)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Verified all claims against togethercomputer/together-sandbox source code.

Fixes:
- execs.exec() does not exist; replaced with execs.create() + execs.get()
  polling pattern throughout all files
- Added run_exec() helper to wrap the two-step create+poll flow
- HttpError does not exist; replaced with RuntimeError
- autostart parameter is actually autorun
- user parameter is actually uid/gid (int, not str)
- get_output() returns list[ExecStdout] with .output/.exit_code attributes,
  not a dict with string keys
- Snapshot model has no alias field; documented this limitation
- SSE stream uses camelCase (exitCode) not snake_case
- Documented ExecItem and ExecStdout model fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
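
For readers unfamiliar with the two-step flow this commit describes, a create-then-poll helper might look roughly like the sketch below. Everything other than the names `run_exec`, `create`, and `get` is an assumption: `StubExecs`, the `status` field and its values, the argument lists, and the timeout parameters are hypothetical and are not taken from the SDK.

```python
import time


class StubExecs:
    """Hypothetical stand-in for the exec API; not the real SDK client."""

    def __init__(self, polls_until_done: int = 2):
        self._remaining = {}
        self._polls_until_done = polls_until_done

    def create(self, command: str) -> str:
        # Register a new exec and return its (made-up) identifier.
        exec_id = f"exec-{len(self._remaining)}"
        self._remaining[exec_id] = self._polls_until_done
        return exec_id

    def get(self, exec_id: str) -> dict:
        # Report "running" for the first few polls, then "finished".
        if self._remaining[exec_id] > 0:
            self._remaining[exec_id] -= 1
            return {"status": "running", "exit_code": None}
        return {"status": "finished", "exit_code": 0}


def run_exec(execs, command: str, timeout_s: float = 5.0, interval_s: float = 0.01) -> dict:
    """Start a command, then poll until it leaves the running state."""
    exec_id = execs.create(command)
    deadline = time.monotonic() + timeout_s
    while True:
        state = execs.get(exec_id)
        if state["status"] != "running":
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{exec_id} still running after {timeout_s}s")
        time.sleep(interval_s)
```

The explicit deadline matters in a polling loop like this: without it, a command that never finishes would block the caller indefinitely.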
necoline changed the title from "Add together-sandbox skill for gVisor container execution" to "Add together-sandbox skill" on May 26, 2026
Ran both scripts against live Sandbox API — both pass end-to-end.

Corrections from live testing:
- execs.exec() DOES exist (convenience method wrapping create+stream)
  Returns {"exit_code": int, "output": str} — reverted all files
- autostart is correct (not autorun)
- user is str ("1000:1000"), not uid/gid ints
- sandbox.shutdown() is a classmethod — use sdk.sandboxes.shutdown(id)
- sandbox.hibernate() is a classmethod — use sdk.sandboxes.hibernate(id)
- Removed run_exec() helper (unnecessary since exec() exists)

Verified against live API:
- sandbox_lifecycle.py: snapshot create, sandbox start, DNS config,
  exec, file write/read, directory list, shutdown, snapshot delete
- parallel_fanout.py: 4 concurrent sandboxes, parallel exec, cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
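
The corrections in this commit suggest a calling pattern along these lines: check `exit_code` on the dict that `exec()` returns, and shut the sandbox down by id through the `sandboxes` collection. The sketch below uses stub classes so it runs offline. `StubSDK`, `StubSandboxes.start`, `run_checked`, the placement of `execs` on the SDK object, and all argument lists are assumptions for illustration; only the result shape and the `sdk.sandboxes.shutdown(id)` form come from the commit message.

```python
class StubSandboxes:
    """Hypothetical stand-in for sdk.sandboxes; not the real SDK class."""

    def __init__(self):
        self.active = set()
        self._counter = 0

    def start(self, snapshot_id: str) -> str:
        # Made-up start call that returns a sandbox id.
        self._counter += 1
        sandbox_id = f"sbx-{self._counter}"
        self.active.add(sandbox_id)
        return sandbox_id

    def shutdown(self, sandbox_id: str) -> None:
        # Shutdown is addressed by id, mirroring sdk.sandboxes.shutdown(id).
        self.active.discard(sandbox_id)


class StubExecs:
    """Hypothetical exec stub returning the result shape reported above."""

    def exec(self, sandbox_id: str, command: str) -> dict:
        failed = command.startswith("false")
        return {"exit_code": 1 if failed else 0, "output": "" if failed else command}


class StubSDK:
    def __init__(self):
        self.sandboxes = StubSandboxes()
        self.execs = StubExecs()


def run_checked(sdk, snapshot_id: str, command: str) -> str:
    """Run one command and return its output, shutting down in all cases."""
    sandbox_id = sdk.sandboxes.start(snapshot_id)
    try:
        result = sdk.execs.exec(sandbox_id, command)
        if result["exit_code"] != 0:
            raise RuntimeError(f"command failed with exit code {result['exit_code']}")
        return result["output"]
    finally:
        # Cleanup goes through the collection, not the sandbox instance.
        sdk.sandboxes.shutdown(sandbox_id)
```

Putting the shutdown in `finally` keeps a failed command from leaking a sandbox, which matters more as the fan-out width grows.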