26 changes: 26 additions & 0 deletions quality/trigger-evals/together-sandbox.json
@@ -0,0 +1,26 @@
[
  {
    "query": "Create isolated sandbox environments from a Docker image and run multi-turn bash commands for an RL training loop on Together AI.",
    "should_trigger": true
  },
  {
    "query": "I need to fan out 32 parallel sandboxes for GRPO reward computation using the together-sandbox SDK.",
    "should_trigger": true
  },
  {
    "query": "Set up a golden image snapshot with dependencies installed, then create ephemeral sandboxes from it for coding agent rollouts.",
    "should_trigger": true
  },
  {
    "query": "Run Python remotely on Together AI with session reuse and generate a matplotlib chart.",
    "should_trigger": false
  },
  {
    "query": "Fine-tune a Llama model on my custom dataset with LoRA on Together AI.",
    "should_trigger": false
  },
  {
    "query": "Provision a multi-node H100 cluster on Together AI and run a distributed training job.",
    "should_trigger": false
  }
]
80 changes: 80 additions & 0 deletions skills/together-sandbox/SKILL.md
@@ -0,0 +1,80 @@
---
name: together-sandbox
description: "Isolated gVisor sandboxes for RL training, SFT data generation, and coding agent rollouts on Together AI. Create snapshots from Docker images, run multi-turn bash commands, read and write files, and manage sandbox lifecycle via the together-sandbox Python SDK. Reach for it whenever the user needs isolated container execution for agent environments, reward computation, or parallel code evaluation rather than managed Python notebooks or raw GPU clusters."
---

# Together Sandbox

## Overview

Use Together Sandbox when the user needs isolated container environments for executing untrusted code, running RL training loops, or orchestrating coding agent rollouts.

Typical fits:

- GRPO/RL training with parallel sandbox-based reward computation
- SFT data generation with verified trajectory collection
- Coding agent rollouts (multi-turn bash execution in isolated environments)
- Batch evaluation of code against test suites (SWE-bench, Terminal-Bench)
- Any workflow requiring Docker-image-based environments with file I/O and command execution

## When This Skill Wins

- The user needs isolated container execution from a Docker image, not just a Python notebook
- Sandboxes must survive multi-turn command sequences (1-10 sequential bash commands)
- The workflow requires parallel sandbox fan-out (8-256+ concurrent environments)
- Files need to be uploaded to or downloaded from the sandbox filesystem
- The user references RL training, GRPO, reward computation, verifier environments, or coding agents

## Hand Off To Another Skill

- Use `together-sandboxes` (plural) for managed Python notebook execution via the Code Interpreter API
- Use `together-gpu-clusters` for multi-node GPU compute or distributed training jobs
- Use `together-dedicated-containers` for custom containerized inference workers
- Use `together-fine-tuning` for model training jobs (LoRA, DPO, full fine-tuning)
- Use `together-chat-completions` if the user only needs inference, not code execution

## Quick Routing

- **Create a sandbox and run commands**
  - Start with [scripts/sandbox_lifecycle.py](scripts/sandbox_lifecycle.py)
- **SDK reference (snapshots, sandboxes, exec, files)**
  - Read [references/api-reference.md](references/api-reference.md)
- **RL training patterns (GRPO batch, reward collection, golden images)**
  - Read [references/rl-patterns.md](references/rl-patterns.md)
- **Parallel fan-out for batch evaluation**
  - Start with [scripts/parallel_fanout.py](scripts/parallel_fanout.py)

## Workflow

1. Install the SDK: `pip install "together-sandbox @ git+https://github.com/togethercomputer/together-sandbox.git#subdirectory=together-sandbox-python"`.
2. Create a snapshot from a Docker image or Dockerfile using `sdk.snapshots.create()`.
3. Create a sandbox from the snapshot using `sdk.sandboxes.create()` with CPU, memory, and disk specs.
4. Start the sandbox using `sdk.sandboxes.start()`, which returns a connected `Sandbox` object.
5. Configure DNS inside the sandbox as the first exec (sandboxes have no DNS by default).
6. Execute commands with `sandbox.execs.exec()` and read/write files with `sandbox.files`.
7. For RL: collect reward files from the sandbox filesystem after test execution.
8. Shut down with `sdk.sandboxes.shutdown(sandbox.id)` or hibernate with `sdk.sandboxes.hibernate(sandbox.id)`.
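
The steps above can be sketched end to end. This is a minimal sketch rather than the SDK's documented usage: it relies only on the calls named in the workflow, and everything else is an assumption. In particular, `sdk` is assumed to be an already-constructed client, `create()` is assumed to accept keyword arguments (passed through untouched because this skill does not name them), `start()` is assumed to accept the `id` of the returned `SandboxModel`, and `run_in_sandbox` is a hypothetical helper name.

```python
# The DNS bootstrap command from the High-Signal Rules below.
DNS_BOOTSTRAP = 'echo "nameserver 1.1.1.1" > /etc/resolv.conf'


async def run_in_sandbox(sdk, commands, **create_kwargs):
    """Create and start a sandbox, set up DNS, run commands in order, shut down."""
    # Assumption: create() takes keyword specs (snapshot, CPU, memory, disk).
    # Their names are not given here, so they are forwarded as-is.
    model = await sdk.sandboxes.create(**create_kwargs)
    # Assumption: start() accepts the id of the SandboxModel from create().
    sandbox = await sdk.sandboxes.start(model.id)
    results = []
    try:
        # DNS setup goes first, then the caller's commands.
        for command in [DNS_BOOTSTRAP, *commands]:
            result = await sandbox.execs.exec("bash", ["-c", command])
            results.append(result)
            if result["exit_code"] != 0:
                break  # stop the multi-turn sequence at the first failure
    finally:
        # Shutdown lives on the SDK namespace, not on the sandbox instance.
        await sdk.sandboxes.shutdown(sandbox.id)
    return results
```

From synchronous code, such a helper would be entered with `asyncio.run(run_in_sandbox(sdk, ["pytest -q"], ...))`. The `try`/`finally` is a design choice so that a failing exec does not leave the sandbox running.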

## High-Signal Rules

- The SDK is async-native. All methods are `async`. Use `asyncio.run()` or an async context.
- The SDK is not yet on PyPI. Install from GitHub: `pip install "together-sandbox @ git+https://github.com/togethercomputer/together-sandbox.git#subdirectory=together-sandbox-python"`.
- Sandboxes have no DNS by default. Run `echo "nameserver 1.1.1.1" > /etc/resolv.conf` as your first exec; otherwise every network call that has to resolve a hostname will fail.
- Tools installed via pip land in `/root/.local/bin`, which is not on PATH. Run `export PATH="/root/.local/bin:$PATH"` before using them.
- `sandbox.execs.exec("bash", ["-c", "your command"])` runs a command to completion and returns `{"exit_code": int, "output": str}`. For shell commands, always wrap with `bash -c`.
- `sdk.sandboxes.create()` returns a `SandboxModel` (metadata). `sdk.sandboxes.start()` returns a connected `Sandbox` with exec and file access. These are two separate steps.
- The SDK handles authentication automatically. Set `TOGETHER_API_KEY` and the two-auth system (management API + in-sandbox Pint API) is abstracted away.
- Ephemeral sandboxes (`ephemeral=True`) auto-delete on stop and cannot hibernate. Use them for disposable training runs.
- For parallel fan-out, use `asyncio.gather()` to create and start multiple sandboxes concurrently.
- Shutdown and hibernate use the SDK namespace, not the sandbox instance: `sdk.sandboxes.shutdown(sandbox.id)`, not `sandbox.shutdown()`.
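
The fan-out rule can be sketched with `asyncio.gather()` plus a semaphore, so a large batch does not open every sandbox at the same instant. Nothing in this sketch is SDK-specific: `fan_out`, `rollout`, and `max_concurrency` are hypothetical names, and `rollout` stands for any coroutine function that creates, uses, and shuts down one sandbox. The concurrency cap is an assumption to tune or drop, not a documented limit.

```python
import asyncio


async def fan_out(rollout, items, max_concurrency=32):
    """Run rollout(item) for every item, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        # Each task waits for a free slot before starting its sandbox.
        async with semaphore:
            return await rollout(item)

    # return_exceptions=True keeps one failed rollout from discarding the
    # rest; a failure comes back as an exception object at the same index.
    return await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
```

Results come back in input order, so reward values can be zipped directly with the prompts or tasks that produced them.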

## Resource Map

- **API reference**: [references/api-reference.md](references/api-reference.md)
- **RL workflow patterns**: [references/rl-patterns.md](references/rl-patterns.md)
- **Sandbox lifecycle script**: [scripts/sandbox_lifecycle.py](scripts/sandbox_lifecycle.py)
- **Parallel fan-out script**: [scripts/parallel_fanout.py](scripts/parallel_fanout.py)

## Official Docs

- [Together Sandbox SDK](https://github.com/togethercomputer/together-sandbox)
4 changes: 4 additions & 0 deletions skills/together-sandbox/agents/openai.yaml
@@ -0,0 +1,4 @@
interface:
  display_name: "Together Sandbox"
  short_description: "Together AI isolated container execution for RL and coding agents"
  default_prompt: "Use $together-sandbox to create isolated gVisor sandboxes from Docker images, execute multi-turn bash commands, and manage files for RL training, agent rollouts, and code evaluation on Together AI."