fix(agent): render OpenAI tool-call arguments as a mapping for chat templates by EazyReal · Pull Request #2063 · THUDM/slime

EazyReal · 2026-06-12T04:21:42Z

Problem

In the OpenAI HTTP adapter (slime/agent/adapters/openai.py), _normalize_tool_call builds the assistant message that _translate_chat_messages collects into chain.chat_messages, which render_token_ids then feeds to tokenizer.apply_chat_template. That message is the render boundary. Until now, _normalize_tool_call serialized tool_calls[].function.arguments to a JSON string via json_arguments. Qwen-family chat templates iterate tool_call.arguments.items(), which expects a mapping. Given a string, they either iterate it character by character or raise a template error, so the rendered prompt diverges from the tokens the policy actually generated. In a token-capturing rollout that is a rollout/train token desync: the policy is trained on tokens it never sampled.

Before vs After

Same input — an assistant turn whose tool call carries arguments as the JSON string echoed on the OpenAI wire:

openai._normalize_tool_call(
    {"function": {"name": "lookup", "arguments": '{"q": "slime"}'}}
)

Before — arguments stays a JSON string, so the rendered assistant message is:

{"type": "function",
 "function": {"name": "lookup", "arguments": '{"q": "slime"}'}}   # str

The Qwen template's arguments.items() then runs on a string → AttributeError: 'str' object has no attribute 'items' (or, on templates that tolerate it, per-character iteration that emits a garbled tool block). Either way the rendered prompt no longer matches the sampled tokens.

After — the wire string is decoded back to the mapping the template expects:

{"type": "function",
 "function": {"name": "lookup", "arguments": {"q": "slime"}}}     # dict

arguments.items() now yields [("q", "slime")] and the tool call renders correctly.
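The contrast between the two cases can be reproduced without a tokenizer. The following is a minimal sketch in plain Python, assuming a hypothetical `render_tool_call` function that stands in for a template loop over `arguments.items()`; the tag layout is illustrative only and is not the real Qwen template.

```python
def render_tool_call(name, arguments):
    """Hypothetical stand-in for a chat template that loops over arguments.items()."""
    lines = [f"<function={name}>"]
    # This is the step that requires a mapping, as in the template loop.
    for key, value in arguments.items():
        lines.append(f"<parameter={key}>{value}</parameter>")
    lines.append("</function>")
    return "\n".join(lines)


# After the fix: a mapping renders one parameter block per argument.
rendered = render_tool_call("lookup", {"q": "slime"})

# Before the fix: a JSON string has no .items(), so the loop cannot start.
try:
    render_tool_call("lookup", '{"q": "slime"}')
    string_error = None
except AttributeError as exc:
    string_error = str(exc)
```

In plain Python the string case fails with `AttributeError`; how a real template engine reports the same mistake may differ.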

Hostile inputs that are valid JSON but not a mapping are wrapped so .items() can never crash:

openai._normalize_tool_call(
    {"function": {"name": "lookup", "arguments": "[1, 2]"}}
)
# arguments -> {"_raw_arguments": [1, 2]}

Falsy non-dict values are real argument payloads, not "no arguments", and are preserved the same way — only None (or an empty wire string) maps to {}:

dict_arguments(0)     # -> {"_raw_arguments": 0}
dict_arguments([])    # -> {"_raw_arguments": []}
dict_arguments(None)  # -> {}

Fix

Add a small pure helper dict_arguments(value) -> dict in slime/agent/adapters/common.py that decodes echoed wire strings to a mapping at the render boundary:

dict passes through unchanged;
str is json.loads-decoded (an empty wire string → {}: the OpenAI wire encodes "no arguments" as an empty / "{}" string);
None → {} (no arguments);
every other non-dict outcome, including falsy values like 0, False, and [], funnels through a {"_raw_arguments": ...} sentinel (already a convention in slime/agent/parsing.py), so the -> dict contract holds for all inputs, .items() can never crash, and no argument payload is ever silently dropped.
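The branches listed above can be sketched as follows. This is an illustrative reconstruction from the description, not the code in the PR diff. Two behaviours are assumptions that the description does not spell out: a string that is not valid JSON is kept verbatim under the sentinel, and a wire string that decodes to JSON null is treated like None.

```python
import json
from typing import Any

RAW_ARGUMENTS_KEY = "_raw_arguments"  # sentinel key named in the description


def dict_arguments(value: Any) -> dict:
    """Coerce tool-call arguments to a mapping without dropping payloads."""
    if isinstance(value, str):
        if value == "":
            return {}  # empty wire string means "no arguments"
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            # Assumption: undecodable text is preserved under the sentinel.
            return {RAW_ARGUMENTS_KEY: value}
    if isinstance(value, dict):
        return value  # mappings pass through unchanged
    if value is None:
        # Only None means "no arguments" (assumption: also a decoded JSON null).
        return {}
    # Every other value, falsy or not, is a real payload and is wrapped.
    return {RAW_ARGUMENTS_KEY: value}
```

Decoding the string first and then applying a single gate keeps the `is None` check in one place for both raw and decoded values.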

Switch the single render-boundary call site in _normalize_tool_call from json_arguments to dict_arguments.

Why this is the right fix

Default-path safe. The dict-passthrough and empty/None → {} branches mean callers already passing a mapping (or no arguments) get bit-identical output; only echoed JSON-string arguments change, and they now match what the chat template expects.
Lossless. Truthiness is the wrong gate for "no arguments": 0, False, and [] are payloads the model emitted. Gating the sentinel on is None instead of truthiness means the normalization never rewrites a real payload to an empty call.
Outbound wire contract preserved. _openai_tool_calls (the response path) is untouched and still emits function.arguments as a JSON string, exactly as the OpenAI spec requires. Only the internal render boundary changed.
No new abstraction. It is one pure function reusing the existing {"_raw_arguments": ...} sentinel from parsing.py; the -> dict return type guarantees .items() is always safe.
CI-verifiable. tests/test_agent_adapters.py::test_openai_render_tool_call_arguments_are_dicts (marked unit) asserts the rendered arguments are the decoded mapping {"q": "slime"}, that "[1, 2]" wraps to {"_raw_arguments": [1, 2]}, and that falsy non-dict values are preserved losslessly (0 → {"_raw_arguments": 0}, [] → {"_raw_arguments": []}, None → {}). The file already runs in the CPU agent-adapter-test job (num_gpus: 0), so the test executes with no workflow change.
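The "Lossless" point is easiest to see by putting the two gates side by side. A minimal sketch, assuming two hypothetical helpers (`gate_on_truthiness` and `gate_on_none`) that exist only for this comparison and are not part of the repository:

```python
from typing import Any

SENTINEL = "_raw_arguments"


def gate_on_truthiness(value: Any) -> dict:
    """Lossy variant: every falsy value is treated as "no arguments"."""
    if not value:
        return {}
    return value if isinstance(value, dict) else {SENTINEL: value}


def gate_on_none(value: Any) -> dict:
    """Lossless variant: only None is treated as "no arguments"."""
    if value is None:
        return {}
    return value if isinstance(value, dict) else {SENTINEL: value}


payloads = [0, False, [], None]
lossy = [gate_on_truthiness(p) for p in payloads]
lossless = [gate_on_none(p) for p in payloads]
```

With the truthiness gate all four inputs collapse to `{}`, so three emitted payloads are rewritten to an empty call; with the `is None` gate only the last one does.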

…emplates

The OpenAI adapter's `_normalize_tool_call` builds the assistant message that is later fed to `tokenizer.apply_chat_template`. It serialized `tool_calls[].function.arguments` to a JSON *string* (via `json_arguments`) before rendering. Qwen-family chat templates iterate `tool_call.arguments.items()`, so a string argument mis-renders (per-character iteration / template error) and the rendered prompt diverges from what the model actually saw. In a token-capturing rollout this yields a rollout/train token desync.

Fix: add a small pure helper `dict_arguments(value) -> dict` in `adapters/common.py` that decodes echoed wire strings back to a mapping at the render boundary:
- dict passthrough;
- `json.loads` for strings (empty wire string -> `{}`);
- only `None` means "no arguments"; every other non-dict value -- including falsy ones like `0`, `False`, `[]` -- funnels through the `{"_raw_arguments": ...}` sentinel (already a repo convention, see `agent/parsing.py`), so the `-> dict` contract holds for all inputs without silently dropping argument payloads.

Switch the single render-boundary call site in `_normalize_tool_call` from `json_arguments` to `dict_arguments`. The outbound wire path (`_openai_tool_calls`) is unchanged and still emits `arguments` as a JSON string, as the OpenAI spec requires.

Verifiable via `tests/test_agent_adapters.py::test_openai_render_tool_call_arguments_are_dicts` (marked `unit`), which asserts the rendered tool-call `arguments` is the decoded mapping `{"q": "slime"}`, that non-dict values wrap in the sentinel, and that falsy non-dict values (`0`, `[]`) are preserved rather than dropped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jingshenghang · 2026-06-17T02:25:59Z

Hi, our refactoring of the agent framework has been merged into the main branch. The Codex/OpenAI testing and validation were insufficient; we welcome pull requests based on our latest refactor.

#2005

EazyReal force-pushed the upstream-pr/agent-toolcall-args-mapping branch 2 times, most recently from b91e834 to f794e94 Compare June 12, 2026 06:21

EazyReal force-pushed the upstream-pr/agent-toolcall-args-mapping branch from f794e94 to 0b6a708 Compare June 12, 2026 08:25

EazyReal wants to merge 1 commit into THUDM:main from EazyReal:upstream-pr/agent-toolcall-args-mapping
