Improve Server OAI Responses API streaming compatibility #24957

Open: boondocklabs wants to merge 5 commits into ggml-org:master from boondocklabs:responses-protocol

@boondocklabs

Contributor

Overview

Improve OpenAI Responses API streaming compatibility by aligning emitted SSE events with the documented OpenAI event schema.

This change refactors sequence number handling to use a shared counter stored in task_result_state, ensuring monotonic sequence_number values across both partial and final response events. It also adds and corrects required event fields that were previously missing or incomplete.

Specifically, this change:

  • Moves sequence number ownership to task_result_state
  • Exposes the shared counter to partial and final response serializers via oai_seq_num_ptr
  • Ensures all SSE events emitted for a response stream use a single monotonically increasing sequence number source
  • Adds missing required fields to Responses API streaming events
  • Aligns event payload structure more closely with the OpenAI Responses API documentation
  • Improves compatibility with OpenAI SDKs and third-party clients that validate event schemas
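The shared-counter idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the names `task_result_state` and `oai_seq_num_ptr` come from the description, while `task_result_state_sketch`, `emit_event`, `serialize_partial`, `serialize_final`, the starting value of 0, and the hand-built JSON are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-task state that owns the counter.
struct task_result_state_sketch {
    int64_t oai_seq_num = 0; // single source of sequence numbers for one stream (start value assumed)
};

// Hypothetical helper: format one SSE event and advance the shared counter.
static std::string emit_event(const std::string & type, int64_t * oai_seq_num_ptr) {
    assert(oai_seq_num_ptr != nullptr);
    const int64_t seq = (*oai_seq_num_ptr)++; // read, then increment the shared counter
    std::string out;
    out += "event: " + type + "\n";
    out += "data: {\"type\":\"" + type + "\",\"sequence_number\":" + std::to_string(seq) + "}\n\n";
    return out;
}

// Hypothetical partial-result serializer: one delta event per call.
static std::vector<std::string> serialize_partial(int64_t * oai_seq_num_ptr) {
    return { emit_event("response.output_text.delta", oai_seq_num_ptr) };
}

// Hypothetical final-result serializer: continues from wherever the partials stopped.
static std::vector<std::string> serialize_final(int64_t * oai_seq_num_ptr) {
    return {
        emit_event("response.output_text.done", oai_seq_num_ptr),
        emit_event("response.completed", oai_seq_num_ptr),
    };
}
```

Because both serializers receive a pointer to the same counter, a final event can never reuse or fall behind a number already emitted by a partial event, which is the property the sequence number refactor is after.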

Additional information

While testing OpenAI Responses API streaming, some clients failed to deserialize events correctly because emitted events did not fully match the documented schema and expected event ordering.

In addition to fixing sequence number generation, this PR updates event payloads to include required fields and metadata expected by OpenAI-compatible clients. The goal is to improve interoperability with SDKs and consumers that rely on strict adherence to the Responses API event definitions.

This work was validated against the OpenAI Responses API documentation and tested with the Rig streaming client, which previously could not consume the generated event stream because of schema and event sequencing incompatibilities.
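For illustration, the kind of ordering check a strict consumer might apply to the stream can be sketched as below. This is an assumption about what such a check could look like, not a description of how Rig or any OpenAI SDK validates events; `extract_sequence_number` and `sequence_is_strictly_increasing` are hypothetical helpers, and the string scan stands in for a real JSON parser.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Naive extraction of "sequence_number" from one event's JSON payload.
// Handles only a non-negative integer value; a real client would use a JSON parser.
static bool extract_sequence_number(const std::string & payload, int64_t * out) {
    assert(out != nullptr);
    const std::string key = "\"sequence_number\":";
    size_t pos = payload.find(key);
    if (pos == std::string::npos) {
        return false; // field missing
    }
    pos += key.size();
    while (pos < payload.size() && payload[pos] == ' ') {
        pos++; // skip optional whitespace after the colon
    }
    size_t end = pos;
    while (end < payload.size() && payload[end] >= '0' && payload[end] <= '9') {
        end++;
    }
    if (end == pos) {
        return false; // no digits found
    }
    *out = std::stoll(payload.substr(pos, end - pos));
    return true;
}

// Returns true only if every payload carries a sequence_number and the
// values strictly increase from one event to the next.
static bool sequence_is_strictly_increasing(const std::vector<std::string> & payloads) {
    bool have_prev = false;
    int64_t prev = 0;
    for (const auto & p : payloads) {
        int64_t seq = 0;
        if (!extract_sequence_number(p, &seq)) {
            return false;
        }
        if (have_prev && seq <= prev) {
            return false;
        }
        prev = seq;
        have_prev = true;
    }
    return true;
}
```

A stream whose partial and final events draw from separate counters would typically fail a check like this as soon as a number repeats or goes backwards.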

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES — ChatGPT was used to discuss OpenAI Responses API event semantics, documented schema requirements, sequence number handling, and to assist in drafting portions of the implementation and PR description. All code changes were reviewed, modified, tested, and validated by the submitter.

boondocklabs requested a review from a team as a code owner on June 23, 2026 20:55