
feat: chat runtime - pause/resume, SSE transport, React bindings #5

Open
marslavish wants to merge 23 commits into main from feat/chat-runtime


@marslavish marslavish commented Apr 27, 2026

Builds on feat/features-complete. Adds the chat-runtime layer on top of the redesigned core: pausable tool execution, an SSE-serializable run handle, a headless React hook, a Next.js reference demo, and shared test infrastructure.

Summary

  • @agentic-kit/agent — pausable tools, AgentRunHandle (events / ReadableStream / SSE Response), maxSteps, decision lookup by toolCallId.
  • @agentic-kit/react (new package) — useChat hook that POSTs to an SSE endpoint and folds events into messages, streaming snapshot, pending decisions, and executing tools.
  • apps/nextjs-chat-demo (new) — Next.js App Router demo wiring agent.prompt(...).toResponse() to useChat, with a tool-approval UI.
  • agentic-kit - injectDeferralResults helper for the "user types instead of approving" flow; cross-fetch dropped in the OpenAI adapter in favor of native fetch.
  • Test infra — shared helpers under tools/test/ (scripted provider, SSE stub, fixtures), SSE parser tests, run-handle tests (443 LOC), useChat tests (1011 LOC).

What's New

@agentic-kit/agent — pause/resume + SSE

  • Pausable tools. Tools declare an optional decision JSON Schema. When the agent reaches a call with no attached decision, it emits tool_decision_pending and stops. Attach the decision to the matching toolCall block and call continue() to resume.
  • AgentRunHandle returned by prompt() / continue(), consumable exactly once as:
    • await handle — run to completion
    • handle.events() — async iterator of AgentEvents
    • handle.toReadableStream() - ReadableStream<AgentEvent>
    • handle.toResponse() — SSE Response ready to return from a Next.js / Hono / Express handler
  • parseSSEStream() exported from the package for clients consuming toResponse().
  • maxSteps cap on model invocations per run (resets in prompt(), persists across continue()); stopReason: 'completed' | 'max_steps' on agent_end.
  • Decision lookup by id. continue() and the underlying loop walk the message log backwards to find the most recent un-decided toolCall matching a given toolCallId, so callers may append unrelated messages between the pause and the response.
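
The backwards walk described in the last bullet can be sketched roughly as follows. This is an illustration only, not the package's implementation: the `ChatMessage` and `ToolCallBlock` shapes and the `findUndecidedToolCall` name are assumptions made for the example, and "un-decided" is taken to mean that `decision` is `undefined`.

```typescript
// Hypothetical message shapes for illustration; the real agentic-kit types may differ.
type TextBlock = { type: 'text'; text: string };
type ToolCallBlock = {
  type: 'toolCall';
  toolCallId: string;
  toolName: string;
  decision?: unknown; // attached by the caller before continue()
};
type ChatMessage =
  | { role: 'user'; content: TextBlock[] }
  | { role: 'assistant'; content: Array<TextBlock | ToolCallBlock> }
  | { role: 'toolResult'; toolCallId: string; content: TextBlock[] };

// Walk the log from newest to oldest and return the most recent toolCall
// with the given id that has no decision attached yet.
function findUndecidedToolCall(
  messages: readonly ChatMessage[],
  toolCallId: string,
): ToolCallBlock | undefined {
  for (const message of [...messages].reverse()) {
    if (message.role !== 'assistant') continue;
    for (const block of [...message.content].reverse()) {
      if (
        block.type === 'toolCall' &&
        block.toolCallId === toolCallId &&
        block.decision === undefined
      ) {
        return block;
      }
    }
  }
  return undefined;
}

// An unrelated user message sits between the pause and the response;
// the lookup still finds the paused call by id.
const transcript: ChatMessage[] = [
  { role: 'user', content: [{ type: 'text', text: 'delete notes.txt' }] },
  {
    role: 'assistant',
    content: [{ type: 'toolCall', toolCallId: 'call_1', toolName: 'deleteFile' }],
  },
  { role: 'user', content: [{ type: 'text', text: 'by the way, thanks' }] },
];

const pending = findUndecidedToolCall(transcript, 'call_1');
if (pending) {
  pending.decision = { approved: true }; // the caller would then call continue()
}
```

Because the search keys on `toolCallId` rather than on position, the paused call does not have to be the last message in the log.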

@agentic-kit/react — new package

  • Single hook useChat({ api, body?, initialMessages?, fetch?, on* }).
  • State: messages, streamingMessage, isStreaming, pendingDecisions: ReadonlyMap<string, ToolDecisionPendingEvent>, executingToolCallIds: ReadonlySet<string>, error.
  • Actions: send, sendMessages, setMessages (array or updater), respondWithDecision(toolCallId, value), abort().
  • abort() finalizes any visible streamed text as an assistant message and drops orphan toolCall blocks so the next call doesn't re-pause.
  • Callbacks: onMessage, onFinish, onDecisionPending, onToolExecutionStart/End, onError.
  • Headless — no UI, no run store, no runId. State lives in the message log.
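
To make the abort() bullet concrete, here is a rough sketch of that finalization step as a pure function over the message log. The type shapes and the `finalizeOnAbort` name are hypothetical and exist only for this example; the hook's real internals may look quite different.

```typescript
// Hypothetical shapes for illustration; not the actual @agentic-kit/react types.
type TextBlock = { type: 'text'; text: string };
type ToolCallBlock = { type: 'toolCall'; toolCallId: string; toolName: string };
type AssistantMessage = {
  role: 'assistant';
  content: Array<TextBlock | ToolCallBlock>;
};
type UserMessage = { role: 'user'; content: TextBlock[] };
type ChatMessage = AssistantMessage | UserMessage;

// Keep whatever text was already visible, drop toolCall blocks that never
// got a result, and append the remainder as a finished assistant message.
function finalizeOnAbort(
  messages: readonly ChatMessage[],
  streamingMessage: AssistantMessage | null,
): ChatMessage[] {
  if (!streamingMessage) return [...messages];

  const visibleText = streamingMessage.content.filter(
    (block): block is TextBlock => block.type === 'text' && block.text.length > 0,
  );

  // Nothing visible was streamed: leave the log untouched.
  if (visibleText.length === 0) return [...messages];

  return [...messages, { role: 'assistant', content: visibleText }];
}

const beforeAbort: ChatMessage[] = [
  { role: 'user', content: [{ type: 'text', text: 'clean up the temp dir' }] },
];
const afterAbort = finalizeOnAbort(beforeAbort, {
  role: 'assistant',
  content: [
    { type: 'text', text: 'Sure, deleting now' },
    { type: 'toolCall', toolCallId: 'call_1', toolName: 'deleteFile' },
  ],
});
```

Since the orphan toolCall never reaches the log, the next request carries no call that would make the agent pause again.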

agentic-kit - injectDeferralResults

For the case where the user types a new message while a tool is paused: synthesizes a stand-in toolResult for every toolCall that lacks both a decision and a paired result, so the server picks up a well-formed transcript.

import { injectDeferralResults, createUserMessage } from 'agentic-kit';

await sendMessages([
  ...injectDeferralResults(messages),
  createUserMessage(text),
]);
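
For readers wondering what happens to the transcript, the behaviour described above might look something like the sketch below. It is not the helper's actual source: the message shapes, the `injectDeferralResultsSketch` name, the stand-in text, and the choice to place each stand-in right after the assistant message that issued the call are all assumptions.

```typescript
// Hypothetical shapes for illustration; the real agentic-kit types may differ.
type TextBlock = { type: 'text'; text: string };
type ToolCallBlock = {
  type: 'toolCall';
  toolCallId: string;
  toolName: string;
  decision?: unknown;
};
type ChatMessage =
  | { role: 'user'; content: TextBlock[] }
  | { role: 'assistant'; content: Array<TextBlock | ToolCallBlock> }
  | { role: 'toolResult'; toolCallId: string; content: TextBlock[] };

function injectDeferralResultsSketch(messages: readonly ChatMessage[]): ChatMessage[] {
  // Collect the ids of calls that already have a paired result.
  const answered = new Set<string>();
  for (const message of messages) {
    if (message.role === 'toolResult') answered.add(message.toolCallId);
  }

  const patched: ChatMessage[] = [];
  for (const message of messages) {
    patched.push(message);
    if (message.role !== 'assistant') continue;

    for (const block of message.content) {
      if (block.type !== 'toolCall') continue;
      // Only calls lacking BOTH a decision and a result get a stand-in.
      if (block.decision !== undefined || answered.has(block.toolCallId)) continue;
      patched.push({
        role: 'toolResult',
        toolCallId: block.toolCallId,
        // Assumed wording; the real helper's stand-in content is not shown in this PR.
        content: [{ type: 'text', text: 'Deferred: the user sent a new message instead of deciding.' }],
      });
    }
  }
  return patched;
}

const pausedLog: ChatMessage[] = [
  { role: 'user', content: [{ type: 'text', text: 'tidy the workspace' }] },
  {
    role: 'assistant',
    content: [
      { type: 'toolCall', toolCallId: 'c1', toolName: 'deleteFile' }, // paused, undecided
      { type: 'toolCall', toolCallId: 'c2', toolName: 'listFiles' }, // already answered
    ],
  },
  { role: 'toolResult', toolCallId: 'c2', content: [{ type: 'text', text: 'a.txt' }] },
];
const wellFormed = injectDeferralResultsSketch(pausedLog);
```

Running the sketch a second time on its own output adds nothing, because every call then has a paired result.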

apps/nextjs-chat-demo

  • /api/chat/route.ts constructs an Agent, applies prior messages, and returns agent.prompt(...).toResponse().
  • Client uses useChat with chat-input, chat-messages, tool-call-card, tool-approval-card components.

Test infrastructure

  • tools/test/ — repo-internal helpers (no package.json, imported via tsconfig paths). Scripted provider, SSE stub, fixtures, shared index.
  • Provider unit suites refactored onto the shared helpers; default pnpm test stays deterministic and offline.
  • New suites: sse.test.ts (parser), run-handle.test.ts (443 LOC), use-chat.test.ts (1011 LOC under jsdom), inject-deferral-results.test.ts.
  • @agentic-kit/react is the only package on jsdom; everything else stays on node.

Cleanup

  • cross-fetch removed from the OpenAI adapter — runtimes are expected to provide fetch.
  • Packages expose a source export condition so workspace consumers can resolve TypeScript directly.

Test Plan

  • pnpm install && pnpm build && pnpm test is green across packages
  • apps/nextjs-chat-demo boots, streams a chat turn, and a paused tool can be approved/denied via respondWithDecision
  • Abort mid-stream preserves visible text and clears orphan toolCalls; next send() does not re-pause
  • injectDeferralResults flow: pause a tool, send a fresh user message instead of deciding, verify the next request carries synthesized stand-in results

@marslavish marslavish changed the base branch from main to feat/features-complete April 27, 2026 14:44
@marslavish marslavish changed the title from "feat: chat runtime foundation — pausable tools and test infra" to "(WIP) feat: pause/resume runtime, run store, useChat" Apr 27, 2026
@marslavish marslavish changed the title from "(WIP) feat: pause/resume runtime, run store, useChat" to "feat: pause/resume runtime, run store, useChat" May 12, 2026
@marslavish marslavish changed the title from "feat: pause/resume runtime, run store, useChat" to "feat: chat runtime - pause/resume, SSE transport, React bindings" May 12, 2026
@marslavish marslavish changed the base branch from feat/features-complete to main May 12, 2026 02:09
return new Response(sse, responseInit);
}

then<TResult1 = void, TResult2 = never>(

I might be too paranoid here, but this is likely a design flaw.

DefaultAgentRunHandle opts into Promise assimilation because AgentRunHandle extends PromiseLike<void>, which lets a caller write await handle as shorthand for "run to completion and discard events." That ergonomic sugar is the source of every problem below.

The relevant code

// packages/agent/src/run-handle.ts:10-14
export interface AgentRunHandle extends PromiseLike<void> {
  events(): AsyncIterable<AgentEvent>;
  toReadableStream(): ReadableStream<AgentEvent>;
  toResponse(init?: ResponseInit): Response;
}
// packages/agent/src/run-handle.ts:62-70
then(onfulfilled, onrejected) {
  if (!this.startedAs) {
    this.startSink();                                  // ← side effect on observation
  }
  return this.completion!.then(onfulfilled, onrejected);
}
// packages/agent/src/run-handle.ts:164-171
private startSink(): void {
  this.ensureNotStarted('sink');
  this.startedAs = 'sink';
  this.completion = this.bind(null, abortController.signal);  // push = null → events dropped
}

The key insight: any thenable in JavaScript can be silently consumed by the Promise machinery — await, Promise.resolve, returning from an
async function, Promise.all, even some logger middleware. The runtime calls .then() on it without asking. And in this class, .then() is
not an observation — it mutates state (startedAs = 'sink') and dispatches the bind.
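
A standalone way to see this, independent of the PR's code: the toy class below is a stand-in I made up for illustration (it is not DefaultAgentRunHandle), with a then() that counts its calls and flips state the same way. Each handle is passed through a different piece of Promise machinery and nothing ever calls then() explicitly.

```typescript
// Toy thenable, loosely modelled on the handle under review (not the real class).
class ToyHandle implements PromiseLike<void> {
  thenCalls = 0;
  startedAs: 'sink' | null = null;

  then<TResult1 = void, TResult2 = never>(
    onfulfilled?: ((value: void) => TResult1 | PromiseLike<TResult1>) | null,
    onrejected?: ((reason: unknown) => TResult2 | PromiseLike<TResult2>) | null,
  ): PromiseLike<TResult1 | TResult2> {
    this.thenCalls += 1;
    if (!this.startedAs) this.startedAs = 'sink'; // state change on observation
    return Promise.resolve().then(onfulfilled, onrejected);
  }
}

// Route four fresh handles through four different assimilators and report
// how often then() was invoked on each one.
async function countAssimilations(): Promise<number[]> {
  const viaAwait = new ToyHandle();
  const viaResolve = new ToyHandle();
  const viaReturn = new ToyHandle();
  const viaAll = new ToyHandle();

  await viaAwait; // plain await
  await Promise.resolve(viaResolve); // Promise.resolve adopts the thenable
  await (async () => viaReturn)(); // returned from an async function
  await Promise.all([viaAll]); // Promise.all resolves each element

  return [viaAwait, viaResolve, viaReturn, viaAll].map((h) => h.thenCalls);
}
```

In every case the runtime invokes then() on the caller's behalf, so each handle ends up with `startedAs === 'sink'` even though no line in `countAssimilations` mentions then().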


Hazard A — Any assimilator silently starts the run as a sink

const handle = agent.prompt('hi');

await handle;                  // ← starts run, events dropped
Promise.resolve(handle);       // ← same
async function factory() { return handle; }
await factory();               // ← same

All three call handle.then() under the hood. The run begins with push = null, so every event the agent emits goes nowhere.


Hazard B — After assimilation, the documented API throws

const handle = agent.prompt('hi');
someLogger.info({ handle });        // looks harmless
await Promise.resolve(handle);      // ← caller didn't realize this consumed it

handle.events();                    // throws "already consumed via sink"
handle.toReadableStream();          // throws
handle.toResponse();                // throws

The ensureNotStarted guard at run-handle.ts:72-81 enforces single-use: once startedAs is set, every other consumption mode is dead. The
handle is now a paperweight that has already burned its run.
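
A compact model of that interaction, for anyone who wants to poke at it outside the repo. `ToyRun` is hypothetical and folds the guard and the state assignment into one method for brevity; only the error wording is taken from the snippet above.

```typescript
type ConsumptionMode = 'sink' | 'events' | 'stream' | 'response';

// Simplified stand-in for the single-use guard (not the real run-handle.ts code).
class ToyRun {
  private startedAs: ConsumptionMode | null = null;

  private ensureNotStarted(mode: ConsumptionMode): void {
    if (this.startedAs !== null) {
      throw new Error(`already consumed via ${this.startedAs}`);
    }
    this.startedAs = mode;
  }

  // Thenable hook: the first observation claims the run as a sink.
  then(onfulfilled?: (() => void) | null): Promise<void> {
    if (this.startedAs === null) this.ensureNotStarted('sink');
    return Promise.resolve().then(onfulfilled);
  }

  events(): string[] {
    this.ensureNotStarted('events');
    return [];
  }
}

// Observe the run through then() first, then try the documented API.
function consumeAfterThen(): string {
  const run = new ToyRun();
  void run.then();
  try {
    run.events();
    return 'ok';
  } catch (error) {
    return error instanceof Error ? error.message : String(error);
  }
}
```

Calling `events()` on a fresh `ToyRun` works; calling it after anything has touched then() reports that the run was already consumed via sink.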


Hazard C — TypeScript erases the handle reference

async function makeHandle() {
  return agent.prompt('hi');                  // returns AgentRunHandle
}

const result = await makeHandle();
// typeof result is `void`, not AgentRunHandle — verified by tsc
result.events();                              // compile error: events does not exist on void

Because AgentRunHandle extends PromiseLike<void>, the await unwraps it to void. The IDE will complain when you try to use the handle, but
only after you've already lost the reference and triggered the bind. The type system reflects the hazard but doesn't prevent it.
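
One caller-side way to sidestep the unwrapping, offered only as an illustration and not as something this PR or review proposes: return the handle inside a plain object. A non-thenable wrapper is not assimilated, so both the reference and its type survive the await. `ToyHandle`, `makeBare`, `makeWrapped`, and `compare` are made-up names for this sketch.

```typescript
// Toy thenable standing in for the handle (not the real class).
class ToyHandle implements PromiseLike<void> {
  thenCalls = 0;

  then<TResult1 = void, TResult2 = never>(
    onfulfilled?: ((value: void) => TResult1 | PromiseLike<TResult1>) | null,
    onrejected?: ((reason: unknown) => TResult2 | PromiseLike<TResult2>) | null,
  ): PromiseLike<TResult1 | TResult2> {
    this.thenCalls += 1;
    return Promise.resolve().then(onfulfilled, onrejected);
  }

  events(): string[] {
    return ['agent_end'];
  }
}

// The declared return type is Promise<void>: the handle is assimilated
// and the caller never sees the ToyHandle reference.
async function makeBare(handle: ToyHandle): Promise<void> {
  return handle;
}

// A plain object has no then(), so it passes through await unchanged.
async function makeWrapped(handle: ToyHandle): Promise<{ handle: ToyHandle }> {
  return { handle };
}

async function compare(): Promise<{ bareCalls: number; wrappedCalls: number; sameRef: boolean }> {
  const bare = new ToyHandle();
  await makeBare(bare); // then() is invoked by the runtime

  const original = new ToyHandle();
  const wrapped = await makeWrapped(original); // then() is never touched
  wrapped.handle.events(); // still typed as ToyHandle, so this compiles

  return {
    bareCalls: bare.thenCalls,
    wrappedCalls: original.thenCalls,
    sameRef: wrapped.handle === original,
  };
}
```

This only protects call sites that remember to wrap, which is why the hazard is better addressed in the handle's design than at each caller.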


Hazard D — The assimilator blocks until the full agent run finishes

This is the part that goes beyond "the run starts as a sink." Look at line 69:

return this.completion!.then(onfulfilled, onrejected);

this.completion is the promise returned by bind(null, …), which resolves only when the entire agent run finishes — through all LLM
streaming, all tool calls, all turns, until agent_end. Anything that assimilated the handle is now waiting on that.

async function getHandle() { return agent.prompt('hi'); }

const handle = await getHandle();
// ← blocks for 30+ seconds while the LLM streams, AND no events
//   are observable because the run is already in sink mode.
//   In a Next.js route, the response is held open with nothing to send.
