feat(api): /sessions filters by type and paginates by cursor#677
Merged
Conversation
GET /bots/{bot_id}/sessions now accepts:
- types=chat,discuss,acp_agent (defaults to user-facing types)
- limit=50 (1..200)
- cursor=<opaque>
Response adds `next_cursor`. System-internal session types (heartbeat,
schedule, subagent) no longer leak to the default list.
Underlies the SSE refactor that follows.
9 tasks
The sqlc-generated SQLite query mixed `sqlc.slice(types)` with explicit
numbered placeholders (`?N`). At runtime sqlc-sqlite expands the slice
into anonymous `?` placeholders that SQLite auto-numbers from the next
free index — which collides with the explicit `?N` we used for the
cursor and limit binds. The result was either a `datatype mismatch`
failure or a silently misrouted bind, so paging was effectively broken.
The same SQL also compared the cursor (RFC3339Nano) against
`updated_at`, which SQLite stores as `YYYY-MM-DD HH:MM:SS` via
`CURRENT_TIMESTAMP`. Lexicographic comparison ('T' > ' ') means every
row looked "less than" any cursor — keyset pagination devolved into
"start from the head every time".
Both go away by hand-rolling the SQLite shim:
- The query uses only sequential anonymous `?`, so binding order is
unambiguous regardless of how many types the caller passes.
- The cursor timestamp is funneled through
`strftime('%Y-%m-%d %H:%M:%S', ?)` so it lines up with the stored
text format.
A new integration test exercises the SQLite shim against an in-memory
database and walks both pages, covering the placeholder collision and
the cursor-format mismatch in a single check.
The non-admin path filtered the DB result by `canAccessSession` and then derived `next_cursor` from the filtered slice. If a full page of DB rows was discarded by the filter, `len(sessions) < limit` ⇒ empty `next_cursor` ⇒ the caller stopped paging even though older rows still existed on disk. The cursor's job is to record the DB position to resume from; filter survivorship has nothing to do with it. next_cursor now always reflects the DB resume position when the DB returned a full page, regardless of whether the permission filter shrank that page. Also tightens cursor decoding to reject a syntactically-valid base64 payload whose UUID portion does not parse, so the failure surfaces as a 400 from the handler instead of a 500 from the service layer.
The new paged /sessions endpoint orders by `updated_at DESC, id DESC` with `bot_id = ? AND deleted_at IS NULL`. None of the existing bot_sessions indexes cover that access path, so the planner falls back to a full scan of the bot's sessions plus an in-memory sort — fine on small fixtures, painful on real installs. Add a partial composite index on (bot_id, updated_at DESC, id DESC) filtered to active rows. Updates both baseline migrations and ships an incremental migration pair for environments already on the schema.
`toSessionFromPagedRow` and `toSessionFromUserPagedRow` were byte-identical projections of two sqlc-generated row structs that happen to share the same shape. Funnel them through a common `pagedColumns` struct and a single `sessionFromPagedColumns` helper so the projection lives in one place. Express `IsUserFacingType` in terms of `UserFacingSessionTypes` via `slices.Contains` so the constant stays the single source of truth for which session types are user-facing.
- prefer errors.New over fmt.Errorf for static error messages - close rows.Close errors explicitly in the hand-rolled shim - drop unused method receiver on the test grants stub
A SessionCursor with only the timestamp or only the id populated cannot position a keyset query, but the previous AND-based IsZero let such a half-built cursor through into pagedCursorParams where the empty half became a malformed binding. The HTTP decoder already rejects empty halves, so this only affects internal callers that construct a cursor by hand; flip the predicate to OR so every callsite funnels through the no-cursor path instead.
Drop the strftime() wrapping in the paged session SQL: it silently truncated sub-second precision from the Go side, while the Go formatter was producing nanoseconds — confusing to read and pointless to compute twice. Format the cursor timestamp directly at second precision so it matches SQLite's TEXT storage from CURRENT_TIMESTAMP. Also make parseSQLiteTimestamp return its error rather than collapse unknown formats to an invalid Timestamptz; format drift should surface loudly per the project's no-fake-robustness rule. Callers wrap with the offending column name and raw text.
RFC3339Nano on the wire is preserved by Postgres but truncated to second precision by SQLite; document that here so future readers don't try to "fix" the asymmetry.
The existing paged test seeds rows a minute apart, so the (updated_at = ? AND id < ?) branch of the cursor SQL was never exercised. Add a case that inserts three rows sharing one `updated_at`, pages through them with limit 2, and asserts page 2 returns the remaining row with no duplicates and no skips — that's the exact bug the id tiebreak exists to prevent.
golangci-lint perfsprint catches fmt.Errorf with no args; the message is static so errors.New is the right primitive.
…inel Cycle-3 review nits: - scanSessionPagedRows redundantly called rows.Close() while the two callers already deferred it. Dropped the explicit close so the defer is the single close path; future callers that follow the same defer pattern won't accidentally double-close. - The "unsupported timestamp format" error was constructed with errors.New on every parse failure even though the message is static. Hoisted to a package-level sentinel errUnsupportedSQLiteTimestamp so callers can errors.Is-check and the lint flag goes away.
The keyset cursor in sessions_paged.go formats its bound timestamp at second precision to match SQLite's CURRENT_TIMESTAMP TEXT storage. That coupling was implicit; a future migration that upgraded bot_sessions.updated_at to sub-second precision would have silently broken paging. Pin the contract on both sides: - Migration baseline now carries a comment next to bot_sessions.updated_at explaining the precision invariant and the cursor-formatter dependency. - pagedCursorBindings comment now references the migration invariant and spells out the failure mode if either side changes alone. - Drop the errUnsupportedSQLiteTimestamp sentinel — no caller errors.Is's it, so the indirection only added a moving piece. - New TestSQLiteListSessionsByBotPagedWithinOneSecond walks paged listings across rows whose CURRENT_TIMESTAMP collapses to the same second; every row must be visited exactly once via the (updated_at, id) tiebreak. - New TestSQLiteListSessionsByBotPagedSurfacesRouteJoinColumns asserts that route_metadata and route_conversation_type round-trip through the LEFT JOIN, locking down the 14-column scan order against silent reorderings.
Tighten the paged session list endpoint along several fronts that all touch the same flow: - Has-more without a tail request. The handler asks the SQL layer for limit+1 rows and returns items[:limit]; next_cursor fires when the extra row exists, so a client paging through an exact-multiple of `limit` no longer issues a useless final request that returns []. The service signature stays (..., limit int64) and the handler documents the +1 bump at the call site. - Limit pipeline at int64 end-to-end. parseSessionLimitParam now returns int64; the service narrows once at the sqlc binding boundary via pagedLimitToInt32, which fails loudly if the caller hands it an out-of-range value. The //nolint:gosec on the old int32 cast is gone. - SQLite limit binding cast to int64 to match the rest of the package's driver bindings. - SessionCursor.IsZero() now means "neither half set". A partial cursor (only id or only updated_at) was previously treated like the zero value, which silently restarted pagination from the head and returned a duplicate page. pagedCursorParams now rejects the partial state as a programmer error — the handler-side decoder rejects 400 inputs already, so any partial state at the service boundary is internal-construction bug. - listSessionsResponse comment now documents the empty-next_cursor end-of-stream contract instead of paraphrasing the type name. - UserFacingSessionTypes is now a function returning a fresh slice rather than an exported mutable package-level slice, so callers can't poison the source of truth. Tests cover the new partial-cursor rejection, the IsZero contract, and update the handler tests to drive the +1 has-more probe.
…riant Initialize the sessions slice to an empty value so the JSON response carries `"items": []` when a page is empty, sparing clients a null check. Cover the behavior with a handler test. Move the "next_cursor reflects DB position, not filter survivorship" note onto trimPagedSessions where the invariant is actually enforced; the non-manager branch no longer needs to restate it.
The 0023 SQLite migration adds a partial composite index on bot_sessions, which made TestSQLiteModelEnableMigrationPreservesExisting Models fail under merge-with-main: createPreModelEnableSchema seeded only providers/models/model_variants, so when the test forced version to 21 and migrated up, 0023 hit a "no such table: main.bot_sessions" parse error before it could run. Seed a minimal bot_sessions stub (id, bot_id, updated_at, deleted_at — the columns 0023's index references) alongside the existing model tables. This is fixture upkeep, not a schema change: future migrations that touch other tables may need the same treatment.
sheepbox8646
approved these changes
Jun 19, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Make
GET /bots/{bot_id}/sessionsfilterable by session type and cursor-paginated. Heartbeat / schedule / subagent sessions are now hidden from the default response so the client never has to scan thousands of system-internal entries to render the sidebar.This is PR 1 of 3 in the SSE / chat-list refactor chain:
/messagesrequiressession_idAPI
GET /bots/{bot_id}/sessionsnow accepts:typeschat,discuss,acp_agentsession.UserFacingSessionTypeswhitelist server-sidelimit[1, 200]cursor<updated_at_rfc3339nano>|<id>Response gains
next_cursor:{ "items": [...], "next_cursor": "MjAyNi0wMy0yNFQxMDo0MTowNC41Mz..." }The cursor key is
(updated_at, id)so pages stay stable when many rows share anupdated_at.What this fixes
On a bot with months of activity,
bot_sessionsaccumulates ~30 heartbeat rows per hour (the heartbeat scheduler). One real instance had 2246 heartbeat sessions on the bot vs 25 chat/discuss — the unfiltered, unpaginated response returned all of them. The client filtered heartbeat out client-side, but every subsequent per-event handler still didsessions.value.some(...)on the full 2271-entry reactive array, which compounded into seconds of frozen UI on busy bots.After this PR the server returns only the 25 user-facing sessions by default, so
sessions.valueis small enough that the existing client code stops being O(thousands) per SSE event. The frontend changes that actually use the new query params land in PR 3.Compatibility
/sessions(the client filter for them was already there, so no visible change). Loses access to the previously implicit "all sessions" view; nothing reads it.cursor.No database migration; query additions only.
Test plan
go test ./internal/session/... ./internal/handlers/...— passes (4 new service tests, 7 new handler tests).GET /sessionsreturns onlychat / discuss / acp_agent.GET /sessions?types=chatreturns only chat.next_cursor, no duplicates, terminates with emptynext_cursor.limit=999→ clamped to 200.next_cursor.Notes
UserFacingSessionTypesis exported frominternal/sessionso PR 2 can use it on the SSE side without duplicating the whitelist.ListByBot/ListByBotAndCreatedByUserservice methods are intentionally untouched (no callers remain in this PR's handler, but a follow-up cleanup is safer once PR 2 confirms nothing else reaches for them).