Skip to content

📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types#12735

Merged
danny-avila merged 8 commits intodevfrom
claude/recursing-sanderson-7b5826
Apr 20, 2026
Merged

📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types#12735
danny-avila merged 8 commits intodevfrom
claude/recursing-sanderson-7b5826

Conversation

@danny-avila
Copy link
Copy Markdown
Owner

Summary

Fixes a schema/runtime/documentation mismatch that makes the summarization.trigger field entirely unusable as shipped in v0.8.5-rc1.

The Zod schema currently accepts only 'token_count' as a valid trigger type, but:

  • the official docs list three valid values: token_ratio, remaining_tokens, messages_to_refine
  • the @librechat/agents runtime (shouldTriggerSummarization) only evaluates those three types — it does not implement token_count at all, and silently returns false when it encounters any unrecognized type

The combined effect:

  • Anyone who follows the docs hits a startup Zod error.
  • Anyone who works around the schema with token_count gets a silent no-op — summarization never fires.
  • This also likely contributes to #12614 (summarization provider issue) — users diagnosing "summary never runs" without realizing the trigger never fires in the first place.

Change

Align the schema in packages/data-provider/src/config.ts with the documented, runtime-supported trigger types:

export const summarizationTriggerSchema = z.object({
  type: z.enum(['token_ratio', 'remaining_tokens', 'messages_to_refine']),
  value: z.number().positive(),
});

Also updated an existing unit test that had drifted to use token_count.

A defense-in-depth PR will land in @librechat/agents to log a warning when the runtime sees an unrecognized trigger type, so future schema/SDK drift surfaces immediately instead of silently no-opping.

Closes #12721

Test plan

  • New unit tests in packages/data-provider/specs/config-schemas.spec.ts validate that all three documented types parse, and that token_count plus other unknown values are rejected.
  • New unit test verifies the trigger parses inside the full summarizationConfigSchema.
  • Existing run-summarization passthrough tests updated from token_count to token_ratio and still pass.
  • cd packages/data-provider && npx jest specs/config-schemas.spec.ts — 53 passed.
  • cd packages/api && npx jest src/agents/__tests__/run-summarization.test.ts — 17 passed.
  • npm run build:data-provider — succeeds.
  • Manual: admin sets trigger: { type: 'token_ratio', value: 0.8 } in librechat.yaml, server boots without Zod errors, and summarization fires once context usage passes 80%.

The Zod schema for `summarization.trigger.type` only accepted
`'token_count'`, but:

- the documentation lists `token_ratio`, `remaining_tokens`, and
  `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and
  silently no-ops on anything else

The result was a double failure: any user following the docs hit a
startup Zod error, and anyone who matched the schema by using
`token_count` got a silent no-op at runtime where summarization never
fired.

Align the schema with the documented, runtime-supported trigger types.

Closes #12721
Copilot AI review requested due to automatic review settings April 18, 2026 14:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Aligns the data-provider config schema with the documented/runtime-supported summarization trigger types so summarization.trigger works as described (fixing the token_count vs token_ratio mismatch seen in v0.8.5-rc1).

Changes:

  • Update summarizationTriggerSchema to accept token_ratio, remaining_tokens, and messages_to_refine.
  • Add/adjust unit tests to validate accepted/rejected trigger types and parsing within the full summarization config schema.
  • Update API agent passthrough test data from token_count to token_ratio.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
packages/data-provider/src/config.ts Updates Zod enum for summarization trigger types to match docs/runtime.
packages/data-provider/specs/config-schemas.spec.ts Adds schema tests covering valid/invalid trigger types and config embedding.
packages/api/src/agents/tests/run-summarization.test.ts Updates summarization passthrough test to use token_ratio.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 072d224a2f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/data-provider/src/config.ts Outdated
export const summarizationTriggerSchema = z.object({
type: z.enum(['token_count']),
type: z.enum(['token_ratio', 'remaining_tokens', 'messages_to_refine']),
value: z.number().positive(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Constrain token_ratio trigger values to 0–1

summarizationTriggerSchema now accepts type: "token_ratio", but value is only validated as positive(), so values greater than 1 (for example 80) pass schema validation even though token usage ratio cannot exceed 1.0. In that configuration, summarization will never trigger, recreating a silent no-op path for misconfigured but schema-valid YAML; this is especially risky because the docs describe token_ratio as a fraction in the 0.0–1.0 range.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 599c0fe — switched to a discriminated union so token_ratio is bounded to (0, 1] while remaining_tokens and messages_to_refine keep unconstrained positive values (they're counts, not fractions). Added tests for value: 80 and value: 1.01 being rejected and value: 1 being accepted at the upper bound.

Per Codex review: the previous schema accepted `value: z.number().positive()`
for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }`
(presumably meant as "80%") passed validation and then silently never fired —
because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always
false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value
constraint:

- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1

Added tests for the upper-bound rejection and the inclusive upper bound
(`value: 1` still accepted as a valid "fire at 100%" extreme).
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 599c0fed2b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/data-provider/src/config.ts Outdated
export const summarizationTriggerSchema = z.discriminatedUnion('type', [
z.object({
type: z.literal('token_ratio'),
value: z.number().positive().max(1),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Allow documented token_ratio: 0 trigger threshold

The new token_ratio branch rejects value: 0 because it uses .positive().max(1), but the published trigger range is 0.0–1.0; this means a documented configuration now fails schema validation and can stop startup when configSchema.strict().safeParse(...) runs in config loading. This regression only appears when admins intentionally set trigger.type: "token_ratio" with value: 0 (e.g., to force immediate summarization), and the fix is to include 0 in the accepted lower bound for that variant.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in bc4e2c8 — switched token_ratio from .positive().max(1) to .min(0).max(1) to match the documented inclusive 0.0–1.0 range. 0 is a valid extreme that means "summarize on every pruning event" per the SDK's usedRatio >= 0 check. remaining_tokens and messages_to_refine keep .positive()0 on either is semantically no-op given the SDK's messagesToRefineCount <= 0 early return.

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

Per Codex review: `.positive()` rejected `value: 0`, but the docs
describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins
who copy the documented lower bound into their YAML would fail schema
validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme)
setting — the agents SDK's `usedRatio >= 0` check will fire as soon as
there is anything to refine, which is a legitimate "always summarize
when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both
are counts, and `0` there produces no meaningful behavior (the SDK
has an early return for `messagesToRefineCount <= 0`).
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc4e2c89b0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/data-provider/src/config.ts Outdated
export const summarizationTriggerSchema = z.discriminatedUnion('type', [
z.object({
type: z.literal('token_ratio'),
value: z.number().min(0).max(1),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve zero-valued token_ratio triggers

token_ratio now accepts value: 0 here, but shapeSummarizationConfig forwards trigger only when config?.trigger?.value is truthy (packages/api/src/agents/run.ts, around lines 197-200). With value: 0, the trigger is silently dropped and runtime falls back to “no trigger configured,” which changes behavior (e.g., it can summarize on pruning even when token-ratio inputs are unavailable) instead of honoring the explicit ratio trigger. This makes a documented valid config behave differently than configured.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — addressed in 1aba326. shapeSummarizationConfig was using a truthy check on config?.trigger?.value which dropped value: 0 silently. Switched to typeof value === 'number' so the trigger survives and the runtime honors the explicit ratio instead of falling back to the no-trigger default. Added a regression test for the { type: 'token_ratio', value: 0 } passthrough.

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

Per Codex review: now that the schema accepts `token_ratio: 0`,
`shapeSummarizationConfig` would silently drop it because of a truthy
check on `config?.trigger?.value`. The trigger would disappear and the
runtime would fall back to "no trigger configured" — which fires on any
pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still
rejecting `undefined`/`null`.

Added a regression test that asserts `{ type: 'token_ratio', value: 0 }`
survives the shaping function untouched.
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1aba3264e4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/data-provider/src/config.ts Outdated
}),
z.object({
type: z.literal('remaining_tokens'),
value: z.number().positive(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Disallow Infinity in count-based trigger values

The new remaining_tokens/messages_to_refine branches validate value with z.number().positive(), which in Zod still accepts Infinity (e.g., YAML .inf). That means config validation can succeed, but the summarization runtime rejects non-finite trigger values and returns false, so summarization silently never fires even though the server starts cleanly. This mismatch is user-visible whenever trigger.value becomes infinite and should be blocked at schema level (e.g., .finite()) to keep validation and runtime behavior aligned.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in a2731a9 — added .finite() to every trigger value. z.number().positive() accepts Infinity (and z.number() accepts NaN), which would pass validation and then silently no-op because the agents SDK guards every path with Number.isFinite(...). .finite() is applied uniformly (even to token_ratio where .max(1) already rules out Infinity but not NaN).

Per Codex review: `z.number().positive()` still accepts `Infinity` and
`NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but
the agents SDK guards every trigger path with `Number.isFinite(...)` and
silently returns `false` — summarization never fires while the server
starts cleanly. That is the exact schema/runtime split this PR is trying
to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an
implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps
the intent obvious and catches `NaN` (which `.max(1)` does not).
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts
   and are always integers in the runtime (`Number.isFinite(...)` guards
   already assume integer semantics). `z.number().positive()` accepted
   fractional values like `2.5`, which was semantically confusing and
   would round oddly against the runtime's `>=` / `<=` comparisons. Add
   `.int()` to both count-based branches; `token_ratio` stays fractional.

2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got
   the generic "Invalid summarization config" warning plus a flattened
   Zod error. Detect that specific case in `loadSummarizationConfig` and
   emit a migration-friendly message that names the three valid
   replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens`
and `messages_to_refine` shaping, complementing the existing
`token_ratio` coverage.
@danny-avila
Copy link
Copy Markdown
Owner Author

Audited the comprehensive review findings. Addressed in 7521b7e:

Finding 1 (MINOR): .int() on count-based triggers — Valid. remaining_tokens and messages_to_refine are integer quantities in the runtime and the >= / <= comparisons assumed that. Added .int() to both (plus tests that reject 500.5 and 2.5). token_ratio stays fractional.

Finding 2 (MINOR): Migration warning for legacy token_count — Valid. The generic safeParse-fail log buried the fix behind a flattened Zod error. Added a targeted path in loadSummarizationConfig that detects trigger.type === 'token_count' pre-validation and logs a migration message naming the three valid replacements. Exported the function and added a spec covering the happy path, the token_count branch, and the generic fallback.

Finding 3 (NIT): Passthrough coverage for non-ratio triggers — Valid. Added a parameterized test (it.each) covering remaining_tokens and messages_to_refine shaping, complementing the existing token_ratio coverage.

Finding 4 (NIT): Record<string, unknown> cast in test — Skipped. This is the pre-existing pattern in run-summarization.test.ts (used 4+ times before this PR); touching it here would create inconsistency within the same file. Worth a follow-up that migrates all occurrences at once.

All tests green: 60 config-schema, 20 run-summarization, 4 new loadSummarizationConfig.

@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

Two nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be
   disabled," but `shapeSummarizationConfig` treats a missing
   summarization config as self-summarize mode (fires on every pruning
   event using the agent's own provider/model). "Disabled" would
   mislead an admin into stopping investigation. Reword to describe the
   actual fallback and assert the new wording in the spec.

2. Add a regression test for the `trigger: 'bare-string'` YAML case, so
   the `typeof raw.trigger === 'object'` guard is exercised rather than
   implied.

3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message
   is safe in every terminal/aggregator regardless of UTF-8 handling.
@danny-avila
Copy link
Copy Markdown
Owner Author

Follow-up audit findings — all addressed in 3d11e26:

Finding 1 (MINOR): "disabled" wording was misleading — Valid and worth fixing. When loadSummarizationConfig returns undefined, shapeSummarizationConfig falls back to self-summarize using the agent's own provider/model with no trigger (fires on every pruning event). Reworded the log to describe that behavior exactly so an admin has the full picture and knows to actually fix the config.

Finding 2 (NIT): untested bare-string trigger branch — Added a regression test for { summarization: { trigger: 'token_count' } } that asserts the typeof raw.trigger === 'object' guard routes it to the generic warning rather than the migration-specific branch.

Finding 3 (NIT): en-dash in log message — Swapped 0–1 (U+2013) for 0-1 (ASCII) so the message renders safely in every terminal/aggregator regardless of UTF-8 handling.

All 5 loadSummarizationConfig tests pass.

…union

CI TS check failed: after the schema tightening, `raw.trigger.type` is
narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine"
| undefined`, so the runtime comparison to `"token_count"` is a
TS2367 ("no overlap") error even though that's exactly the comparison
we want for the migration guard.

Widen just that one access via `as { type?: unknown }` so the migration
check reads runtime-shaped YAML input without the type system folding
it back into the narrowed union.
@github-actions
Copy link
Copy Markdown
Contributor

GitNexus: 🚀 deployed

The LibreChat-pr-12735 index is now live on the MCP server.
Deploy run

@danny-avila danny-avila changed the title 🐛 fix: accept documented summarization.trigger.type values 📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types Apr 20, 2026
@danny-avila danny-avila changed the base branch from main to dev April 20, 2026 02:33
@danny-avila danny-avila merged commit ccf3a6c into dev Apr 20, 2026
14 checks passed
@danny-avila danny-avila deleted the claude/recursing-sanderson-7b5826 branch April 20, 2026 02:33
krgokul pushed a commit to syedhabib39/LibreChat that referenced this pull request Apr 20, 2026
…-Supported Types (danny-avila#12735)

* 🐛 fix: accept documented `summarization.trigger.type` values

The Zod schema for `summarization.trigger.type` only accepted
`'token_count'`, but:

- the documentation lists `token_ratio`, `remaining_tokens`, and
  `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and
  silently no-ops on anything else

The result was a double failure: any user following the docs hit a
startup Zod error, and anyone who matched the schema by using
`token_count` got a silent no-op at runtime where summarization never
fired.

Align the schema with the documented, runtime-supported trigger types.

Closes danny-avila#12721

* 🧹 fix: bound `token_ratio` trigger value to (0, 1]

Per Codex review: the previous schema accepted `value: z.number().positive()`
for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }`
(presumably meant as "80%") passed validation and then silently never fired —
because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always
false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value
constraint:

- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1

Added tests for the upper-bound rejection and the inclusive upper bound
(`value: 1` still accepted as a valid "fire at 100%" extreme).

* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range

Per Codex review: `.positive()` rejected `value: 0`, but the docs
describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins
who copy the documented lower bound into their YAML would fail schema
validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme)
setting — the agents SDK's `usedRatio >= 0` check will fire as soon as
there is anything to refine, which is a legitimate "always summarize
when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both
are counts, and `0` there produces no meaningful behavior (the SDK
has an early return for `messagesToRefineCount <= 0`).

* 🐛 fix: preserve `token_ratio` trigger when `value: 0`

Per Codex review: now that the schema accepts `token_ratio: 0`,
`shapeSummarizationConfig` would silently drop it because of a truthy
check on `config?.trigger?.value`. The trigger would disappear and the
runtime would fall back to "no trigger configured" — which fires on any
pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still
rejecting `undefined`/`null`.

Added a regression test that asserts `{ type: 'token_ratio', value: 0 }`
survives the shaping function untouched.

* 🧹 fix: reject non-finite trigger values at schema level

Per Codex review: `z.number().positive()` still accepts `Infinity` and
`NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but
the agents SDK guards every trigger path with `Number.isFinite(...)` and
silently returns `false` — summarization never fires while the server
starts cleanly. That is the exact schema/runtime split this PR is trying
to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an
implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps
the intent obvious and catches `NaN` (which `.max(1)` does not).

* 🧹 fix: integer counts + targeted token_count migration warning

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts
   and are always integers in the runtime (`Number.isFinite(...)` guards
   already assume integer semantics). `z.number().positive()` accepted
   fractional values like `2.5`, which was semantically confusing and
   would round oddly against the runtime's `>=` / `<=` comparisons. Add
   `.int()` to both count-based branches; `token_ratio` stays fractional.

2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got
   the generic "Invalid summarization config" warning plus a flattened
   Zod error. Detect that specific case in `loadSummarizationConfig` and
   emit a migration-friendly message that names the three valid
   replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens`
and `messages_to_refine` shaping, complementing the existing
`token_ratio` coverage.

* 🧹 fix: accurate fallback wording + bare-string trigger test

Two nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be
   disabled," but `shapeSummarizationConfig` treats a missing
   summarization config as self-summarize mode (fires on every pruning
   event using the agent's own provider/model). "Disabled" would
   mislead an admin into stopping investigation. Reword to describe the
   actual fallback and assert the new wording in the spec.

2. Add a regression test for the `trigger: 'bare-string'` YAML case, so
   the `typeof raw.trigger === 'object'` guard is exercised rather than
   implied.

3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message
   is safe in every terminal/aggregator regardless of UTF-8 handling.

* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union

CI TS check failed: after the schema tightening, `raw.trigger.type` is
narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine"
| undefined`, so the runtime comparison to `"token_count"` is a
TS2367 ("no overlap") error even though that's exactly the comparison
we want for the migration guard.

Widen just that one access via `as { type?: unknown }` so the migration
check reads runtime-shaped YAML input without the type system folding
it back into the narrowed union.
krgokul pushed a commit to syedhabib39/LibreChat that referenced this pull request Apr 21, 2026
…-Supported Types (danny-avila#12735)

* 🐛 fix: accept documented `summarization.trigger.type` values

The Zod schema for `summarization.trigger.type` only accepted
`'token_count'`, but:

- the documentation lists `token_ratio`, `remaining_tokens`, and
  `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and
  silently no-ops on anything else

The result was a double failure: any user following the docs hit a
startup Zod error, and anyone who matched the schema by using
`token_count` got a silent no-op at runtime where summarization never
fired.

Align the schema with the documented, runtime-supported trigger types.

Closes danny-avila#12721

* 🧹 fix: bound `token_ratio` trigger value to (0, 1]

Per Codex review: the previous schema accepted `value: z.number().positive()`
for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }`
(presumably meant as "80%") passed validation and then silently never fired —
because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always
false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value
constraint:

- `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense
- `remaining_tokens`: positive — token counts can be large
- `messages_to_refine`: positive — message counts can be > 1

Added tests for the upper-bound rejection and the inclusive upper bound
(`value: 1` still accepted as a valid "fire at 100%" extreme).

* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range

Per Codex review: `.positive()` rejected `value: 0`, but the docs
describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins
who copy the documented lower bound into their YAML would fail schema
validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme)
setting — the agents SDK's `usedRatio >= 0` check will fire as soon as
there is anything to refine, which is a legitimate "always summarize
when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both
are counts, and `0` there produces no meaningful behavior (the SDK
has an early return for `messagesToRefineCount <= 0`).

* 🐛 fix: preserve `token_ratio` trigger when `value: 0`

Per Codex review: now that the schema accepts `token_ratio: 0`,
`shapeSummarizationConfig` would silently drop it because of a truthy
check on `config?.trigger?.value`. The trigger would disappear and the
runtime would fall back to "no trigger configured" — which fires on any
pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still
rejecting `undefined`/`null`.

Added a regression test that asserts `{ type: 'token_ratio', value: 0 }`
survives the shaping function untouched.

* 🧹 fix: reject non-finite trigger values at schema level

Per Codex review: `z.number().positive()` still accepts `Infinity` and
`NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but
the agents SDK guards every trigger path with `Number.isFinite(...)` and
silently returns `false` — summarization never fires while the server
starts cleanly. That is the exact schema/runtime split this PR is trying
to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an
implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps
the intent obvious and catches `NaN` (which `.max(1)` does not).

* 🧹 fix: integer counts + targeted token_count migration warning

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts
   and are always integers in the runtime (`Number.isFinite(...)` guards
   already assume integer semantics). `z.number().positive()` accepted
   fractional values like `2.5`, which was semantically confusing and
   would round oddly against the runtime's `>=` / `<=` comparisons. Add
   `.int()` to both count-based branches; `token_ratio` stays fractional.

2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got
   the generic "Invalid summarization config" warning plus a flattened
   Zod error. Detect that specific case in `loadSummarizationConfig` and
   emit a migration-friendly message that names the three valid
   replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens`
and `messages_to_refine` shaping, complementing the existing
`token_ratio` coverage.

* 🧹 fix: accurate fallback wording + bare-string trigger test

Two nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be
   disabled," but `shapeSummarizationConfig` treats a missing
   summarization config as self-summarize mode (fires on every pruning
   event using the agent's own provider/model). "Disabled" would
   mislead an admin into stopping investigation. Reword to describe the
   actual fallback and assert the new wording in the spec.

2. Add a regression test for the `trigger: 'bare-string'` YAML case, so
   the `typeof raw.trigger === 'object'` guard is exercised rather than
   implied.

3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message
   is safe in every terminal/aggregator regardless of UTF-8 handling.

* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union

CI TS check failed: after the schema tightening, `raw.trigger.type` is
narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine"
| undefined`, so the runtime comparison to `"token_count"` is a
TS2367 ("no overlap") error even though that's exactly the comparison
we want for the migration guard.

Widen just that one access via `as { type?: unknown }` so the migration
check reads runtime-shaped YAML input without the type system folding
it back into the narrowed union.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: token_ratio trigger for summarization does not work

2 participants