📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types (danny-avila#12735)

danny-avila · krgokul · commit 20a627e8c8d2 · 2026-04-20T13:09:36.000+05:30
* 🐛 fix: accept documented `summarization.trigger.type` values

The Zod schema for `summarization.trigger.type` only accepted `'token_count'`, but:
- the documentation lists `token_ratio`, `remaining_tokens`, and `messages_to_refine` as valid
- the `@librechat/agents` runtime only evaluates those three types and silently no-ops on anything else

The result was a double failure: any user following the docs hit a startup Zod error, and anyone who matched the schema by using `token_count` got a silent no-op at runtime where summarization never fired.

Align the schema with the documented, runtime-supported trigger types.

Closes danny-avila#12721

* 🧹 fix: bound `token_ratio` trigger value to (0, 1]

Per Codex review: the previous schema accepted `value: z.number().positive()` for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }` (presumably meant as "80%") passed validation and then silently never fired, because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always false. That is exactly the silent-no-op pattern this PR is trying to eliminate.

Switch to a discriminated union so each trigger type has its own value constraint:
- `token_ratio`: `(0, 1]` (documented as a fraction, so 80 is nonsense)
- `remaining_tokens`: positive (token counts can be large)
- `messages_to_refine`: positive (message counts can be > 1)

Added tests for the upper-bound rejection and the inclusive upper bound (`value: 1` still accepted as a valid "fire at 100%" extreme).

* 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range

Per Codex review: `.positive()` rejected `value: 0`, but the docs describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins who copy the documented lower bound into their YAML would fail schema validation at startup.

Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme) setting: the agents SDK's `usedRatio >= 0` check will fire as soon as there is anything to refine, which is a legitimate "always summarize when pruning happens" configuration.

`remaining_tokens` and `messages_to_refine` keep `.positive()`: both are counts, and `0` there produces no meaningful behavior (the SDK has an early return for `messagesToRefineCount <= 0`).

* 🐛 fix: preserve `token_ratio` trigger when `value: 0`

Per Codex review: now that the schema accepts `token_ratio: 0`, `shapeSummarizationConfig` would silently drop it because of a truthy check on `config?.trigger?.value`. The trigger would disappear and the runtime would fall back to "no trigger configured", which fires on any pruning rather than honoring the explicit ratio.

Switch to `typeof value === 'number'`, which preserves `0` while still rejecting `undefined`/`null`. Added a regression test that asserts `{ type: 'token_ratio', value: 0 }` survives the shaping function untouched.

* 🧹 fix: reject non-finite trigger values at schema level

Per Codex review: `z.number().positive()` still accepts `Infinity` and `NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but the agents SDK guards every trigger path with `Number.isFinite(...)` and silently returns `false`, so summarization never fires while the server starts cleanly. That is the exact schema/runtime split this PR is trying to eliminate.

Add `.finite()` to every trigger value. `token_ratio` already had an implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps the intent obvious and catches `NaN` (which `.max(1)` does not).

* 🧹 fix: integer counts + targeted token_count migration warning

Two findings from the comprehensive review:

1. `remaining_tokens` and `messages_to_refine` are token/message counts and are always integers in the runtime (`Number.isFinite(...)` guards already assume integer semantics). `z.number().positive()` accepted fractional values like `2.5`, which was semantically confusing and would round oddly against the runtime's `>=` / `<=` comparisons. Add `.int()` to both count-based branches; `token_ratio` stays fractional.
2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got the generic "Invalid summarization config" warning plus a flattened Zod error. Detect that specific case in `loadSummarizationConfig` and emit a migration-friendly message that names the three valid replacements. Export the function so the behavior is unit-testable.

Also added a parameterized passthrough test covering `remaining_tokens` and `messages_to_refine` shaping, complementing the existing `token_ratio` coverage.

* 🧹 fix: accurate fallback wording + bare-string trigger test

Three nits from the follow-up audit:

1. The legacy-`token_count` warning claimed "Summarization will be disabled," but `shapeSummarizationConfig` treats a missing summarization config as self-summarize mode (fires on every pruning event using the agent's own provider/model). "Disabled" would mislead an admin into stopping investigation. Reword to describe the actual fallback and assert the new wording in the spec.
2. Add a regression test for the `trigger: 'bare-string'` YAML case, so the `typeof raw.trigger === 'object'` guard is exercised rather than implied.
3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message is safe in every terminal/aggregator regardless of UTF-8 handling.

* 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union

CI TS check failed: after the schema tightening, `raw.trigger.type` is narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine" | undefined`, so the runtime comparison to `"token_count"` is a TS2367 ("no overlap") error even though that's exactly the comparison we want for the migration guard. Widen just that one access via `as { type?: unknown }` so the migration check reads runtime-shaped YAML input without the type system folding it back into the narrowed union.
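
The constraints and the `value: 0` shaping fix described in the message above can be illustrated without Zod. The following is a minimal, dependency-free sketch, not the code in this commit: `isValidTrigger`, `shapeTriggerTruthy`, `shapeTriggerTyped`, and `ratioTriggerFires` are hypothetical names, and `ratioTriggerFires` only restates the `usedRatio = 1 - remaining/max` comparison quoted in the message rather than the actual `@librechat/agents` implementation.

```typescript
// Hypothetical illustration only; the real schema is a Zod discriminated union.
type TriggerType = 'token_ratio' | 'remaining_tokens' | 'messages_to_refine';

interface Trigger {
  type: TriggerType;
  value: number;
}

// Mirrors the per-type value rules described in the commit message.
function isValidTrigger(input: { type: string; value: number }): boolean {
  const { type, value } = input;
  if (!Number.isFinite(value)) {
    return false; // rejects Infinity, -Infinity, and NaN
  }
  switch (type) {
    case 'token_ratio':
      return value >= 0 && value <= 1; // fraction, both bounds inclusive
    case 'remaining_tokens':
    case 'messages_to_refine':
      return Number.isInteger(value) && value > 0; // positive integer counts
    default:
      return false; // includes the legacy 'token_count'
  }
}

// Old shaping check: a truthy test on `value` drops an explicit 0.
function shapeTriggerTruthy(t?: Partial<Trigger>): Trigger | undefined {
  if (t === undefined || t.type === undefined || !t.value) {
    return undefined;
  }
  return { type: t.type, value: t.value };
}

// New shaping check: `typeof value === 'number'` keeps 0 and still drops undefined.
function shapeTriggerTyped(t?: Partial<Trigger>): Trigger | undefined {
  if (t === undefined || t.type === undefined || typeof t.value !== 'number') {
    return undefined;
  }
  return { type: t.type, value: t.value };
}

// Assumed form of the ratio comparison, taken from the formula in the message.
function ratioTriggerFires(remaining: number, max: number, threshold: number): boolean {
  const usedRatio = 1 - remaining / max; // never exceeds 1 when remaining >= 0
  return usedRatio >= threshold;
}

console.log(isValidTrigger({ type: 'token_ratio', value: 80 })); // false
console.log(shapeTriggerTruthy({ type: 'token_ratio', value: 0 })); // undefined
console.log(shapeTriggerTyped({ type: 'token_ratio', value: 0 })); // { type: 'token_ratio', value: 0 }
console.log(ratioTriggerFires(0, 1000, 80)); // false, even with the context fully used
```

Under these assumptions, a threshold of `80` can never be reached because the used ratio tops out at 1, and the truthy check loses a `0` that the `typeof` check keeps.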
diff --git a/packages/api/src/agents/__tests__/run-summarization.test.ts b/packages/api/src/agents/__tests__/run-summarization.test.ts
@@ -218,7 +218,7 @@ describe('summarizationConfig field passthrough', () => {
     const agents = await callAndCapture({
       summarizationConfig: {
         enabled: true,
-        trigger: { type: 'token_count', value: 8000 },
+        trigger: { type: 'token_ratio', value: 0.8 },
         provider: 'anthropic',
         model: 'claude-3-haiku',
         parameters: { temperature: 0.2 },
@@ -233,7 +233,7 @@ describe('summarizationConfig field passthrough', () => {
     // `enabled` is not forwarded to the agent-level config — it is resolved
     // into the separate `summarizationEnabled` boolean on the agent input.
     expect(agents[0].summarizationEnabled).toBe(true);
-    expect(config.trigger).toEqual({ type: 'token_count', value: 8000 });
+    expect(config.trigger).toEqual({ type: 'token_ratio', value: 0.8 });
     expect(config.provider).toBe('anthropic');
     expect(config.model).toBe('claude-3-haiku');
     expect(config.parameters).toEqual({ temperature: 0.2 });
@@ -254,6 +254,31 @@ describe('summarizationConfig field passthrough', () => {
     expect(config.provider).toBe('openAI');
     expect(config.model).toBe('gpt-4o');
   });
+
+  it('preserves `token_ratio` trigger with `value: 0` (documented, extreme-but-valid)', async () => {
+    const agents = await callAndCapture({
+      summarizationConfig: {
+        enabled: true,
+        trigger: { type: 'token_ratio', value: 0 },
+      },
+    });
+    const config = agents[0].summarizationConfig as Record<string, unknown>;
+    expect(config.trigger).toEqual({ type: 'token_ratio', value: 0 });
+  });
+
+  it.each([
+    ['remaining_tokens', 500],
+    ['messages_to_refine', 4],
+  ] as const)('passes %s trigger through unchanged', async (type, value) => {
+    const agents = await callAndCapture({
+      summarizationConfig: {
+        enabled: true,
+        trigger: { type, value },
+      },
+    });
+    const config = agents[0].summarizationConfig as Record<string, unknown>;
+    expect(config.trigger).toEqual({ type, value });
+  });
 });
 
 // ---------------------------------------------------------------------------
diff --git a/packages/api/src/agents/run.ts b/packages/api/src/agents/run.ts
@@ -195,7 +195,7 @@ function shapeSummarizationConfig(
   const provider = config?.provider ?? fallbackProvider;
   const model = config?.model ?? fallbackModel;
   const trigger =
-    config?.trigger?.type && config?.trigger?.value
+    config?.trigger?.type && typeof config?.trigger?.value === 'number'
       ? { type: config.trigger.type, value: config.trigger.value }
       : undefined;
 
diff --git a/packages/data-provider/specs/config-schemas.spec.ts b/packages/data-provider/specs/config-schemas.spec.ts
@@ -7,6 +7,8 @@ import {
   interfaceSchema,
   fileStorageSchema,
   fileStrategiesSchema,
+  summarizationTriggerSchema,
+  summarizationConfigSchema,
 } from '../src/config';
 import { tModelSpecPresetSchema, EModelEndpoint } from '../src/schemas';
 import { FileSources } from '../src/types/files';
@@ -502,3 +504,109 @@ describe('interfaceSchema', () => {
     expect(result.modelSelect).toBe(false);
   });
 });
+
+describe('summarizationTriggerSchema', () => {
+  it.each([
+    ['token_ratio', 0.8],
+    ['remaining_tokens', 500],
+    ['messages_to_refine', 4],
+  ] as const)('accepts documented trigger type "%s" with a sensible value', (type, value) => {
+    const result = summarizationTriggerSchema.safeParse({ type, value });
+    expect(result.success).toBe(true);
+  });
+
+  it('rejects the legacy/typoed "token_count" trigger type', () => {
+    const result = summarizationTriggerSchema.safeParse({
+      type: 'token_count',
+      value: 8000,
+    });
+    expect(result.success).toBe(false);
+  });
+
+  it('rejects unknown trigger types', () => {
+    const result = summarizationTriggerSchema.safeParse({
+      type: 'never_heard_of_it',
+      value: 1,
+    });
+    expect(result.success).toBe(false);
+  });
+
+  it('rejects negative values on any trigger type', () => {
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: -0.5 }).success).toBe(
+      false,
+    );
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: -1 }).success,
+    ).toBe(false);
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: -1 }).success,
+    ).toBe(false);
+  });
+
+  it('rejects zero for count-based triggers where it has no meaningful effect', () => {
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 0 }).success,
+    ).toBe(false);
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 0 }).success,
+    ).toBe(false);
+  });
+
+  it('rejects token_ratio values > 1 to catch the "80 meant as 80%" mistake', () => {
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 80 }).success).toBe(
+      false,
+    );
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 1.01 }).success).toBe(
+      false,
+    );
+  });
+
+  it('accepts token_ratio values at the inclusive 0 and 1 bounds per docs', () => {
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 0 }).success).toBe(
+      true,
+    );
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 1 }).success).toBe(
+      true,
+    );
+  });
+
+  it('allows remaining_tokens and messages_to_refine values above 1 (token/message counts)', () => {
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 2000 }).success,
+    ).toBe(true);
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 20 }).success,
+    ).toBe(true);
+  });
+
+  it('rejects non-finite values (Infinity, NaN) for every trigger type', () => {
+    for (const type of ['token_ratio', 'remaining_tokens', 'messages_to_refine'] as const) {
+      expect(summarizationTriggerSchema.safeParse({ type, value: Infinity }).success).toBe(false);
+      expect(summarizationTriggerSchema.safeParse({ type, value: -Infinity }).success).toBe(false);
+      expect(summarizationTriggerSchema.safeParse({ type, value: NaN }).success).toBe(false);
+    }
+  });
+
+  it('requires integer values for count-based triggers', () => {
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 500.5 }).success,
+    ).toBe(false);
+    expect(
+      summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 2.5 }).success,
+    ).toBe(false);
+  });
+
+  it('still allows fractional values for token_ratio', () => {
+    expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 0.8 }).success).toBe(
+      true,
+    );
+  });
+
+  it('parses inside the full summarization config', () => {
+    const result = summarizationConfigSchema.safeParse({
+      enabled: true,
+      trigger: { type: 'token_ratio', value: 0.8 },
+    });
+    expect(result.success).toBe(true);
+  });
+});
diff --git a/packages/data-provider/src/config.ts b/packages/data-provider/src/config.ts
@@ -1020,10 +1020,20 @@ export const memorySchema = z.object({
 
 export type TMemoryConfig = DeepPartial<z.infer<typeof memorySchema>>;
 
-export const summarizationTriggerSchema = z.object({
-  type: z.enum(['token_count']),
-  value: z.number().positive(),
-});
+export const summarizationTriggerSchema = z.discriminatedUnion('type', [
+  z.object({
+    type: z.literal('token_ratio'),
+    value: z.number().finite().min(0).max(1),
+  }),
+  z.object({
+    type: z.literal('remaining_tokens'),
+    value: z.number().finite().int().positive(),
+  }),
+  z.object({
+    type: z.literal('messages_to_refine'),
+    value: z.number().finite().int().positive(),
+  }),
+]);
 
 export const contextPruningSchema = z.object({
   enabled: z.boolean().optional(),
diff --git a/packages/data-schemas/src/app/service.spec.ts b/packages/data-schemas/src/app/service.spec.ts
@@ -0,0 +1,80 @@
+import type { DeepPartial, TCustomConfig } from 'librechat-data-provider';
+import { loadSummarizationConfig } from './service';
+import logger from '~/config/winston';
+
+jest.mock('~/config/winston', () => ({
+  __esModule: true,
+  default: {
+    warn: jest.fn(),
+    info: jest.fn(),
+    error: jest.fn(),
+    debug: jest.fn(),
+  },
+}));
+
+describe('loadSummarizationConfig', () => {
+  const warnSpy = logger.warn as jest.Mock;
+
+  beforeEach(() => {
+    warnSpy.mockClear();
+  });
+
+  it('returns undefined when no summarization config is provided', () => {
+    expect(loadSummarizationConfig({} as DeepPartial<TCustomConfig>)).toBeUndefined();
+  });
+
+  it('accepts a valid token_ratio trigger', () => {
+    const result = loadSummarizationConfig({
+      summarization: {
+        enabled: true,
+        trigger: { type: 'token_ratio', value: 0.8 },
+      },
+    } as DeepPartial<TCustomConfig>);
+
+    expect(result).toBeDefined();
+    expect(result?.enabled).toBe(true);
+    expect(result?.trigger).toEqual({ type: 'token_ratio', value: 0.8 });
+    expect(warnSpy).not.toHaveBeenCalled();
+  });
+
+  it('emits a targeted migration warning when trigger.type is the legacy "token_count"', () => {
+    const result = loadSummarizationConfig({
+      summarization: {
+        trigger: { type: 'token_count', value: 8000 },
+      },
+    } as unknown as DeepPartial<TCustomConfig>);
+
+    expect(result).toBeUndefined();
+    expect(warnSpy).toHaveBeenCalledTimes(1);
+    const message = String(warnSpy.mock.calls[0][0]);
+    expect(message).toContain('token_count');
+    expect(message).toContain('token_ratio');
+    expect(message).toContain('remaining_tokens');
+    expect(message).toContain('messages_to_refine');
+    expect(message).toContain('fall back');
+  });
+
+  it('falls back to the generic warning when trigger is a bare string (not an object)', () => {
+    const result = loadSummarizationConfig({
+      summarization: {
+        trigger: 'token_count',
+      },
+    } as unknown as DeepPartial<TCustomConfig>);
+
+    expect(result).toBeUndefined();
+    expect(warnSpy).toHaveBeenCalledTimes(1);
+    expect(String(warnSpy.mock.calls[0][0])).toContain('Invalid summarization config');
+  });
+
+  it('falls back to the generic warning for other schema violations', () => {
+    const result = loadSummarizationConfig({
+      summarization: {
+        trigger: { type: 'token_ratio', value: 80 },
+      },
+    } as unknown as DeepPartial<TCustomConfig>);
+
+    expect(result).toBeUndefined();
+    expect(warnSpy).toHaveBeenCalledTimes(1);
+    expect(String(warnSpy.mock.calls[0][0])).toContain('Invalid summarization config');
+  });
+});
diff --git a/packages/data-schemas/src/app/service.ts b/packages/data-schemas/src/app/service.ts
@@ -15,12 +15,30 @@ import { loadEndpoints } from './endpoints';
 import { loadOCRConfig } from './ocr';
 import logger from '~/config/winston';
 
-function loadSummarizationConfig(config: DeepPartial<TCustomConfig>): AppConfig['summarization'] {
+export function loadSummarizationConfig(
+  config: DeepPartial<TCustomConfig>,
+): AppConfig['summarization'] {
   const raw = config.summarization;
   if (!raw || typeof raw !== 'object') {
     return undefined;
   }
 
+  if (
+    raw.trigger &&
+    typeof raw.trigger === 'object' &&
+    (raw.trigger as { type?: unknown }).type === 'token_count'
+  ) {
+    logger.warn(
+      "[AppService] `summarization.trigger.type: 'token_count'` is no longer supported. " +
+        "Use 'token_ratio' (0-1), 'remaining_tokens' (positive integer), or " +
+        "'messages_to_refine' (positive integer). Your `summarization` config will be " +
+        'ignored and summarization will fall back to self-summarize defaults (the ' +
+        "agent's own provider/model, fires on every pruning event) until this is " +
+        'corrected.',
+    );
+    return undefined;
+  }
+
   const parsed = summarizationConfigSchema.safeParse(raw);
   if (!parsed.success) {
     logger.warn('[AppService] Invalid summarization config', parsed.error.flatten());