Skip to content

Commit 20a627e

Browse files
danny-avilakrgokul
authored andcommitted
📐 fix: Align Summarization Trigger Schema with Documented and Runtime-Supported Types (danny-avila#12735)
* 🐛 fix: accept documented `summarization.trigger.type` values The Zod schema for `summarization.trigger.type` only accepted `'token_count'`, but: - the documentation lists `token_ratio`, `remaining_tokens`, and `messages_to_refine` as valid - the `@librechat/agents` runtime only evaluates those three types and silently no-ops on anything else The result was a double failure: any user following the docs hit a startup Zod error, and anyone who matched the schema by using `token_count` got a silent no-op at runtime where summarization never fired. Align the schema with the documented, runtime-supported trigger types. Closes danny-avila#12721 * 🧹 fix: bound `token_ratio` trigger value to (0, 1] Per Codex review: the previous schema accepted `value: z.number().positive()` for every trigger type. That meant `trigger: { type: 'token_ratio', value: 80 }` (presumably meant as "80%") passed validation and then silently never fired — because `usedRatio = 1 - remaining/max` is bounded at 1, so `>= 80` is always false. That is exactly the silent-no-op pattern this PR is trying to eliminate. Switch to a discriminated union so each trigger type has its own value constraint: - `token_ratio`: `(0, 1]` — documented as a fraction, so 80 is nonsense - `remaining_tokens`: positive — token counts can be large - `messages_to_refine`: positive — message counts can be > 1 Added tests for the upper-bound rejection and the inclusive upper bound (`value: 1` still accepted as a valid "fire at 100%" extreme). * 🧹 fix: accept `token_ratio: 0` per documented 0.0–1.0 inclusive range Per Codex review: `.positive()` rejected `value: 0`, but the docs describe the `token_ratio` range as `0.0–1.0` (both inclusive). Admins who copy the documented lower bound into their YAML would fail schema validation at startup. Switch `token_ratio` to `.min(0).max(1)`. `0` is a valid (if extreme) setting — the agents SDK's `usedRatio >= 0` check will fire as soon as there is anything to refine, which is a legitimate "always summarize when pruning happens" configuration. `remaining_tokens` and `messages_to_refine` keep `.positive()`: both are counts, and `0` there produces no meaningful behavior (the SDK has an early return for `messagesToRefineCount <= 0`). * 🐛 fix: preserve `token_ratio` trigger when `value: 0` Per Codex review: now that the schema accepts `token_ratio: 0`, `shapeSummarizationConfig` would silently drop it because of a truthy check on `config?.trigger?.value`. The trigger would disappear and the runtime would fall back to "no trigger configured" — which fires on any pruning rather than honoring the explicit ratio. Switch to `typeof value === 'number'`, which preserves `0` while still rejecting `undefined`/`null`. Added a regression test that asserts `{ type: 'token_ratio', value: 0 }` survives the shaping function untouched. * 🧹 fix: reject non-finite trigger values at schema level Per Codex review: `z.number().positive()` still accepts `Infinity` and `NaN` (via YAML `.inf`, `.nan`). Config validation would succeed, but the agents SDK guards every trigger path with `Number.isFinite(...)` and silently returns `false` — summarization never fires while the server starts cleanly. That is the exact schema/runtime split this PR is trying to eliminate. Add `.finite()` to every trigger value. `token_ratio` already had an implicit guard via `.max(1)`, but applying `.finite()` uniformly keeps the intent obvious and catches `NaN` (which `.max(1)` does not). * 🧹 fix: integer counts + targeted token_count migration warning Two findings from the comprehensive review: 1. `remaining_tokens` and `messages_to_refine` are token/message counts and are always integers in the runtime (`Number.isFinite(...)` guards already assume integer semantics). `z.number().positive()` accepted fractional values like `2.5`, which was semantically confusing and would round oddly against the runtime's `>=` / `<=` comparisons. Add `.int()` to both count-based branches; `token_ratio` stays fractional. 2. Anyone upgrading with `trigger.type: 'token_count'` in their YAML got the generic "Invalid summarization config" warning plus a flattened Zod error. Detect that specific case in `loadSummarizationConfig` and emit a migration-friendly message that names the three valid replacements. Export the function so the behavior is unit-testable. Also added a parameterized passthrough test covering `remaining_tokens` and `messages_to_refine` shaping, complementing the existing `token_ratio` coverage. * 🧹 fix: accurate fallback wording + bare-string trigger test Two nits from the follow-up audit: 1. The legacy-`token_count` warning claimed "Summarization will be disabled," but `shapeSummarizationConfig` treats a missing summarization config as self-summarize mode (fires on every pruning event using the agent's own provider/model). "Disabled" would mislead an admin into stopping investigation. Reword to describe the actual fallback and assert the new wording in the spec. 2. Add a regression test for the `trigger: 'bare-string'` YAML case, so the `typeof raw.trigger === 'object'` guard is exercised rather than implied. 3. Swap the en-dash in `(0–1)` for an ASCII hyphen so the log message is safe in every terminal/aggregator regardless of UTF-8 handling. * 🔇 fix: cast `raw.trigger.type` to inspect legacy value past narrowed union CI TS check failed: after the schema tightening, `raw.trigger.type` is narrowed to `"token_ratio" | "remaining_tokens" | "messages_to_refine" | undefined`, so the runtime comparison to `"token_count"` is a TS2367 ("no overlap") error even though that's exactly the comparison we want for the migration guard. Widen just that one access via `as { type?: unknown }` so the migration check reads runtime-shaped YAML input without the type system folding it back into the narrowed union.
1 parent 1225a3f commit 20a627e

6 files changed

Lines changed: 249 additions & 8 deletions

File tree

packages/api/src/agents/__tests__/run-summarization.test.ts

Lines changed: 27 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -218,7 +218,7 @@ describe('summarizationConfig field passthrough', () => {
218218
const agents = await callAndCapture({
219219
summarizationConfig: {
220220
enabled: true,
221-
trigger: { type: 'token_count', value: 8000 },
221+
trigger: { type: 'token_ratio', value: 0.8 },
222222
provider: 'anthropic',
223223
model: 'claude-3-haiku',
224224
parameters: { temperature: 0.2 },
@@ -233,7 +233,7 @@ describe('summarizationConfig field passthrough', () => {
233233
// `enabled` is not forwarded to the agent-level config — it is resolved
234234
// into the separate `summarizationEnabled` boolean on the agent input.
235235
expect(agents[0].summarizationEnabled).toBe(true);
236-
expect(config.trigger).toEqual({ type: 'token_count', value: 8000 });
236+
expect(config.trigger).toEqual({ type: 'token_ratio', value: 0.8 });
237237
expect(config.provider).toBe('anthropic');
238238
expect(config.model).toBe('claude-3-haiku');
239239
expect(config.parameters).toEqual({ temperature: 0.2 });
@@ -254,6 +254,31 @@ describe('summarizationConfig field passthrough', () => {
254254
expect(config.provider).toBe('openAI');
255255
expect(config.model).toBe('gpt-4o');
256256
});
257+
258+
it('preserves `token_ratio` trigger with `value: 0` (documented, extreme-but-valid)', async () => {
259+
const agents = await callAndCapture({
260+
summarizationConfig: {
261+
enabled: true,
262+
trigger: { type: 'token_ratio', value: 0 },
263+
},
264+
});
265+
const config = agents[0].summarizationConfig as Record<string, unknown>;
266+
expect(config.trigger).toEqual({ type: 'token_ratio', value: 0 });
267+
});
268+
269+
it.each([
270+
['remaining_tokens', 500],
271+
['messages_to_refine', 4],
272+
] as const)('passes %s trigger through unchanged', async (type, value) => {
273+
const agents = await callAndCapture({
274+
summarizationConfig: {
275+
enabled: true,
276+
trigger: { type, value },
277+
},
278+
});
279+
const config = agents[0].summarizationConfig as Record<string, unknown>;
280+
expect(config.trigger).toEqual({ type, value });
281+
});
257282
});
258283

259284
// ---------------------------------------------------------------------------

packages/api/src/agents/run.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -195,7 +195,7 @@ function shapeSummarizationConfig(
195195
const provider = config?.provider ?? fallbackProvider;
196196
const model = config?.model ?? fallbackModel;
197197
const trigger =
198-
config?.trigger?.type && config?.trigger?.value
198+
config?.trigger?.type && typeof config?.trigger?.value === 'number'
199199
? { type: config.trigger.type, value: config.trigger.value }
200200
: undefined;
201201

packages/data-provider/specs/config-schemas.spec.ts

Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,8 @@ import {
77
interfaceSchema,
88
fileStorageSchema,
99
fileStrategiesSchema,
10+
summarizationTriggerSchema,
11+
summarizationConfigSchema,
1012
} from '../src/config';
1113
import { tModelSpecPresetSchema, EModelEndpoint } from '../src/schemas';
1214
import { FileSources } from '../src/types/files';
@@ -502,3 +504,109 @@ describe('interfaceSchema', () => {
502504
expect(result.modelSelect).toBe(false);
503505
});
504506
});
507+
508+
describe('summarizationTriggerSchema', () => {
509+
it.each([
510+
['token_ratio', 0.8],
511+
['remaining_tokens', 500],
512+
['messages_to_refine', 4],
513+
] as const)('accepts documented trigger type "%s" with a sensible value', (type, value) => {
514+
const result = summarizationTriggerSchema.safeParse({ type, value });
515+
expect(result.success).toBe(true);
516+
});
517+
518+
it('rejects the legacy/typoed "token_count" trigger type', () => {
519+
const result = summarizationTriggerSchema.safeParse({
520+
type: 'token_count',
521+
value: 8000,
522+
});
523+
expect(result.success).toBe(false);
524+
});
525+
526+
it('rejects unknown trigger types', () => {
527+
const result = summarizationTriggerSchema.safeParse({
528+
type: 'never_heard_of_it',
529+
value: 1,
530+
});
531+
expect(result.success).toBe(false);
532+
});
533+
534+
it('rejects negative values on any trigger type', () => {
535+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: -0.5 }).success).toBe(
536+
false,
537+
);
538+
expect(
539+
summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: -1 }).success,
540+
).toBe(false);
541+
expect(
542+
summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: -1 }).success,
543+
).toBe(false);
544+
});
545+
546+
it('rejects zero for count-based triggers where it has no meaningful effect', () => {
547+
expect(
548+
summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 0 }).success,
549+
).toBe(false);
550+
expect(
551+
summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 0 }).success,
552+
).toBe(false);
553+
});
554+
555+
it('rejects token_ratio values > 1 to catch the "80 meant as 80%" mistake', () => {
556+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 80 }).success).toBe(
557+
false,
558+
);
559+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 1.01 }).success).toBe(
560+
false,
561+
);
562+
});
563+
564+
it('accepts token_ratio values at the inclusive 0 and 1 bounds per docs', () => {
565+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 0 }).success).toBe(
566+
true,
567+
);
568+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 1 }).success).toBe(
569+
true,
570+
);
571+
});
572+
573+
it('allows remaining_tokens and messages_to_refine values above 1 (token/message counts)', () => {
574+
expect(
575+
summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 2000 }).success,
576+
).toBe(true);
577+
expect(
578+
summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 20 }).success,
579+
).toBe(true);
580+
});
581+
582+
it('rejects non-finite values (Infinity, NaN) for every trigger type', () => {
583+
for (const type of ['token_ratio', 'remaining_tokens', 'messages_to_refine'] as const) {
584+
expect(summarizationTriggerSchema.safeParse({ type, value: Infinity }).success).toBe(false);
585+
expect(summarizationTriggerSchema.safeParse({ type, value: -Infinity }).success).toBe(false);
586+
expect(summarizationTriggerSchema.safeParse({ type, value: NaN }).success).toBe(false);
587+
}
588+
});
589+
590+
it('requires integer values for count-based triggers', () => {
591+
expect(
592+
summarizationTriggerSchema.safeParse({ type: 'remaining_tokens', value: 500.5 }).success,
593+
).toBe(false);
594+
expect(
595+
summarizationTriggerSchema.safeParse({ type: 'messages_to_refine', value: 2.5 }).success,
596+
).toBe(false);
597+
});
598+
599+
it('still allows fractional values for token_ratio', () => {
600+
expect(summarizationTriggerSchema.safeParse({ type: 'token_ratio', value: 0.8 }).success).toBe(
601+
true,
602+
);
603+
});
604+
605+
it('parses inside the full summarization config', () => {
606+
const result = summarizationConfigSchema.safeParse({
607+
enabled: true,
608+
trigger: { type: 'token_ratio', value: 0.8 },
609+
});
610+
expect(result.success).toBe(true);
611+
});
612+
});

packages/data-provider/src/config.ts

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1020,10 +1020,20 @@ export const memorySchema = z.object({
10201020

10211021
export type TMemoryConfig = DeepPartial<z.infer<typeof memorySchema>>;
10221022

1023-
export const summarizationTriggerSchema = z.object({
1024-
type: z.enum(['token_count']),
1025-
value: z.number().positive(),
1026-
});
1023+
export const summarizationTriggerSchema = z.discriminatedUnion('type', [
1024+
z.object({
1025+
type: z.literal('token_ratio'),
1026+
value: z.number().finite().min(0).max(1),
1027+
}),
1028+
z.object({
1029+
type: z.literal('remaining_tokens'),
1030+
value: z.number().finite().int().positive(),
1031+
}),
1032+
z.object({
1033+
type: z.literal('messages_to_refine'),
1034+
value: z.number().finite().int().positive(),
1035+
}),
1036+
]);
10271037

10281038
export const contextPruningSchema = z.object({
10291039
enabled: z.boolean().optional(),
Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,80 @@
1+
import type { DeepPartial, TCustomConfig } from 'librechat-data-provider';
2+
import { loadSummarizationConfig } from './service';
3+
import logger from '~/config/winston';
4+
5+
jest.mock('~/config/winston', () => ({
6+
__esModule: true,
7+
default: {
8+
warn: jest.fn(),
9+
info: jest.fn(),
10+
error: jest.fn(),
11+
debug: jest.fn(),
12+
},
13+
}));
14+
15+
describe('loadSummarizationConfig', () => {
16+
const warnSpy = logger.warn as jest.Mock;
17+
18+
beforeEach(() => {
19+
warnSpy.mockClear();
20+
});
21+
22+
it('returns undefined when no summarization config is provided', () => {
23+
expect(loadSummarizationConfig({} as DeepPartial<TCustomConfig>)).toBeUndefined();
24+
});
25+
26+
it('accepts a valid token_ratio trigger', () => {
27+
const result = loadSummarizationConfig({
28+
summarization: {
29+
enabled: true,
30+
trigger: { type: 'token_ratio', value: 0.8 },
31+
},
32+
} as DeepPartial<TCustomConfig>);
33+
34+
expect(result).toBeDefined();
35+
expect(result?.enabled).toBe(true);
36+
expect(result?.trigger).toEqual({ type: 'token_ratio', value: 0.8 });
37+
expect(warnSpy).not.toHaveBeenCalled();
38+
});
39+
40+
it('emits a targeted migration warning when trigger.type is the legacy "token_count"', () => {
41+
const result = loadSummarizationConfig({
42+
summarization: {
43+
trigger: { type: 'token_count', value: 8000 },
44+
},
45+
} as unknown as DeepPartial<TCustomConfig>);
46+
47+
expect(result).toBeUndefined();
48+
expect(warnSpy).toHaveBeenCalledTimes(1);
49+
const message = String(warnSpy.mock.calls[0][0]);
50+
expect(message).toContain('token_count');
51+
expect(message).toContain('token_ratio');
52+
expect(message).toContain('remaining_tokens');
53+
expect(message).toContain('messages_to_refine');
54+
expect(message).toContain('fall back');
55+
});
56+
57+
it('falls back to the generic warning when trigger is a bare string (not an object)', () => {
58+
const result = loadSummarizationConfig({
59+
summarization: {
60+
trigger: 'token_count',
61+
},
62+
} as unknown as DeepPartial<TCustomConfig>);
63+
64+
expect(result).toBeUndefined();
65+
expect(warnSpy).toHaveBeenCalledTimes(1);
66+
expect(String(warnSpy.mock.calls[0][0])).toContain('Invalid summarization config');
67+
});
68+
69+
it('falls back to the generic warning for other schema violations', () => {
70+
const result = loadSummarizationConfig({
71+
summarization: {
72+
trigger: { type: 'token_ratio', value: 80 },
73+
},
74+
} as unknown as DeepPartial<TCustomConfig>);
75+
76+
expect(result).toBeUndefined();
77+
expect(warnSpy).toHaveBeenCalledTimes(1);
78+
expect(String(warnSpy.mock.calls[0][0])).toContain('Invalid summarization config');
79+
});
80+
});

packages/data-schemas/src/app/service.ts

Lines changed: 19 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,12 +15,30 @@ import { loadEndpoints } from './endpoints';
1515
import { loadOCRConfig } from './ocr';
1616
import logger from '~/config/winston';
1717

18-
function loadSummarizationConfig(config: DeepPartial<TCustomConfig>): AppConfig['summarization'] {
18+
export function loadSummarizationConfig(
19+
config: DeepPartial<TCustomConfig>,
20+
): AppConfig['summarization'] {
1921
const raw = config.summarization;
2022
if (!raw || typeof raw !== 'object') {
2123
return undefined;
2224
}
2325

26+
if (
27+
raw.trigger &&
28+
typeof raw.trigger === 'object' &&
29+
(raw.trigger as { type?: unknown }).type === 'token_count'
30+
) {
31+
logger.warn(
32+
"[AppService] `summarization.trigger.type: 'token_count'` is no longer supported. " +
33+
"Use 'token_ratio' (0-1), 'remaining_tokens' (positive integer), or " +
34+
"'messages_to_refine' (positive integer). Your `summarization` config will be " +
35+
'ignored and summarization will fall back to self-summarize defaults (the ' +
36+
"agent's own provider/model, fires on every pruning event) until this is " +
37+
'corrected.',
38+
);
39+
return undefined;
40+
}
41+
2442
const parsed = summarizationConfigSchema.safeParse(raw);
2543
if (!parsed.success) {
2644
logger.warn('[AppService] Invalid summarization config', parsed.error.flatten());

0 commit comments

Comments
 (0)