
feat: add Bedrock prompt cache TTL config #12875

Open
SharpLu wants to merge 1 commit into danny-avila:main from SharpLu:feat/bedrock-cache-ttl

Conversation

Contributor

SharpLu commented Apr 29, 2026

Summary

Bedrock prompt caching now supports both 5-minute and 1-hour cache checkpoint TTLs. This PR adds an optional endpoints.bedrock.promptCacheTtl YAML config so LibreChat admins can choose 5m or 1h; when the setting is omitted, LibreChat does not send a TTL and Bedrock keeps the existing 5-minute default behavior.

This is useful for longer Bedrock/Claude sessions where users reuse large system prompts, tools, or reference context across turns that may be more than five minutes apart. The 1-hour TTL lets supported Bedrock Claude 4.5 models retain those cache checkpoints longer when the admin explicitly opts in.

Changes

  • Add Bedrock endpoint config validation for promptCacheTtl: "5m" | "1h"
  • Pass promptCacheTtl from librechat.yaml into Bedrock llmConfig
  • Preserve 1h only for Claude 4.5 Bedrock model IDs, and strip stale/unsupported TTL values so unsupported models fall back to Bedrock's default 5-minute behavior
  • Preserve explicit 5m for prompt-cache-supported Claude/Nova models
  • Strip TTL when prompt caching is disabled or when the selected model does not support Bedrock prompt caching
  • Add conversation/preset schema support for the new value
  • Document the YAML option in librechat.example.yaml and the Helm configYamlContent example
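The TTL normalization rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function and helper names, and the model-ID regexes, are assumptions standing in for the real matchers in `packages/data-provider/src/bedrock.ts`.

```typescript
type PromptCacheTtl = '5m' | '1h';

// Illustrative model-ID checks (assumed, not the PR's exact matchers):
// only Claude 4.5 Bedrock model IDs support the 1h checkpoint TTL,
// while the Claude and Nova families support prompt caching in general.
const supportsOneHourTtl = (modelId: string): boolean =>
  /claude-(?:sonnet|opus|haiku)-4-5/.test(modelId);

const supportsPromptCache = (modelId: string): boolean =>
  /claude|nova/.test(modelId);

function normalizePromptCacheTtl(
  modelId: string,
  promptCache: boolean,
  ttl?: PromptCacheTtl,
): PromptCacheTtl | undefined {
  // Strip the TTL when caching is off, unset, or the model lacks prompt-cache
  // support, so Bedrock falls back to its default 5-minute behavior.
  if (!promptCache || ttl === undefined || !supportsPromptCache(modelId)) {
    return undefined;
  }
  // Preserve 1h only for Claude 4.5 model IDs; strip it elsewhere.
  if (ttl === '1h' && !supportsOneHourTtl(modelId)) {
    return undefined;
  }
  // An explicit 5m is preserved for any prompt-cache-supported model.
  return ttl;
}
```

Returning `undefined` rather than coercing to `'5m'` matters here: omitting the TTL leaves Bedrock on its default behavior, which is what the PR promises when the setting is absent.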

Deployment configuration

Docker/Compose users can set this in the mounted librechat.yaml file. Helm users can set the same YAML under librechat.configYamlContent, or provide an existing ConfigMap through librechat.existingConfigYaml with a librechat.yaml key.

endpoints:
  bedrock:
    models:
      - "anthropic.claude-sonnet-4-5-20250929-v1:0"
    promptCacheTtl: "1h"
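For Helm, the same block can be embedded via the chart value mentioned above. A sketch of a values file, assuming the chart's `librechat.configYamlContent` takes the raw `librechat.yaml` content as a multiline string:

```yaml
# values.yaml (sketch): embed librechat.yaml content through the chart value
librechat:
  configYamlContent: |
    endpoints:
      bedrock:
        models:
          - "anthropic.claude-sonnet-4-5-20250929-v1:0"
        promptCacheTtl: "1h"
```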

Testing

  • cd packages/data-provider && npx jest specs/bedrock.spec.ts src/config.spec.ts --runInBand --coverage=false
  • npm run build:data-provider
  • cd packages/api && npx jest src/endpoints/bedrock/initialize.spec.ts --runInBand --coverage=false
  • npm run build:data-schemas
  • helm lint helm/librechat
  • helm template bedrock-test helm/librechat -f <values-with-bedrock-promptCacheTtl.yaml>
  • docker compose -f docker-compose.yml -f docker-compose.override.yml config
  • docker compose -f deploy-compose.yml config
  • git diff --check

Copilot AI review requested due to automatic review settings (April 29, 2026 10:17)
@SharpLu force-pushed the feat/bedrock-cache-ttl branch from 44d18ec to 96286dd on April 29, 2026 10:19

Copilot AI left a comment


Pull request overview

Adds an optional Bedrock prompt-cache checkpoint TTL configuration (5m or 1h) that can be set via endpoints.bedrock.promptCacheTtl, validates it, threads it through Bedrock initialization, and ensures stale TTL values are stripped when prompt caching is disabled/unsupported.

Changes:

  • Extend conversation/preset types + schemas to allow promptCacheTtl: '5m' | '1h'.
  • Add Bedrock endpoint config validation for promptCacheTtl and pass it into Bedrock llmConfig.
  • Normalize Bedrock prompt-cache options to preserve TTL only for Claude/Nova models and strip TTL when promptCache is false/unsupported; add tests + document YAML option.
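The endpoint-config validation in that list amounts to accepting only the two supported TTL strings. A hedged sketch in plain TypeScript (the project actually uses a zod enum in `packages/data-provider/src/config.ts`; the function name here is illustrative):

```typescript
type PromptCacheTtl = '5m' | '1h';

// Illustrative validator: accepts only the two supported TTL strings.
// The real project expresses this as z.enum(['5m', '1h']).optional().
function parsePromptCacheTtl(value: unknown): PromptCacheTtl | undefined {
  if (value === undefined) {
    // Omitted -> no TTL sent, Bedrock keeps its default 5-minute behavior.
    return undefined;
  }
  if (value === '5m' || value === '1h') {
    return value;
  }
  throw new Error(
    `Invalid promptCacheTtl: ${JSON.stringify(value)}; expected "5m" or "1h"`,
  );
}
```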

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

File-by-file summary:

  • packages/data-schemas/src/types/convo.ts: Adds promptCacheTtl to conversation TS typing.
  • packages/data-schemas/src/schema/preset.ts: Adds promptCacheTtl to preset TS typing.
  • packages/data-schemas/src/schema/defaults.ts: Adds Mongoose schema enum for promptCacheTtl.
  • packages/data-provider/src/types.ts: Allows promptCacheTtl in endpoint option typing surface.
  • packages/data-provider/src/schemas.ts: Accepts promptCacheTtl in conversation/query validation schemas.
  • packages/data-provider/src/config.ts: Validates Bedrock endpoint config promptCacheTtl (5m/1h).
  • packages/data-provider/src/config.spec.ts: Tests Bedrock endpoint schema accepts/rejects TTL values.
  • packages/data-provider/src/bedrock.ts: Normalizes prompt cache + TTL behavior for supported models and strips stale TTL.
  • packages/data-provider/specs/bedrock.spec.ts: Adds tests for preserving/stripping promptCacheTtl behavior.
  • packages/api/src/types/bedrock.ts: Adds BedrockPromptCacheTtl type + config fields.
  • packages/api/src/endpoints/bedrock/initialize.ts: Threads endpoint-config TTL into Bedrock request/llmConfig.
  • packages/api/src/endpoints/bedrock/initialize.spec.ts: Tests promptCacheTtl is present only when configured.
  • librechat.example.yaml: Documents the new endpoints.bedrock.promptCacheTtl option.


Comment on lines 775 to +776
promptCache: z.boolean().optional(),
promptCacheTtl: z.enum(['5m', '1h']).optional(),

Copilot AI Apr 29, 2026


promptCacheTtl is documented as part of the /* Anthropic */ block, but this TTL is Bedrock-specific (per PR description and usage in bedrockInputParser). This comment/section placement is misleading; please either move promptCacheTtl to the AWS Bedrock section of the schema or update the comment to reflect Bedrock usage.

Suggested change:
  - promptCache: z.boolean().optional(),
  - promptCacheTtl: z.enum(['5m', '1h']).optional(),
  + promptCache: z.boolean().optional(),
  + /* AWS Bedrock */
  + promptCacheTtl: z.enum(['5m', '1h']).optional(),
  + /* Anthropic */

Comment on lines 51 to +53
// Anthropic-specific
| 'promptCache'
| 'promptCacheTtl'

Copilot AI Apr 29, 2026


promptCacheTtl is listed under the // Anthropic-specific grouping, but it's Bedrock-only (whereas promptCache applies to both Anthropic and Bedrock). Please adjust the grouping/comment so future readers don’t assume this option is supported by the Anthropic endpoint.

Suggested change:
  - // Anthropic-specific
  - | 'promptCache'
  - | 'promptCacheTtl'
  + // Anthropic/Bedrock
  + | 'promptCache'
  + // Bedrock-specific
  + | 'promptCacheTtl'
  + // Anthropic-specific
Comment on lines 76 to +83
/* Anthropic only */
promptCache: {
type: Boolean,
},
promptCacheTtl: {
type: String,
enum: ['5m', '1h'],
},

Copilot AI Apr 29, 2026


The new promptCacheTtl field is placed under the /* Anthropic only */ section, but the TTL is Bedrock-specific. This comment/section placement is misleading; please move promptCacheTtl to a Bedrock section (or rename the comment to reflect shared/Bedrock usage).

@SharpLu force-pushed the feat/bedrock-cache-ttl branch from 96286dd to 9835757 on April 29, 2026 10:36
@SharpLu force-pushed the feat/bedrock-cache-ttl branch from 9835757 to e5ff82f on May 8, 2026 08:32
Contributor Author

SharpLu commented May 8, 2026

Hey @danny-avila, friendly bump on this one — just rebased onto main, CI is green.

The PR adds an optional endpoints.bedrock.promptCacheTtl: "5m" | "1h" in librechat.yaml so admins can opt into the new 1-hour Bedrock prompt cache TTL for Claude 4.5 models. When unset, behavior is unchanged (Bedrock defaults to 5 minutes). Useful for long agent sessions where reusing large system prompts/tools across turns more than 5 minutes apart would otherwise let the cache checkpoints expire.

Depends on danny-avila/agents#123 (also rebased now) for the runtime side.

Happy to address any feedback whenever you have a moment. Thanks!
