
fix: set default maxRetries to 0 for OpenAI endpoint to prevent retry delays (fixes #12547)#12666

Draft
armorbreak001 wants to merge 1 commit into danny-avila:main from armorbreak001:fix/configurable-max-retries

Conversation

@armorbreak001

Background

When an OpenAI-compatible API returns an error (e.g., 503), LangChain's AsyncCaller retries up to 6 times with exponential backoff, causing a ~2 minute delay before the user sees the error message. This happens because the default maxRetries in @langchain/core is 6.

While @librechat/agents already sets maxRetries: 0 on the OpenAI SDK client itself, the outer LangChain layer still retries via caller.call(...).
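To make the cost of the outer retry layer concrete, here is a minimal TypeScript sketch of a retry loop with exponential backoff. It is not LangChain's actual AsyncCaller; the 1s-base, doubling delay schedule is an assumption for illustration, and delays are tracked rather than slept so the sketch runs instantly:

```typescript
// A call that always fails, like an upstream endpoint returning 503.
type Call = () => string;

function callWithRetries(
  call: Call,
  maxRetries: number,
): { attempts: number; simulatedDelayMs: number } {
  let attempts = 0;
  let simulatedDelayMs = 0;
  for (let i = 0; i <= maxRetries; i++) {
    attempts++;
    try {
      call();
      return { attempts, simulatedDelayMs };
    } catch {
      if (i === maxRetries) break;
      // Exponential backoff: 1s, 2s, 4s, ... (accumulated, not slept).
      simulatedDelayMs += 1000 * 2 ** i;
    }
  }
  return { attempts, simulatedDelayMs };
}

const alwaysFails: Call = () => {
  throw new Error("503 Service Unavailable");
};
```

With `maxRetries: 6` this makes 7 attempts and accumulates 63s of backoff alone, before per-attempt request time is even counted; with `maxRetries: 0` the error surfaces after a single attempt.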

Solution

Set maxRetries: 0 as the default in the OpenAI LLM configuration. This disables the outer LangChain retry layer by default, so error responses are returned immediately to the user.

The value can still be overridden:

  • Via modelOptions.maxRetries in conversation settings
  • Via customParams.defaultParams with { key: "maxRetries", default: N }

Changes

  • packages/api/src/endpoints/openai/llm.ts: Added maxRetries: 0 to the default llmConfig object in getOpenAILLMConfig(). Since Object.assign is used, any explicit maxRetries value from modelOptions or defaultParams will override this default.
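A hypothetical sketch of the relevant shape of `getOpenAILLMConfig()` (names and types simplified; not the exact LibreChat source) shows why the default stays overridable:

```typescript
interface LLMConfig {
  model?: string;
  maxRetries?: number;
  [key: string]: unknown;
}

// Simplified stand-in for getOpenAILLMConfig().
function getLLMConfig(modelOptions: LLMConfig): LLMConfig {
  const llmConfig: LLMConfig = {
    // Default to 0 retries to avoid the outer LangChain retry delays.
    maxRetries: 0,
  };
  // Object.assign lets later sources win: an explicit maxRetries in
  // modelOptions replaces the default above.
  Object.assign(llmConfig, modelOptions);
  return llmConfig;
}
```

So `getLLMConfig({})` yields `maxRetries: 0`, while `getLLMConfig({ maxRetries: 2 })` yields `maxRetries: 2`.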

Verification

  1. Send a request that triggers a 503 error → error message should appear immediately (not after ~2 minutes)
  2. Set maxRetries: 2 in model options → verify retries still work as expected
  3. Verify existing functionality is unchanged for normal (non-error) requests

Closes #12547

@danny-avila danny-avila marked this pull request as draft April 15, 2026 11:39
/** Default to 0 retries to avoid long delays from LangChain's
* exponential backoff (up to ~2 min with default maxRetries=6).
* Can be overridden via modelOptions or customParams.defaultParams. */
maxRetries: 0,
Owner


This would only resolve #12547 if you made it configurable

@armorbreak001
Author

Thanks for the review @danny-avila!

To clarify: maxRetries is already configurable per endpoint via the existing customParams system. It's listed in knownOpenAIParams, so users can set it through the endpoint settings UI:

Settings → Endpoints → [endpoint] → Advanced → Custom Parameters → defaultParams:

{
  "maxRetries": 3
}

The code flow is:

  1. Base default: maxRetries: 0 (my change, prevents 2-min delays)
  2. Override via modelOptions (model-specific config)
  3. Override via defaultParams (endpoint-level custom params)
  4. Override via addParams (per-request params)

So the user can configure maxRetries to any value they want per endpoint. The default of 0 just prevents the surprising 2-minute delay for users who don't explicitly configure it.
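The four-level precedence above can be sketched as a single ordered merge (illustrative only; the actual LibreChat merge logic lives across the endpoint config code):

```typescript
interface Params {
  maxRetries?: number;
  [key: string]: unknown;
}

// Later arguments override earlier ones, so per-request addParams wins,
// then endpoint-level defaultParams, then modelOptions, then the base default.
function resolveParams(
  modelOptions: Params,
  defaultParams: Params,
  addParams: Params,
): Params {
  return Object.assign(
    { maxRetries: 0 }, // 1. base default
    modelOptions,      // 2. model-specific config
    defaultParams,     // 3. endpoint-level custom params
    addParams,         // 4. per-request params
  );
}
```

Only users who set nothing anywhere get the new `maxRetries: 0` default; any explicit value at any level takes precedence.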

That said, I'm happy to make the default itself configurable (e.g., via env var) if you'd prefer. Let me know!



Development

Successfully merging this pull request may close these issues.

[Enhancement]: Configurable maxRetries for LangChain Outer Retry Layer per Endpoint
