
feat: add vision capability flag to modelSpecs configuration #11501

Closed
JumpLink wants to merge 32 commits into danny-avila:main from faktenforum:feat/vision

Conversation


@JumpLink JumpLink commented Jan 24, 2026

Adds an optional vision boolean field to modelSpecs configuration to explicitly declare model vision support. This enables proper UI gating for image upload options based on model capabilities.

Related to: #11418 (partially addresses) and danny-avila/agents#48

Changes

  • Add vision?: boolean field to TModelSpec type and schema
  • Extend validateVisionModel() to check modelSpecs.vision first before falling back to the hardcoded list
  • Create useVisionModel() hook to centralize vision model detection logic
  • Update UI components (DragDropModal, AttachFileMenu) to conditionally show image upload options based on model vision capability
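The lookup order described in the changes above can be sketched as follows. This is an illustrative reconstruction, not the actual LibreChat source: the `TModelSpec` shape is simplified and the fallback list is a stand-in for the real hardcoded one.

```typescript
// Simplified modelSpec shape; the real type has more fields.
type TModelSpec = { name: string; vision?: boolean };

// Illustrative stand-in for the existing hardcoded vision model list.
const VISION_MODELS = ["gpt-4o", "gpt-4-turbo", "claude-3"];

function validateVisionModel(model: string, modelSpecs?: TModelSpec[]): boolean {
  // 1) Prefer an explicit declaration from librechat.yaml, when provided.
  const spec = modelSpecs?.find((s) => s.name === model);
  if (spec?.vision !== undefined) {
    return spec.vision;
  }
  // 2) Backward-compatible fallback to the hardcoded list.
  return VISION_MODELS.some((v) => model.includes(v));
}
```

An explicit `vision: false` in the spec wins over the hardcoded list, which is what makes UI gating reliable for models the list does not know about.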

Benefits

  • Enables proper UI gating: image upload options only appear for vision-capable models
  • Configuration-driven approach: model capabilities declared in librechat.yaml instead of hardcoded
  • Backward compatible: falls back to existing hardcoded list if modelSpecs not provided

Testing

  • Verify image upload options only appear for vision-capable models
  • Verify modelSpecs.vision configuration is respected
  • Verify fallback to hardcoded list works when modelSpecs not provided
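For reference, a hypothetical librechat.yaml excerpt using the new flag might look like this; model names and surrounding fields are illustrative, only the `vision` key is what this PR adds.

```yaml
modelSpecs:
  list:
    - name: "pixtral"
      label: "Pixtral (vision)"
      vision: true          # image upload options shown in the UI
      preset:
        endpoint: "Scaleway"
        model: "pixtral-12b-2409"
    - name: "llama-3.3"
      label: "Llama 3.3 (text only)"
      vision: false         # image upload options hidden
```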

- Add Scaleway to RECOGNIZED_PROVIDERS for improved MCP content formatting
- Add Scaleway detection for proper usage field handling (streamUsage: false, usage: true)
- Scaleway uses standard OpenAI reasoning_content format, no special handling needed

Scaleway custom endpoints are identified by endpoint name or baseURL containing 'scaleway' or 'api.scaleway.ai'.
LangChain may store usage data in response_metadata.usage instead of usage_metadata.
This change checks both locations and converts LangChain format to the expected format
when token data is present.

This improves compatibility with custom endpoints that use LangChain internally.
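The fallback described above can be sketched as a small extraction helper. The type names and field shapes here are assumptions about the LangChain message layout, not the exact LibreChat code.

```typescript
// LangChain-style usage block found under response_metadata.usage.
type LangChainUsage = { prompt_tokens?: number; completion_tokens?: number; total_tokens?: number };
// Canonical shape expected downstream.
type UsageMetadata = { input_tokens: number; output_tokens: number; total_tokens: number };

function extractUsage(msg: {
  usage_metadata?: UsageMetadata;
  response_metadata?: { usage?: LangChainUsage };
}): UsageMetadata | undefined {
  // Prefer the canonical field when present.
  if (msg.usage_metadata) return msg.usage_metadata;
  // Fall back to the LangChain location and convert field names.
  const u = msg.response_metadata?.usage;
  if (u?.prompt_tokens != null && u?.completion_tokens != null) {
    return {
      input_tokens: u.prompt_tokens,
      output_tokens: u.completion_tokens,
      total_tokens: u.total_tokens ?? u.prompt_tokens + u.completion_tokens,
    };
  }
  return undefined;
}
```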
- Generalize custom endpoint detection for usage field handling
  - Replace provider-specific checks with generic isCustomOpenAIEndpoint function
  - Automatically handles all custom endpoints (provider=OPENAI but endpoint name differs)
  - Removes need for explicit provider additions

- Improve MCP content formatting for custom endpoints
  - Add isRecognizedProvider helper function for clarity
  - Custom endpoints automatically recognized since they use 'openai' provider
  - Helps address MCP tool response formatting issues (LibreChat danny-avila#11494)

This change benefits all OpenAI-compatible custom endpoints, not just specific providers,
making the codebase more maintainable and reducing the need for provider-specific additions.
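The generic check described above can be sketched in a few lines; the function name matches the commit message, but the implementation details are an assumption.

```typescript
// A request targets a custom OpenAI-compatible endpoint when the provider
// is "openai" but the endpoint name is something else (e.g. "Scaleway").
function isCustomOpenAIEndpoint(provider: string, endpoint?: string): boolean {
  return (
    provider.toLowerCase() === "openai" &&
    endpoint != null &&
    endpoint.toLowerCase() !== "openai"
  );
}
```

This removes the need to add each new OpenAI-compatible provider explicitly: any endpoint routed through the `openai` provider under a different name is picked up automatically.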
- Removed redundant checks for usage data in LangChain responses, consolidating the logic to directly access usage_metadata.
- This change streamlines the code and improves readability while maintaining functionality.
- Add isCustomOpenAIEndpoint function to automatically detect custom endpoints
  for proper usage field handling (provider=OPENAI but endpoint name differs)
- Add Scaleway to RECOGNIZED_PROVIDERS for MCP content formatting
- Improves handling of MCP tool responses with structured content formatting

This change benefits all OpenAI-compatible custom endpoints by automatically
detecting them for usage field handling, while MCP formatting requires explicit
provider additions since custom endpoints are passed with their endpoint name.
Add `vision` boolean field to modelSpecs configuration to explicitly
declare model vision support. This enables proper filtering of image
artifacts for non-vision models and UI gating for image upload options.

- Add vision field to TModelSpec type/schema
- Extend validateVisionModel() to check modelSpecs first
- Pass modelSpecs from API to agents package
- Update UI components to use vision capability check
- Removed direct calls to validateVisionModel in AttachFileMenu and DragDropModal components.
- Introduced useVisionModel hook to encapsulate vision model validation logic.
- Updated imports to reflect the new hook usage, improving code modularity and readability.
- Remove modelSpecs parameter from createRun() function
- Remove modelSpecs conversion logic (handled by agent-level vision toggle)
- Remove modelSpecs from createRun() call in client.js
- This keeps PR 11501 focused on modelSpecs vision for UI gating only
JumpLink added a commit to faktenforum/LibreChat that referenced this pull request Jan 24, 2026
- Add vision to AgentCapabilities enum and default capabilities
- Add vision?: boolean field to Agent type and validation schema
- Add vision toggle UI component for agents with hover card and info description
- Include vision in agent create/update payload
- Pass vision from agent to AgentInputs in run API

Depends on PR danny-avila#11501 (modelSpecs vision) for validateVisionModel function
@JumpLink JumpLink marked this pull request as ready for review January 24, 2026 18:05
Automatically recognize and format MCP tool responses for all OpenAI-compatible
custom endpoints without requiring explicit additions. Uses negative list
(NON_OPENAI_PROVIDERS) instead of maintaining positive list for each new provider.
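The negative-list approach might look like the following sketch; the set contents are illustrative, only the `NON_OPENAI_PROVIDERS` name comes from the commit message.

```typescript
// Instead of enumerating every OpenAI-compatible provider, list only the
// providers that are NOT OpenAI-compatible and recognize everything else.
const NON_OPENAI_PROVIDERS = new Set(["anthropic", "google", "bedrock"]);

function isRecognizedProvider(provider: string): boolean {
  return !NON_OPENAI_PROVIDERS.has(provider.toLowerCase());
}
```

New custom endpoints then work out of the box, since an unknown provider name is treated as OpenAI-compatible by default.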
@JumpLink JumpLink marked this pull request as draft January 25, 2026 06:25
JumpLink added a commit to faktenforum/LibreChat that referenced this pull request Jan 26, 2026
- Implement automatic detection of vision capability based on model specifications.
- Update agent configuration to auto-set vision based on model changes.
- Introduce artifact processing for MCP tools, ensuring proper handling of image URLs and base64 data.
- Refactor related components to utilize new vision validation logic and improve modularity.
- Update UI elements to reflect changes in vision capability handling and provide clearer user guidance.
- Removed unnecessary blank lines in the createToolEndCallback function to improve code readability and maintainability.
- Remove redundant result processing in MCP.js - formatToolContent already returns correct tuple format
- Add debug logging in run.ts to diagnose vision capability detection issues
- Improve code clarity by removing workaround code
- Change MCP.js to directly return the result from mcpManager.callTool, enhancing clarity.
- Remove console logs in run.ts related to vision capability detection to streamline the code.
- Update AgentClient to conditionally handle image URLs and attachments based on vision capability.
- Modify AssistantService to check for image file types before processing artifact messages.
- Refactor ToolService to improve vision capability validation and ensure proper handling of artifacts for non-vision models.
- Clarify documentation regarding vision capabilities and processing behavior for better understanding.
…processing

- Simplify image URL handling in AgentClient by removing unnecessary checks when vision is disabled.
- Enhance AssistantService to use a boolean flag for determining if file IDs should be attached to artifact messages.
- Add validation for max_tokens in createRun to ensure it is always set to a valid value, preventing potential errors from invalid configurations.
…nt models

- Update loadEphemeralAgent and loadAddedAgent functions to prioritize model specifications for vision and spec attributes.
- Modify determineVisionCapability to incorporate spec-based vision detection, improving clarity and functionality.
- Refactor createRun to ensure valid max_tokens handling, enhancing robustness against invalid configurations.
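The `max_tokens` validation mentioned above could take a shape like this defensive helper; the function name and default value are assumptions for illustration.

```typescript
// Coerce invalid max_tokens values (undefined, NaN, non-positive, or
// non-numeric config input) to a safe fallback before calling the API.
function resolveMaxTokens(value: unknown, fallback = 4096): number {
  const n = typeof value === "number" ? value : Number(value);
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : fallback;
}
```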
@JumpLink JumpLink marked this pull request as ready for review February 11, 2026 10:19
…ents

- Enhanced the agent client to validate vision capabilities based on agent settings and model specifications.
- Updated AttachFileMenu and DragDropModal components to utilize the new vision capability checks, ensuring proper handling of image uploads.
- Introduced visionEnabledByAgent in useAgentToolPermissions hook to streamline permission checks across components.
# Conflicts:
#	api/models/Agent.js
#	api/models/loadAddedAgent.js
#	api/server/controllers/agents/client.js
#	api/server/services/MCP.js
#	api/server/services/ToolService.js
#	packages/api/src/agents/run.ts
