feat: add vision capability flag to modelSpecs configuration #11501
Closed
JumpLink wants to merge 32 commits into danny-avila:main from
Conversation
- Add Scaleway to RECOGNIZED_PROVIDERS for improved MCP content formatting
- Add Scaleway detection for proper usage field handling (streamUsage: false, usage: true)
- Scaleway uses the standard OpenAI reasoning_content format, so no special handling is needed

Scaleway custom endpoints are identified by an endpoint name or baseURL containing 'scaleway' or 'api.scaleway.ai'.
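The detection described above could be sketched roughly as follows. This is a hypothetical illustration; the actual helper name and option shape in LibreChat may differ.

```javascript
// Hypothetical sketch: detect Scaleway custom endpoints by endpoint name
// or baseURL, then choose usage-field options accordingly.
function isScalewayEndpoint({ endpoint = '', baseURL = '' } = {}) {
  const name = endpoint.toLowerCase();
  const url = baseURL.toLowerCase();
  return (
    name.includes('scaleway') ||
    url.includes('scaleway') ||
    url.includes('api.scaleway.ai')
  );
}

// Scaleway reports token usage in the final response body rather than in
// stream chunks, so streamed usage is disabled while accounting stays on.
function getUsageOptions(options) {
  if (isScalewayEndpoint(options)) {
    return { streamUsage: false, usage: true };
  }
  return { streamUsage: true, usage: true };
}
```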
LangChain may store usage data in response_metadata.usage instead of usage_metadata. This change checks both locations and converts LangChain format to the expected format when token data is present. This improves compatibility with custom endpoints that use LangChain internally.
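The two-location lookup described above could look roughly like this. The function name and the exact conversion are illustrative, not LibreChat's actual code; it assumes the OpenAI wire format (`prompt_tokens`/`completion_tokens`) inside `response_metadata.usage`.

```javascript
// Hypothetical sketch: prefer LangChain's usage_metadata, but fall back to
// response_metadata.usage (OpenAI-style field names) when present.
function extractUsageMetadata(message) {
  if (message.usage_metadata != null) {
    return message.usage_metadata;
  }
  const usage = message.response_metadata?.usage;
  if (usage?.prompt_tokens != null || usage?.completion_tokens != null) {
    // Convert OpenAI-style field names to the LangChain shape.
    const input = usage.prompt_tokens ?? 0;
    const output = usage.completion_tokens ?? 0;
    return {
      input_tokens: input,
      output_tokens: output,
      total_tokens: usage.total_tokens ?? input + output,
    };
  }
  return undefined;
}
```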
- Generalize custom endpoint detection for usage field handling
  - Replace provider-specific checks with a generic isCustomOpenAIEndpoint function
  - Automatically handles all custom endpoints (provider=OPENAI but endpoint name differs)
  - Removes the need for explicit provider additions
- Improve MCP content formatting for custom endpoints
  - Add isRecognizedProvider helper function for clarity
  - Custom endpoints are automatically recognized since they use the 'openai' provider
  - Helps address MCP tool response formatting issues (LibreChat danny-avila#11494)

This change benefits all OpenAI-compatible custom endpoints, not just specific providers, making the codebase more maintainable and reducing the need for provider-specific additions.
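The generic detection could be sketched as below. The set of built-in OpenAI endpoint names is an assumption for illustration; LibreChat's actual constants differ.

```javascript
// Hypothetical sketch: a custom endpoint uses the generic 'openai' provider
// under the hood while carrying its own endpoint name, so any endpoint whose
// name is not a built-in OpenAI endpoint is treated as custom.
const KNOWN_OPENAI_ENDPOINTS = new Set(['openai', 'azureopenai']); // illustrative

function isCustomOpenAIEndpoint(provider, endpoint) {
  return (
    provider === 'openai' &&
    endpoint != null &&
    !KNOWN_OPENAI_ENDPOINTS.has(endpoint.toLowerCase())
  );
}
```

With this shape, a new OpenAI-compatible provider such as Scaleway is picked up automatically instead of needing an explicit addition.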
- Removed redundant checks for usage data in LangChain responses, consolidating the logic to directly access usage_metadata.
- This change streamlines the code and improves readability while maintaining functionality.
- Add isCustomOpenAIEndpoint function to automatically detect custom endpoints for proper usage field handling (provider=OPENAI but endpoint name differs)
- Add Scaleway to RECOGNIZED_PROVIDERS for MCP content formatting
- Improves handling of MCP tool responses with structured content formatting

This change benefits all OpenAI-compatible custom endpoints by automatically detecting them for usage field handling, while MCP formatting requires explicit provider additions since custom endpoints are passed with their endpoint name.
Add `vision` boolean field to modelSpecs configuration to explicitly declare model vision support. This enables proper filtering of image artifacts for non-vision models and UI gating for image upload options.

- Add vision field to TModelSpec type/schema
- Extend validateVisionModel() to check modelSpecs first
- Pass modelSpecs from API to agents package
- Update UI components to use vision capability check
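The lookup order described above, explicit `vision` flag first, hardcoded list as fallback, could be sketched like this. The spec shape (`preset.model`) and the fallback model list are illustrative assumptions, not LibreChat's actual definitions.

```javascript
// Illustrative fallback list; the real hardcoded list is much longer.
const VISION_MODELS = ['gpt-4o', 'gpt-4-turbo', 'claude-3'];

// Hypothetical sketch of validateVisionModel() with modelSpecs support.
function validateVisionModel({ model, modelSpecs }) {
  // 1. An explicit declaration in modelSpecs wins, including vision: false.
  const spec = modelSpecs?.find((s) => s.preset?.model === model);
  if (spec?.vision != null) {
    return spec.vision;
  }
  // 2. Otherwise fall back to the hardcoded substring match.
  return VISION_MODELS.some((v) => model?.includes(v));
}
```

Note that an explicit `vision: false` overrides the fallback list, which is what lets the UI hide image uploads for models the hardcoded list would otherwise match.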
- Removed direct calls to validateVisionModel in AttachFileMenu and DragDropModal components.
- Introduced useVisionModel hook to encapsulate vision model validation logic.
- Updated imports to reflect the new hook usage, improving code modularity and readability.
- Remove modelSpecs parameter from createRun() function
- Remove modelSpecs conversion logic (handled by agent-level vision toggle)
- Remove modelSpecs from createRun() call in client.js

This keeps PR 11501 focused on modelSpecs vision for UI gating only.
JumpLink added a commit to faktenforum/LibreChat that referenced this pull request on Jan 24, 2026
- Add vision to AgentCapabilities enum and default capabilities
- Add vision?: boolean field to Agent type and validation schema
- Add vision toggle UI component for agents with hover card and info description
- Include vision in agent create/update payload
- Pass vision from agent to AgentInputs in run API

Depends on PR danny-avila#11501 (modelSpecs vision) for the validateVisionModel function
Automatically recognize and format MCP tool responses for all OpenAI-compatible custom endpoints without requiring explicit additions. Uses a negative list (NON_OPENAI_PROVIDERS) instead of maintaining a positive list for each new provider.
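The negative-list approach could look roughly like this. The provider names in the set are illustrative assumptions; the real NON_OPENAI_PROVIDERS contents may differ.

```javascript
// Hypothetical sketch: instead of listing every OpenAI-compatible provider,
// enumerate only providers with genuinely different wire formats and treat
// everything else (including unknown custom endpoints) as OpenAI-compatible.
const NON_OPENAI_PROVIDERS = new Set(['anthropic', 'google', 'bedrock']); // illustrative

function isRecognizedProvider(provider) {
  return !NON_OPENAI_PROVIDERS.has(provider?.toLowerCase() ?? '');
}
```

The design trade-off: new OpenAI-compatible providers work with zero maintenance, at the cost of having to remember to add any genuinely incompatible provider to the negative list.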
JumpLink added a commit to faktenforum/LibreChat that referenced this pull request on Jan 26, 2026
- Add vision to AgentCapabilities enum and default capabilities
- Add vision?: boolean field to Agent type and validation schema
- Add vision toggle UI component for agents with hover card and info description
- Include vision in agent create/update payload
- Pass vision from agent to AgentInputs in run API

Depends on PR danny-avila#11501 (modelSpecs vision) for the validateVisionModel function
- Implement automatic detection of vision capability based on model specifications.
- Update agent configuration to auto-set vision based on model changes.
- Introduce artifact processing for MCP tools, ensuring proper handling of image URLs and base64 data.
- Refactor related components to utilize new vision validation logic and improve modularity.
- Update UI elements to reflect changes in vision capability handling and provide clearer user guidance.
- Removed unnecessary blank lines in the createToolEndCallback function to improve code readability and maintainability.
- Remove redundant result processing in MCP.js; formatToolContent already returns the correct tuple format
- Add debug logging in run.ts to diagnose vision capability detection issues
- Improve code clarity by removing workaround code
- Change MCP.js to directly return the result from mcpManager.callTool, enhancing clarity.
- Remove console logs in run.ts related to vision capability detection to streamline the code.
- Update AgentClient to conditionally handle image URLs and attachments based on vision capability.
- Modify AssistantService to check for image file types before processing artifact messages.
- Refactor ToolService to improve vision capability validation and ensure proper handling of artifacts for non-vision models.
- Clarify documentation regarding vision capabilities and processing behavior for better understanding.
…processing

- Simplify image URL handling in AgentClient by removing unnecessary checks when vision is disabled.
- Enhance AssistantService to use a boolean flag for determining if file IDs should be attached to artifact messages.
- Add validation for max_tokens in createRun to ensure it is always set to a valid value, preventing potential errors from invalid configurations.
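The max_tokens guard mentioned in the last bullet could be sketched as below. The function name and the default fallback value are illustrative assumptions, not the actual createRun code.

```javascript
// Hypothetical sketch: coerce max_tokens to a positive integer, falling back
// to a default when the configured value is missing or invalid.
function normalizeMaxTokens(maxTokens, fallback = 4096) {
  const n = Number(maxTokens);
  if (!Number.isFinite(n) || !Number.isInteger(n) || n <= 0) {
    return fallback;
  }
  return n;
}
```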
…nt models

- Update loadEphemeralAgent and loadAddedAgent functions to prioritize model specifications for vision and spec attributes.
- Modify determineVisionCapability to incorporate spec-based vision detection, improving clarity and functionality.
- Refactor createRun to ensure valid max_tokens handling, enhancing robustness against invalid configurations.
…ents

- Enhanced the agent client to validate vision capabilities based on agent settings and model specifications.
- Updated AttachFileMenu and DragDropModal components to utilize the new vision capability checks, ensuring proper handling of image uploads.
- Introduced visionEnabledByAgent in useAgentToolPermissions hook to streamline permission checks across components.
# Conflicts:
# api/models/Agent.js
# api/models/loadAddedAgent.js
# api/server/controllers/agents/client.js
# api/server/services/MCP.js
# api/server/services/ToolService.js
# packages/api/src/agents/run.ts
Adds an optional `vision` boolean field to `modelSpecs` configuration to explicitly declare model vision support. This enables proper UI gating for image upload options based on model capabilities.

Related to: #11418 (partially addresses) and danny-avila/agents#48
Changes
- Add `vision?: boolean` field to `TModelSpec` type and schema
- Extend `validateVisionModel()` to check `modelSpecs.vision` first before falling back to the hardcoded list
- Add `useVisionModel()` hook to centralize vision model detection logic
- Update UI components (`DragDropModal`, `AttachFileMenu`) to conditionally show image upload options based on model vision capability

Benefits

- Vision support can be declared per model in `librechat.yaml` instead of relying on a hardcoded model list
- Backward compatible: falls back to existing detection when `modelSpecs` is not provided

Testing

- Verified that the `modelSpecs.vision` configuration is respected
- Verified fallback behavior when `modelSpecs` is not provided
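A `modelSpecs` entry using the new flag might look like the following. The endpoint and model names are illustrative; only the `vision` field is the addition this PR proposes, and the surrounding structure follows the existing librechat.yaml modelSpecs schema.

```yaml
modelSpecs:
  list:
    - name: my-vision-model
      label: My vision model
      vision: true            # new flag: model accepts image input
      preset:
        endpoint: Scaleway
        model: example-vision-model
    - name: my-text-model
      label: My text-only model
      vision: false           # image upload options hidden in the UI
      preset:
        endpoint: Scaleway
        model: example-text-model
```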