diff --git a/docs/features/authentication-access/rbac/groups.md b/docs/features/authentication-access/rbac/groups.md index ce077c3de..e6db919e6 100644 --- a/docs/features/authentication-access/rbac/groups.md +++ b/docs/features/authentication-access/rbac/groups.md @@ -84,3 +84,18 @@ For example, granting the "Marketing" group read access and a specific editor us * **Read**: Users can view and use the resource. * **Write**: Users can update or delete the resource. + +### Previewing Access (Audit) + +When access grants span many groups and resources, it's easy to lose track of who can see what. Open WebUI ships an admin-only **Preview Access** view that resolves every access grant for a specific user or group and lists the result in one place — no need to crawl through individual resource pages. + +**For a user** — In **Admin Panel > Users**, hover over a non-admin user row and click the eye-style **Preview Access** button. The modal shows every model, knowledge base, and tool the user can read, aggregated across all of their group memberships and any direct user grants. + +**For a group** — In **Admin Panel > Users > Groups**, open the group editor and use the **Preview Group Access** panel. The output is the same shape (models, knowledge, tools), scoped to just that group's grants. + +Both views are admin-only and read-only — they reflect what the access-grant table currently says without modifying it. Use them after a permission change to confirm the result matches intent, or as part of a periodic RBAC audit. 
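For scripted audits, the user preview can also be requested over HTTP. The sketch below only builds the request for the `GET /api/v1/users/{user_id}/preview` endpoint listed under "Programmatic equivalents". The base URL, the use of an `Authorization: Bearer` header for the admin credential, and the helper name `build_preview_request` are assumptions for illustration; the response shape is not documented here, so the sketch does not parse it.

```python
import urllib.request


def build_preview_request(base_url: str, user_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a user's access preview.

    Assumptions: `base_url` points at an Open WebUI instance and `token`
    is an admin credential accepted as a bearer token.
    """
    # Normalize the base URL so a trailing slash does not produce "//api".
    url = f"{base_url.rstrip('/')}/api/v1/users/{user_id}/preview"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Both preview endpoints are admin-only, so a non-admin credential should be expected to fail.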
+ +Programmatic equivalents: + +- `GET /api/v1/users/{user_id}/preview` — user view (admin auth required) +- `GET /api/v1/groups/id/{id}/preview` — group view (admin auth required) diff --git a/docs/features/channels/index.md b/docs/features/channels/index.md index 0287854ee..08c75e004 100644 --- a/docs/features/channels/index.md +++ b/docs/features/channels/index.md @@ -81,9 +81,14 @@ Mentioning a model in a channel runs through the same chat-completion pipeline a | **User tools and MCP tools** | Whatever the model is configured to call, it can call | | **Filters** | Inlet/outlet/stream filters apply just like in chats | | **Knowledge (RAG)** | Knowledge bases attached to the model are queried and injected | +| **Attached documents** | Images **and** non-image files (PDF, DOCX, etc.) uploaded in the thread are forwarded into the model's context | In other words, a channel-summoned model is a fully-equipped agent — not a one-shot completion. +:::note Document attachments in channels (v0.9.6+) +Before v0.9.6, tagging a model in a channel only forwarded **images** from the thread — uploaded PDFs, DOCX, and other non-image documents were ignored, so summarization and document-comparison prompts silently had nothing to work with. As of v0.9.6 these files are forwarded the same way they are in a direct chat, so document workflows behave identically in channels. +::: + ### Tagging people and linking channels Use `@username` to notify teammates. Use `#channel-name` to create clickable cross-references between conversations. diff --git a/docs/features/chat-conversations/web-search/providers/linkup.md b/docs/features/chat-conversations/web-search/providers/linkup.md new file mode 100644 index 000000000..c9418bd75 --- /dev/null +++ b/docs/features/chat-conversations/web-search/providers/linkup.md @@ -0,0 +1,93 @@ +--- +sidebar_position: 23 +title: "Linkup" +--- + +:::warning + +This tutorial is a community contribution and is not supported by the Open WebUI team. 
It serves only as a demonstration on how to customize Open WebUI for your specific use case. Want to contribute? Check out the [contributing tutorial](https://docs.openwebui.com/contributing). + +::: + +:::tip + +For a comprehensive list of all environment variables related to Web Search (including concurrency settings, result counts, and more), please refer to the [Environment Configuration documentation](/reference/env-configuration#web-search). + +::: + +:::tip Troubleshooting + +Having issues with web search? Check out the [Web Search Troubleshooting Guide](/troubleshooting/web-search) for solutions to common problems like proxy configuration, connection timeouts, and empty content. + +::: + +## Overview + +[Linkup](https://www.linkup.so/) is a search API built for AI applications. Integrating it with Open WebUI lets your language model perform real-time web searches and ground responses in current sources. This tutorial guides you through configuring Linkup as a web search provider. + +Linkup support was added in Open WebUI v0.9.6. + +## Prerequisites + +Ensure you have: + +- **Open WebUI Installed**: A running instance of Open WebUI (local or Docker). See the [Getting Started guide](https://docs.openwebui.com/getting-started). +- **Linkup Account**: An account with an API key from [Linkup](https://www.linkup.so/). +- **Admin Access**: Administrative access to your Open WebUI instance. +- **Internet Connection**: Required for Linkup API requests. + +## Step-by-Step Configuration + +### 1. Obtain a Linkup API Key + +1. Log in or sign up at [Linkup](https://www.linkup.so/). +2. Open the API keys section of your dashboard. +3. Copy or generate a new API key. Keep it secure. + +### 2. Configure Open WebUI + +1. Log in to Open WebUI with an admin account. +2. Open **Admin Panel → Settings → Web Search**. +3. Enable **Web Search** by toggling it **On**. +4. Select **linkup** from the **Web Search Engine** dropdown. +5. 
Paste your Linkup API key into the **Linkup API Key** field. +6. (Optional) Set the **Search Depth** and **Output Type** (see below). +7. Save your settings. + +### 3. Test the Integration + +1. Start a chat session in Open WebUI. +2. Click the **plus (+)** button in the prompt field to enable web search. +3. Enter a query (e.g., `+latest AI news`) and confirm Linkup returns real-time results. + +## Search Parameters + +Linkup requests are built from a small set of defaults that you can override. The query (`q`) and result count (`maxResults`) are injected automatically and cannot be overridden. + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `depth` | `standard` | `standard` is faster and cheaper; `deep` runs a more thorough multi-step search. | +| `outputType` | `sourcedAnswer` | `sourcedAnswer` returns an answer plus its source pages; `searchResults` returns raw result entries. | +| `url` | `https://api.linkup.so/v1/search` | Override only if you need to point at a different endpoint. | + +These map to the [`LINKUP_SEARCH_PARAMS`](/reference/env-configuration#linkup_search_params) environment variable, supplied as a JSON object. For example: + +```bash +-e LINKUP_API_KEY="your_linkup_api_key" +-e LINKUP_SEARCH_PARAMS='{"depth": "deep", "outputType": "searchResults"}' +``` + +The same fields are exposed in the Admin UI when the `linkup` engine is selected, so you do not need environment variables unless you prefer to manage configuration that way. See [Environment Variable Configuration](https://docs.openwebui.com/environment) for details and the [`ENABLE_PERSISTENT_CONFIG`](/reference/env-configuration#enable_persistent_config) behavior. + +## Troubleshooting + +- **Invalid API Key**: Ensure the key is copied correctly, without extra spaces. +- **No Results**: Confirm the web search toggle (`+`) is enabled and your internet is active. Try `depth: deep` for sparse topics. 
+- **Quota Exceeded**: Check your plan and usage on the Linkup dashboard. +- **Settings Not Saved**: Verify admin privileges and that `webui.db` is writable. + +## Additional Resources + +- [Linkup Documentation](https://docs.linkup.so/): API reference and advanced options. +- [Open WebUI Features](https://docs.openwebui.com/features): Details on RAG and web search. +- [Contributing to Open WebUI](https://docs.openwebui.com/contributing): Share improvements or report issues. diff --git a/docs/features/extensibility/mcp.mdx b/docs/features/extensibility/mcp.mdx index 3b68120dd..d4ad05b08 100644 --- a/docs/features/extensibility/mcp.mdx +++ b/docs/features/extensibility/mcp.mdx @@ -128,11 +128,17 @@ Both MCP and OpenAPI tool-server connections accept a free-form **Headers** fiel | :--- | :--- | | `{{USER_ID}}` | The calling user's ID. | | `{{USER_NAME}}` | The calling user's display name. | +| `{{USER_EMAIL}}` | The calling user's email address. | +| `{{USER_ROLE}}` | The calling user's role (e.g. `admin`, `user`). | | `{{CHAT_ID}}` | The current chat ID (empty in non-chat contexts like the **Verify Connection** button). | | `{{MESSAGE_ID}}` | The current message ID (empty in non-chat contexts). | Unknown tokens are passed through as literal text. Non-string header values are coerced to strings before substitution. The same tokens are honored on custom headers attached to OpenAI-compatible model connections in **Admin Settings → Connections → OpenAI**, so you can use the feature for tenant routing or audit-trail propagation across both surfaces. +:::note +`{{USER_EMAIL}}` and `{{USER_ROLE}}` were added in v0.9.6. The same release also fixed MCP server connections, where custom-header templates were previously stored but **not** interpolated at request time — they now expand the same way they always have for direct connections and OpenAPI tool servers. +::: + ### Function Name Filter List This field restricts which tools are exposed to the LLM. 
diff --git a/docs/features/extensibility/pipelines/pipes.md b/docs/features/extensibility/pipelines/pipes.md index a02365b67..ab0bdcb2c 100644 --- a/docs/features/extensibility/pipelines/pipes.md +++ b/docs/features/extensibility/pipelines/pipes.md @@ -46,7 +46,7 @@ yield {"choices": [{"delta": {}, "finish_reason": "stop"}]} This is the single biggest gotcha when building an agent pipeline (LangChain, LlamaIndex, a custom planner, anything that executes its own tools and streams the result back). -`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (≈30). +`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) (default 256; `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30, on versions before v0.9.6). If your pipeline already executed the tool internally, emitting `delta.tool_calls` makes Open WebUI try to execute it *again*. Because the pipeline keeps emitting the same call on every continuation, copies of the response stack on top of each other until the iteration cap trips (up to 256 copies by default, or 30 before v0.9.6). The same thing happens if you set `finish_reason: "tool_calls"` mid-stream. 
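As a rough illustration of the safe pattern, the sketch below streams chunks for an agent that has already run its tool internally. It reports the tool result as ordinary `delta.content`, never emits `delta.tool_calls`, and finishes with `finish_reason: "stop"`. The function names and the plain-text summary format are hypothetical and are not Open WebUI's own rendering.

```python
from typing import Iterator


def render_tool_summary(name: str, result: str) -> str:
    """Hypothetical helper: format an already-executed tool call as plain text."""
    return f"\n> Tool `{name}` returned: {result}\n"


def stream_agent_answer(tool_name: str, tool_result: str, answer: str) -> Iterator[dict]:
    """Yield OpenAI-style chunks for an agent that already ran its own tool."""
    # The tool ran inside the pipeline, so report it as content,
    # never as delta.tool_calls (which would ask the client to run it again).
    yield {"choices": [{"delta": {"content": render_tool_summary(tool_name, tool_result)}}]}

    # Stream the final answer as ordinary content deltas, one word per chunk.
    for word in answer.split(" "):
        yield {"choices": [{"delta": {"content": word + " "}}]}

    # Finish with "stop"; "tool_calls" here would re-trigger the executor loop.
    yield {"choices": [{"delta": {}, "finish_reason": "stop"}]}
```

Because every chunk carries only `content` (or an empty delta), the middleware has nothing to execute and the continuation loop never starts.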
diff --git a/docs/features/extensibility/plugin/functions/filter.mdx b/docs/features/extensibility/plugin/functions/filter.mdx index af374e6cf..e66efa1c0 100644 --- a/docs/features/extensibility/plugin/functions/filter.mdx +++ b/docs/features/extensibility/plugin/functions/filter.mdx @@ -3,7 +3,7 @@ sidebar_position: 3 title: "Filter Function" --- -# 🪄 Filter Function: Modify Inputs and Outputs +# Filter Function: Modify Inputs and Outputs :::danger ⚠️ Critical Security Warning **Filter Functions execute arbitrary Python code on your server.** Function creation is restricted to administrators only. Only install from trusted sources and review code before importing. A malicious Function could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/). @@ -15,7 +15,7 @@ This guide will break down **what Filters are**, how they work, their structure, --- -## 🌊 What Are Filters in Open WebUI? +## What Are Filters in Open WebUI? Imagine Open WebUI as a **stream of water** flowing through pipes: @@ -36,11 +36,11 @@ Filters are like **translators or editors** in the AI workflow: you can intercep --- -## 🗺️ Structure of a Filter Function: The Skeleton +## Structure of a Filter Function: The Skeleton Let's start with the simplest representation of a Filter Function. Don't worry if some parts feel technical at first—we’ll break it all down step by step! -### 🦴 Basic Skeleton of a Filter +### Basic Skeleton of a Filter ```python from pydantic import BaseModel @@ -73,7 +73,7 @@ class Filter: --- -### 🧲 Toggleable Filters: Making Filters User-Controllable (`self.toggle`) +### Toggleable Filters: Making Filters User-Controllable (`self.toggle`) By default a filter that's **active and in scope** (global, or attached to the model) runs on every request — the user has no say in it. That's often what you want (PII scrubbing, logging, mandatory guardrails). 
Sometimes you want the opposite: let the user decide whether the filter runs for a given conversation. @@ -144,9 +144,88 @@ The chip being present = the filter is enabled for the next request. The chip be --- -## ⚙️ Filter Administration & Configuration +### Owning Retrieval With file_handler -### 🌐 Global Filters vs. Model-Specific Filters +By default, when a user attaches a knowledge collection or uploads a file to a chat, Open WebUI runs the built-in RAG pipeline **after** every inlet filter has returned. The chat-completion handler queries the vector DB for chunks relevant to the user's last message, wraps them in `<source>` tags, appends them to the last user message (or to a system message, depending on `RAG_SYSTEM_CONTEXT`), and only then calls the LLM. + +This is important to understand for filter authors: at `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only the file/collection *references* (IDs, names, types). **The chunk text doesn't exist yet** — retrieval hasn't happened. So if you want to inspect or transform the chunks themselves (PII / PHI redaction, reranking, custom hybrid scoring, translation, chunk-level access control, anonymization), the standard inlet contract is not enough — the data you want isn't there yet. + +**`file_handler = True`** is the opt-in escape hatch for exactly this case. Declared as a **module-level attribute** at the top of your filter file, it tells Open WebUI "I am handling retrieval and chunk injection myself — skip the built-in RAG step." When set, the backend strips `body["metadata"]["files"]` and `body["files"]` after your `inlet()` returns, so the chat-completion handler finds no files to retrieve over and goes straight to the LLM with whatever you injected. + +```python +from pydantic import BaseModel +from typing import Optional + +# Module-level attribute — sits OUTSIDE the Filter class, alongside imports. 
+file_handler = True + +class Filter: + class Valves(BaseModel): + pass + + def __init__(self): + self.valves = self.Valves() + + async def inlet( + self, + body: dict, + __request__=None, + __user__: Optional[dict] = None, + __model__: Optional[dict] = None, + ) -> dict: + # body["metadata"]["files"] still contains the file/collection REFERENCES here. + # After this method returns, Open WebUI strips them and does NOT run its own RAG. + # Therefore: it is YOUR job to retrieve, transform, and inject chunks below. + return body +``` + +:::warning Module attribute, not `self.file_handler` +Open WebUI reads `file_handler` from the **module object** (the file your filter lives in), not from the `Filter` instance. Setting `self.file_handler = True` inside `__init__` is silently ignored. Put the assignment at the top of the file, alongside your imports — exactly as shown above. +::: + +#### When to use it + +- **Per-model redaction.** Apply PII / PHI scrubbing only when the request targets a remote model, while letting a self-hosted model see raw chunks. Branch on `__model__["owned_by"]` (or another signal) inside the inlet and transform chunks accordingly. +- **Custom retrieval logic.** Hybrid BM25 + dense scoring, query rewriting, multi-collection routing, reranking with a different model than the one Open WebUI uses, result caching keyed on the rewritten query. +- **Pre-injection transformation.** Translation, summarization, deduplication, or any transform that needs the *actual chunk text* rather than just the references. +- **Chunk-level access control.** Filter out chunks the current user shouldn't see based on metadata attached to the source documents. + +#### The recipe + +1. Set `file_handler = True` at the top of your filter module. +2. In `inlet()`, read the file references from `body["metadata"]["files"]` (and `body["files"]` for ad-hoc attachments). +3. Retrieve chunks yourself. 
Two options: + - **HTTP**: call `POST /api/v1/retrieval/query/doc` (single collection) or `POST /api/v1/retrieval/query/collection` (multiple), passing the user's last message as the query string and the inbound request's bearer token so permissions stay scoped to the user. + - **In-process**: `from open_webui.retrieval.utils import get_sources_from_items` and call it directly with the same arguments the core code uses. This avoids the network hop and returns a cleaner shape (list of dicts each containing a `document` array of chunks and a parallel `metadata` array). +4. Transform the chunks however you need. Branch on `__model__` / `__user__` if the transform is conditional (e.g. "redact only when the model is remote"). +5. Inject the transformed chunks back into `body["messages"]`. To preserve clickable citations in the UI, mirror the format Open WebUI uses internally: + + ```html + <source id="1"> + ...chunk text... + </source> + ``` + + Plain Markdown also works if you don't care about citations being clickable in the UI — only the structured `<source>` form wires up the citation popovers. +6. Return `body`. The built-in RAG step is skipped (because `file_handler` caused the file references to be stripped), and the LLM call goes out with your sanitized chunks already in the prompt. + +#### Caveat: it's static, all-or-nothing per filter + +`file_handler` is read **once per filter, at the module level**. It is not a per-request signal and cannot be flipped based on the model, user, or chat from inside `inlet()`. When set, the built-in RAG is **always** skipped for any request where this filter is invoked — regardless of whether your `inlet()` actually called any retrieval logic on that particular request. + +In practice this means: if you use `file_handler = True`, your filter must handle retrieval for **every** scenario where files would normally be retrieved by the built-in path, including the cases where you'd have been happy with the default behavior. 
The retrieval call itself is identical in both cases; only any conditional *transformation* (e.g. "only redact for remote models") branches on context. + +If you genuinely need per-request switching between built-in and custom retrieval (e.g. "use built-in RAG for some users, custom for others on the same model"), the cleanest approach is to gate the custom-RAG filter on `self.toggle = True` so it only runs when the user has it selected — when the filter isn't selected, it doesn't run, its `file_handler` doesn't apply, and the built-in RAG handles the request normally. Don't try to dynamically mutate `file_handler` from inside `inlet()`; the flag is read off the module object before your method is called. + +#### Why this matters compared to mutating `body["files"]` in inlet + +A naive alternative is to clear `body["metadata"]["files"] = []` and `body["files"] = []` inside `inlet()` to suppress the built-in RAG dynamically. This works in practice but is brittle: future Open WebUI versions can add new file/collection plumbing under additional keys, and the official "I'm handling this myself" contract is `file_handler`. Prefer the documented opt-in. + +--- + +## Filter Administration & Configuration + +### Global Filters vs. Model-Specific Filters Open WebUI provides a flexible multi-level filter system that allows you to control which filters are active, how they're enabled, and who can toggle them. Understanding this system is crucial for effective filter management. @@ -191,7 +270,7 @@ POST /functions/id/{filter_id}/toggle/global --- -### 🎛️ The Two-Tier Filter System +### The Two-Tier Filter System Open WebUI uses a sophisticated two-tier system for managing filters on a per-model basis. This can be confusing at first, but it's designed to support both **always-on filters** and **user-toggleable filters**. @@ -258,7 +337,7 @@ class Filter: --- -### 🔄 Toggleable Filters vs. Always-On Filters +### Toggleable Filters vs. 
Always-On Filters Understanding the difference between these two types is key to using the filter system effectively. @@ -348,7 +427,7 @@ class WebSearchFilter: --- -### 📊 Filter Execution Flow +### Filter Execution Flow Here's the complete flow from admin configuration to filter execution: @@ -386,7 +465,7 @@ Here's the complete flow from admin configuration to filter execution: --- -### 📡 Filter Behavior with API Requests +### Filter Behavior with API Requests When using Open WebUI's API endpoints directly (e.g., via `curl` or external applications), `inlet()` and `stream()` follow the same execution model as WebUI requests. `outlet()` is the one that behaves very differently for direct API callers and is covered in detail below. @@ -608,7 +687,7 @@ Filters are sorted in **ascending** order by priority. A filter with `priority=0 --- -### 🔗 Data Passing Between Filters +### Data Passing Between Filters When multiple filters are active, each filter in the chain receives the **modified data from the previous filter**. The returned value from one filter becomes the input to the next filter in the priority order. @@ -932,6 +1011,10 @@ In the world of Open WebUI, the `inlet` function does this important prep work o 🚀 **Your Task**: Modify and return the `body`. The modified version of the `body` is what the LLM works with, so this is your chance to bring clarity, structure, and context to the input. +:::info Want to transform RAG chunks? `inlet()` runs **before** retrieval +At `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only file/collection *references* — the actual chunk text is fetched and injected later, after every inlet filter has returned. If you need to inspect or transform the chunk text itself (PII redaction, reranking, translation, chunk-level ACLs), see [Owning Retrieval With `file_handler`](#file-handler-custom-rag) for the supported opt-in. +::: + ##### Why Would You Use the `inlet`? 1. 
**Adding Context**: Automatically append crucial information to the user’s input, especially if their text is vague or incomplete. For example, you might add "You are a friendly assistant" or "Help this user troubleshoot a software bug." @@ -1036,7 +1119,7 @@ async def stream(self, event: dict) -> dict: - Each line represents a **small fragment** of the model's streamed response. - The **`delta.content` field** contains the progressively generated text. -##### 🔄 Example: Filtering Out Emojis from Streamed Data +##### Example: Filtering Out Emojis from Streamed Data ```python async def stream(self, event: dict) -> dict: for choice in event.get("choices", []): @@ -1169,7 +1252,7 @@ Publishing a curated package on **[openwebui.com](https://openwebui.com/)** lets --- -## 🚧 Potential Confusion: Clear FAQ 🛑 +## Potential Confusion: Clear FAQ ### **Q: How Are Filters Different From Pipe Functions?** diff --git a/docs/features/extensibility/plugin/functions/pipe.mdx b/docs/features/extensibility/plugin/functions/pipe.mdx index 8eb46f923..09a8b1cab 100644 --- a/docs/features/extensibility/plugin/functions/pipe.mdx +++ b/docs/features/extensibility/plugin/functions/pipe.mdx @@ -279,7 +279,7 @@ If you must use a synchronous third-party library in an async handler, wrap the You can modify this proxy Pipe to support additional service providers like Anthropic, Perplexity, and more by adjusting the API endpoints, headers, and logic within the `pipes` and `pipe` functions. :::caution Building a self-contained agent? Don't emit `delta.tool_calls`. 
-If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop — the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (~30) times. +If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop — the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) (default 256; `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30, before v0.9.6) times. For self-contained agents, render tool executions as `<details type="tool_calls">
` content blocks instead — the same shape Open WebUI itself emits after internal tool execution. See the [Pipes → Self-contained agents and `delta.tool_calls`](/features/extensibility/pipelines/pipes#self-contained-agents-and-deltatool_calls) section for the full pattern, a LangChain example, and the rule of thumb for which path to take. ::: diff --git a/docs/features/extensibility/plugin/tools/development.mdx b/docs/features/extensibility/plugin/tools/development.mdx index 8c86128c0..642d4af94 100644 --- a/docs/features/extensibility/plugin/tools/development.mdx +++ b/docs/features/extensibility/plugin/tools/development.mdx @@ -33,6 +33,10 @@ licence: MIT """ ``` +:::tip Metadata auto-fill (v0.9.6+) +When you create a **new** tool (also applies to functions and skills), the editor reads the frontmatter as you paste or type code and auto-fills the **Name**, **ID**, and **Description** fields from `title` and `description` if you haven't already filled them in. It never overwrites a value you've entered, and it does not re-derive fields when editing an existing item — so you no longer need to retype metadata that's already declared in the source. +::: + ### Tools Class Tools have to be defined as methods within a class called `Tools`, with optional subclasses called `Valves` and `UserValves`, for example: diff --git a/docs/features/extensibility/plugin/tools/index.mdx b/docs/features/extensibility/plugin/tools/index.mdx index 1b4b354cf..05d5426ca 100644 --- a/docs/features/extensibility/plugin/tools/index.mdx +++ b/docs/features/extensibility/plugin/tools/index.mdx @@ -229,8 +229,10 @@ Default Mode is **not** a supported workaround even for DeepSeek — it is legac | `search_knowledge_bases` | Text search over KB names/descriptions. | | `query_knowledge_files` | Search file contents via the RAG retrieval pipeline (hybrid + rerank when enabled). Main tool for finding answers in docs. | | `search_knowledge_files` | Search files by filename. 
| -| `view_file` | Read a user-accessible file by ID with pagination (`offset`, `max_chars`). | +| `grep_knowledge_files` | Exact text / regex search across knowledge file content. Returns matching lines with line numbers. Complements `query_knowledge_files` (semantic) when you need literal matches. | +| `view_file` | Read a user-accessible file by ID with character pagination (`offset`, `max_chars`) or line range (`start_line`, `end_line`, optional `line_numbers`). | | `view_knowledge_file` | Read a knowledge-base file by ID with pagination (`offset`, `max_chars`). | +| `kb_exec` *(opt-in)* | Filesystem-style command interface for knowledge bases (`ls`, `tree`, `cat`, `head`, `tail`, `sed`, `grep`, `find`, `wc`, `stat`, with pipe support). Directory-aware: `ls docs/`, `tree`, `grep "x" docs/`, and path-based file refs (`docs/api/auth.md`). Replaces the discovery/read tools above when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set. | | **Image Gen** | *Requires image generation enabled (per-tool) AND per-chat "Image Generation" toggle enabled.* | | `generate_image` | Generates a new image based on a prompt. Requires `ENABLE_IMAGE_GENERATION`. | | `edit_image` | Edits existing images based on a prompt and image URLs. Requires `ENABLE_IMAGE_EDIT`. | @@ -287,12 +289,17 @@ Use this quick matrix instead of memorizing per-row caveats. | `query_knowledge_bases` | ❌ | ✅ | | `search_knowledge_files` | ✅ (auto-scoped) | ✅ (all accessible KBs) | | `query_knowledge_files` | ✅ (auto-scoped) | ✅ | +| `grep_knowledge_files` | ✅ (auto-scoped) | ✅ | | `view_file` | ✅ (when attached items include files/collections) | ❌ | | `view_knowledge_file` | ✅ (when attached items include files/collections) | ✅ | | `view_note` | ✅ (when attached items include notes) | ❌ | Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. 
+:::info `kb_exec` replaces the matrix when enabled +When [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI injects `kb_exec` instead of the file-oriented tools listed above. Still injected alongside it: `query_knowledge_files` (always), `view_note` (when notes are attached), and `query_knowledge_bases` + `search_knowledge_bases` (when no KB is attached). The model interacts with files through familiar shell commands. See the [Knowledge feature page](/features/workspace/knowledge#filesystem-style-access-kb_exec) for details. +::: + #### Tool Reference | Tool | Parameters | Output | @@ -307,8 +314,10 @@ Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. | `search_knowledge_bases` | `query` (required), `count` (default: 5), `skip` (default: 0) | Array of `{id, name, description, file_count}` | | `query_knowledge_files` | `query` (required), `knowledge_ids` (optional), `count` (default: 5) | Array of chunks like `{content, source, file_id, distance?}`; note hits include `{note_id, type: "note"}` | | `search_knowledge_files` | `query` (required), `knowledge_id` (optional), `count` (default: 5), `skip` (default: 0) | Array of `{id, filename, knowledge_id, knowledge_name}` | -| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated | +| `grep_knowledge_files` | `pattern` (required; regex auto-detected), `file_id` (optional — single-file mode), `case_insensitive` (default: false), `count_only` (default: false) | Matching lines with file IDs, filenames, and 1-indexed line numbers (capped at 50 matches) | +| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000), `line_numbers` (default: false), `start_line` / `end_line` (optional — line-based addressing overrides `offset`/`max_chars`) | `{id, filename, 
content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated, or `total_lines`, `showing_lines`, `next_start_line` in line mode | | `view_knowledge_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, knowledge_id, knowledge_name}` — includes pagination metadata when truncated | +| `kb_exec` | `command` (required) — filesystem-style command: `ls` (root) / `ls /` / `ls -a` (flat with paths), `tree` / `tree /`, `cat -n `, `head -N `, `tail -N `, `sed -n ',p' `, `grep [-i\|-l\|-c] "" [/\|\|*.ext]`, `find [/] ""`, `wc `, `stat `; supports pipes (`grep "auth" \| head -5`); files referenced by path (`docs/api/auth.md`), filename, or file ID | Plain text command output (matches/listing/tree/file content as appropriate) | | **Image Gen** | | | | `generate_image` | `prompt` (required) | `{status, message, images}` — auto-displayed | | `edit_image` | `prompt` (required), `image_urls` (required) | `{status, message, images}` — auto-displayed | @@ -443,7 +452,7 @@ When the **Builtin Tools** capability is enabled, you can further control which | **Memory** | `search_memories`, `add_memory`, `replace_memory_content`, `delete_memory`, `list_memories` | Search and manage user memories | | **Chat History** | `search_chats`, `view_chat` | Search and view user chat history | | **Notes** | `search_notes`, `view_note`, `write_note`, `replace_note_content` | Search, view, and manage user notes | -| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `view_file`, `view_knowledge_file` | Browse and query knowledge bases | +| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file` (or `kb_exec` + 
`query_knowledge_files` + `view_note`/`query_knowledge_bases`/`search_knowledge_bases` as applicable when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set) | Browse and query knowledge bases | | **Web Search** | `search_web`, `fetch_url` | Search the web and fetch URL content | | **Image Generation** | `generate_image`, `edit_image` | Generate and edit images | | **Code Interpreter** | `execute_code` | Execute code in a sandboxed environment | diff --git a/docs/features/workspace/knowledge.md b/docs/features/workspace/knowledge.md index 1ba316d90..a2365cf68 100644 --- a/docs/features/workspace/knowledge.md +++ b/docs/features/workspace/knowledge.md @@ -42,6 +42,8 @@ Attach specific knowledge bases to a model so it only searches what's relevant. | 📑 **5 extraction engines** | Tika, Docling, Azure, Mistral OCR, custom loaders | | 🤖 **Agentic retrieval** | Models browse, search, and read your documents autonomously | | 📄 **Full context mode** | Inject entire documents with no chunking | +| 🗂️ **Nested directories** | Organize files into subdirectories with drag-and-drop reordering | +| 🔄 **Incremental directory sync** | Mirror a local folder into the KB — only new and modified files upload, deletions are removed, mirroring folder structure | | 📦 **Export and API** | Back up knowledge bases as zip files, manage via REST API | --- @@ -76,12 +78,93 @@ With [native function calling](/features/extensibility/plugin/tools#tool-calling | `query_knowledge_bases` | ❌ | ✅ | Search KB names/descriptions by semantic similarity | | `search_knowledge_files` | ✅ (scoped) | ✅ (all) | Search files by filename | | `query_knowledge_files` | ✅ (scoped) | ✅ | Search file contents using the RAG pipeline | -| `view_file` | ✅ | ❌ | Read file content with pagination (default 10K chars, cap 100K) | +| `grep_knowledge_files` | ✅ (scoped) | ✅ | Exact text / regex search across knowledge files (returns matching lines with line numbers; auto-detects regex like `error|warn`) | 
+| `view_file` | ✅ | ❌ | Read file content with pagination (`offset`/`max_chars`) or by line range (`start_line`/`end_line`, optional `line_numbers`) | | `view_knowledge_file` | ✅ | ✅ | Read file content from any accessible KB | | `view_note` | ✅ | ❌ | Read attached notes | The key split: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. Attaching a KB scopes the model to only those documents. Leaving it unscoped lets the model discover everything the user has access to. +#### When to prefer `grep_knowledge_files` over `query_knowledge_files` + +The two search tools complement each other: + +| | `query_knowledge_files` | `grep_knowledge_files` | +|---|---|---| +| **How it matches** | Semantic / vector retrieval (with optional BM25 + rerank when [`ENABLE_RAG_HYBRID_SEARCH`](/reference/env-configuration#enable_rag_hybrid_search) is on) | Exact string match — regex auto-detected (e.g. `error\|warn`, `version \d+`) | +| **Returns** | Relevant chunks of content | Matching lines with file ID, filename, and 1-indexed line number | +| **Use when** | "What does the documentation say about X?" — paraphrased questions, conceptual lookups | "Find every place we mention `OPENAI_API_KEY`" — literal identifiers, error strings, version numbers | +| **Result cap** | Top K (default 5) | 50 matches | +| **Flags** | — | `case_insensitive`, `count_only`, `file_id` (single-file mode) | + +In agentic flows, a typical pattern is: `query_knowledge_files` to locate the relevant document, then `grep_knowledge_files` to pinpoint exact lines, then `view_file` (line-range mode below) to read the surrounding context. + +#### Reading with `view_file` + +`view_file` supports two addressing modes: + +- **Character pagination** — `offset` + `max_chars` (default `10000`, hard cap `100000`). Best for streaming through a long document; the response includes `next_offset` when the file is truncated. +- **Line range** — `start_line` + optional `end_line` (1-indexed, inclusive). 
Overrides `offset`/`max_chars` when set; pairs naturally with `grep_knowledge_files`' line numbers. Pass `line_numbers: true` to also get a `: ` prefix on each returned line. + +The line-range response includes `total_lines`, `showing_lines`, and `next_start_line` for follow-up reads. + +### Filesystem-style access (`kb_exec`) + +When [`ENABLE_KB_EXEC=True`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI exposes a `kb_exec` tool that gives the model a filesystem-style interface over knowledge bases. + +**Tools that go away**, because their function is now covered by `kb_exec` commands: + +- `list_knowledge` — replaced by `ls` +- `search_knowledge_files` — replaced by `find ""` +- `grep_knowledge_files` — replaced by `grep ""` +- `view_file` and `view_knowledge_file` — replaced by `cat`, `head`, `tail`, `sed -n ',p'` + +**Tools that stay injected alongside `kb_exec`**, because they do something `kb_exec` can't: + +- **`query_knowledge_files`** — semantic / RAG search (always) +- **`view_note`** — when notes are attached to the model (`kb_exec` is file-only, so notes need a dedicated reader) +- **`query_knowledge_bases`** and **`search_knowledge_bases`** — when no KB is attached to the model, so the model can still discover and search across knowledge bases by name/description + +This is experimental and **off by default**. It targets frontier models that already "think in shell" — they tend to chain `ls`, `grep`, and `cat` more reliably than they orchestrate a fan-out of specialized tools. 
+ +**Supported commands** + +| Command | Purpose | +|---------|---------| +| `ls`, `ls /`, `ls -a` | List the current level / a subdirectory / a flat view of every file with full paths | +| `tree`, `tree /` | Recursive directory tree | +| `cat -n ` | Read a file (optionally with line numbers) | +| `head -N ` / `tail -N ` | First or last N lines | +| `sed -n ',p' ` | Print lines `` through `` | +| `grep "" [/\|\|*.ext]` | Exact / regex search; flags `-i` (case-insensitive), `-l` (filenames only), `-c` (counts) | +| `find [/] ""` | Find files by glob | +| `wc ` | Line / word / char counts | +| `stat ` | File metadata | + +**Pipes** + +`kb_exec` parses a single pipeline, so commands compose: + +```text +grep "auth" | head -5 +grep -l "TODO" docs/ +find docs/ "*.md" | head -10 +``` + +**File references** + +Files can be addressed three ways — pick whichever is unambiguous: + +- **Path** — `docs/api/auth.md` (relative to the knowledge base root; resolves through the directory tree) +- **Filename** — `auth.md` (errors with an "ambiguous filename" hint when the same name exists in multiple directories or KBs) +- **File ID** — the UUID returned by `ls`, `find`, or `grep` + +**Behavior notes** + +- `kb_exec` respects the same access control as the other knowledge tools — files the user can't read are silently excluded from results. +- The model still has `query_knowledge_files` for semantic search; reach for it when literal commands won't find a paraphrased concept. +- Built on top of the directory model — `kb_exec` is the only tool that fully reflects the directory structure created in the UI. + Autonomous exploration works best with frontier models that can intelligently chain search, browse, and synthesize. Smaller models may struggle with multi-step retrieval. Administrators can disable the **Knowledge Base** tool category per-model in **Workspace > Models > Edit > Builtin Tools**. 
For the full list of built-in agentic tools, see the [Native/Agentic Mode Tools Guide](/features/extensibility/plugin/tools#built-in-system-tools-nativeagentic-mode). @@ -104,6 +187,54 @@ When native function calling is enabled, attached knowledge is **not automatical 3. Upload files or add existing documents. 4. Attach the knowledge base to a model in **Workspace > Models > Edit**, or reference it in chat with `#`. +### Organizing into directories + +Knowledge bases support nested **directories** so larger document sets stay navigable. Create them from the **Add Content** menu (**+ New Directory**), then reorganize freely. + +**Creating and navigating** + +- **+ New Directory** lives next to file upload in the **Add Content** menu. Name uniqueness is enforced per parent — two siblings can't share a name, but you can reuse names in different parents. +- Click a directory to descend into it; the **breadcrumb trail** at the top of the view always reflects the current path and lets you jump back to any ancestor in one click. +- Directories can be **renamed** or **moved to a different parent** without affecting the files inside them. + +**Drag-and-drop** + +You can move items by dragging: + +- **Files** onto a directory row, into the empty area of an open directory, or onto any breadcrumb crumb (including the root crumb to send a file back to the top level). +- **Directories** onto another directory to nest them, or onto a breadcrumb crumb to move them up the tree. Moving a directory into itself or one of its descendants is blocked server-side. + +**Deletion semantics** + +Deleting a non-empty directory prompts for the action to take with its contents: + +- **Move files to parent** (default) — the directory is removed but its files and subdirectories are re-parented one level up. +- **Delete everything** — the directory and all files/subdirectories underneath it are permanently removed. 
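
The drag-and-drop rule above (a directory cannot be moved into itself or into one of its descendants) amounts to a cycle check on the parent chain. The sketch below is illustrative only and is not Open WebUI's server code: the `parents` mapping and the `is_move_allowed` name are hypothetical, and it assumes each directory records the ID of its parent, with `None` meaning the knowledge base root.

```python
from typing import Dict, Optional, Set


def is_move_allowed(
    parents: Dict[str, Optional[str]],
    dir_id: str,
    new_parent_id: Optional[str],
) -> bool:
    """Return False when re-parenting dir_id under new_parent_id would create a cycle.

    `parents` is a hypothetical {directory_id: parent_id} mapping; None is the root.
    """
    seen: Set[str] = set()
    current = new_parent_id
    # Walk upward from the proposed parent. Meeting dir_id on the way means the
    # target is dir_id itself or one of its descendants.
    while current is not None:
        if current == dir_id:
            return False
        if current in seen:  # defensive: the stored tree is already cyclic
            return False
        seen.add(current)
        current = parents.get(current)
    return True


if __name__ == "__main__":
    tree = {"docs": None, "api": "docs", "v1": "api", "guides": None}
    print(is_move_allowed(tree, "docs", "v1"))    # False: v1 is a descendant of docs
    print(is_move_allowed(tree, "v1", "guides"))  # True
```

Walking up from the target rather than down from the moved directory keeps the check proportional to tree depth, not to the size of the subtree.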
+ +**Effect on retrieval and tools** + +- **Retrieval and standard RAG** still span the entire knowledge base. Directories don't shard the vector index; chunks from any subdirectory remain reachable in a single search. +- **Agentic tools** are directory-aware: + - `kb_exec` (when enabled) treats subdirectories like a filesystem: `ls docs/`, `tree`, `grep "x" docs/`, and path-style refs (`docs/api/auth.md`) all work. See [Filesystem-style access (`kb_exec`)](#filesystem-style-access-kb_exec) above. + - The other knowledge tools (`query_knowledge_files`, `grep_knowledge_files`, `search_knowledge_files`) ignore directory boundaries and return matches from the whole KB. + +### Renaming files + +Individual files can be renamed in place from the workspace via the file's item menu, with no need to re-upload. The new name is reflected everywhere the file is referenced (knowledge listings, agentic tool output, citations). + +### Syncing a local directory + +The **Add Content → Sync Directory** action mirrors a local folder into the knowledge base **incrementally**: the client hashes each local file (SHA-256), the server compares hashes and paths against what is already stored, and only **new**, **modified**, and **deleted** files are touched. Unmodified files (the typical majority) are left alone: no re-upload, no re-embedding. The local folder's subdirectory structure is mirrored in the KB; missing subdirectories are created, and subdirectories that no longer exist locally are removed. + +Behavior to be aware of: + +- Hidden files and folders (anything beginning with `.`) are skipped. +- Files modified locally upload with a new content hash; the old file entry is removed from the KB and replaced. +- Files removed locally are deleted from the KB during the cleanup step. +- The action is **non-destructive** for unchanged files. Earlier versions of the same menu action wiped and re-uploaded everything; that is no longer the case as of v0.9.6. 
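
The client-side half of the sync described above (hash every local file with SHA-256, skip anything hidden) could look roughly like the following. This is a minimal sketch, not the shipped client: the function names are hypothetical, and the entry shape (`path`, `filename`, `checksum`) follows the manifest fields listed for the sync endpoints, with the assumption that `path` is the POSIX-style path relative to the synced folder.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> List[Dict[str, str]]:
    """Collect one entry per visible file under root.

    Assumption: "path" is relative to root; the real client may encode it differently.
    """
    entries: List[Dict[str, str]] = []
    for file_path in sorted(root.rglob("*")):
        if not file_path.is_file():
            continue
        relative = file_path.relative_to(root)
        # Skip hidden files and anything inside a hidden folder.
        if any(part.startswith(".") for part in relative.parts):
            continue
        entries.append(
            {
                "path": relative.as_posix(),
                "filename": file_path.name,
                "checksum": sha256_of_file(file_path),
            }
        )
    return entries
```

Calling `build_manifest(Path("./my-docs"))` yields a list that can be compared against what the server already stores. Sorting the walk keeps the output stable between runs, which makes diffs of the manifest itself easier to read.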
+ +For programmatic use, the same workflow is exposed as two endpoints under [API access](#api-access) below. + ### Exporting Admins can export an entire knowledge base as a zip file via the item menu (three dots) > **Export**. Files are converted to `.txt` for universal compatibility. Regular users will not see the Export option. @@ -112,9 +243,25 @@ Admins can export an entire knowledge base as a zip file via the item menu (thre Knowledge bases can be managed programmatically: -- `POST /api/v1/files/` - Upload files -- `GET /api/v1/files/{id}/process/status` - Check processing status -- `POST /api/v1/knowledge/{id}/file/add` - Add files to a knowledge base +**Files** + +- `POST /api/v1/files/` — Upload files. Pass `knowledge_id` (and optionally `directory_id`) in the upload metadata to have the backend **auto-link and process the file into that knowledge base server-side** — equivalent to a follow-up `POST /api/v1/knowledge/{id}/file/add`, but it does not depend on the client staying connected after upload. This is the recommended single-call path (added in v0.9.6, fixing files left unlinked when the uploader disconnected mid-processing). The server SHA-256-hashes the uploaded bytes into `file.meta.file_hash`; clients can pre-compute and send `file_hash` in metadata to skip server-side hashing (used by the incremental sync flow below). 
+- `GET /api/v1/files/{id}/process/status` — Check processing status +- `POST /api/v1/files/{id}/rename` — Rename a file +- `POST /api/v1/knowledge/{id}/file/add` — Add files to a knowledge base +- `POST /api/v1/knowledge/{id}/file/move` — Move a file between directories within the same KB (body: `file_id`, `directory_id` — `null` moves to the KB root) + +**Directories** + +- `POST /api/v1/knowledge/{id}/dirs/create` — Create a directory (body: `name`, optional `parent_id`) +- `POST /api/v1/knowledge/{id}/dirs/{dir_id}/update` — Rename or re-parent a directory (body: `name` and/or `parent_id`) +- `DELETE /api/v1/knowledge/{id}/dirs/{dir_id}/delete?move_files=true` — Delete a directory. With `move_files=true` (default), contained files are re-parented; with `move_files=false`, they're deleted along with the directory. + +**Incremental directory sync** (added in v0.9.6) + +- `POST /api/v1/knowledge/{id}/sync/diff` — Submit a local manifest (`manifest: [{path, filename, checksum}]` where `checksum` is the SHA-256 of the file bytes) and receive `{added, modified, deleted, mkdir, rmdir, unmodified_count}` describing exactly what to upload, replace, delete, and which directories to create/remove. Read-only — does not mutate the KB. +- After acting on the diff (create `mkdir` paths, upload `added` + `modified` files with their hashes via `POST /api/v1/files/`), call: +- `POST /api/v1/knowledge/{id}/sync/cleanup` — Body: `{file_ids: [...], dir_ids: [...]}`. Removes the stale files (from the KB, vector store, and per-file collections) and the now-empty directories returned by `sync/diff`. Run this last so deletions don't outrun uploads. File processing happens asynchronously. You must poll the status endpoint until processing completes before adding files to a knowledge base, or you'll get an "empty content" error. See [API Endpoints](/reference/api-endpoints#-retrieval-augmented-generation-rag) for workflow examples. 
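
The polling requirement above can be wrapped in a small helper. The sketch below makes several assumptions: the status payload is taken to be a JSON object with a `status` field whose terminal values are `"completed"` and `"failed"` (check the actual response of `GET /api/v1/files/{id}/process/status` before relying on this), and `fetch_status` is a hypothetical callable you supply that performs the authenticated HTTP request and returns the decoded body.

```python
import time
from typing import Callable, Dict


def wait_until_processed(
    fetch_status: Callable[[], Dict[str, str]],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until fetch_status() reports completion.

    Assumed payload shape: {"status": "pending" | "completed" | "failed"}.
    Raises RuntimeError on failure and TimeoutError when the deadline passes.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status().get("status")
        if status == "completed":
            return
        if status == "failed":
            raise RuntimeError("file processing failed")
        if clock() >= deadline:
            raise TimeoutError("file processing did not finish in time")
        sleep(interval_s)
```

Injecting `fetch_status`, `sleep`, and `clock` keeps the helper independent of any particular HTTP client and lets it be exercised without a running server. Only after it returns should the file be added to a knowledge base or queried.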
@@ -144,7 +291,7 @@ Add dozens of papers to a knowledge base. The AI searches across all of them to ### Processing delay for API uploads -Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results. +Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results. Note that uploading with a `knowledge_id` (above) makes linking server-side and robust to client disconnects, but it does **not** make the content instantly queryable — extraction/embedding still runs in the background, so poll `GET /api/v1/files/{id}/process/status` before relying on retrieval. ### Native function calling changes behavior diff --git a/docs/getting-started/advanced-topics/hardening.md b/docs/getting-started/advanced-topics/hardening.md index 9dc6e3751..7b41f30f9 100644 --- a/docs/getting-started/advanced-topics/hardening.md +++ b/docs/getting-started/advanced-topics/hardening.md @@ -551,6 +551,10 @@ Outbound HTTP requests also do not follow `3xx` redirects by default. Without th AIOHTTP_CLIENT_ALLOW_REDIRECTS=false ``` +:::note Playwright loader (v0.9.6+) +Earlier versions applied URL validation and the redirect gate only to the default web loader; the Playwright-based loader (`WEB_LOADER_ENGINE=playwright` / the `playwright` Docker variant) could navigate and follow redirects to internal or blocklisted URLs unchecked. As of v0.9.6 the Playwright path enforces the same `validate_url()` and redirect rules as the default loader, so the SSRF controls above apply regardless of which web loader engine you run. If you use Playwright, ensure you are on v0.9.6 or later. +::: + ### Profile image URL forwarding The user and model profile-image endpoints can issue a `302 Found` redirect to whatever origin is stored in `profile_image_url` so that externally-hosted avatars (e.g. 
Gravatar via an upstream identity provider) display in the UI. That redirect causes the user's browser to make a request directly to the external origin, leaking client IP, User-Agent, and Referer headers — and an account whose `profile_image_url` was set to an attacker-controlled host can use that to deanonymize anyone who renders their avatar. diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md index 829918525..9fa4bf896 100644 --- a/docs/getting-started/advanced-topics/scaling.md +++ b/docs/getting-started/advanced-topics/scaling.md @@ -385,8 +385,19 @@ UVICORN_WORKERS=1 # Migrations (set to false on all but one instance) ENABLE_DB_MIGRATIONS=false + +# Concurrency & DB write throttling (REQUIRED at scale — see note below) +THREAD_POOL_SIZE=2000 +DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300 ``` +:::warning Two settings people forget — and then their scaled deployment stalls +- **`THREAD_POOL_SIZE=2000`** — Open WebUI offloads blocking work (DB calls, file I/O, sync handlers) to a thread pool whose default concurrency ceiling is only **40**. At scale, once 40 blocking operations are in flight every further request **queues**, and the whole app appears to freeze even though CPU/RAM look fine. `2000` is a *lower* bound for large instances; it is a concurrency ceiling, **not** a CPU/thread count, so a high value is not a contention risk. Never lower it. (The only exception is genuinely tiny hardware, which is not a "scaled deployment".) +- **`DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300`** — presence tracking writes each user's `last_active_at` to the database. **Unset (the default) means this write is unthrottled — roughly one `UPDATE` + `COMMIT` per authenticated request.** At scale that is a continuous flood of tiny write transactions that saturates the connection pool for no functional gain. 
Set it to `300`–`500` seconds; it is mandatory for large/production deployments and free performance everywhere else. + +Both are read once at startup and are not configurable from the Admin UI. See [Performance → Database Optimization](/troubleshooting/performance#-database-optimization) and [Performance → High-Concurrency](/troubleshooting/performance#-high-concurrency--network-optimization). +::: + ### Security defaults to revisit at scale A few defaults that are reasonable for single-user evaluation become less so once you put the deployment behind SSO and serve real users. The full discussion lives in the [Hardening guide](/getting-started/advanced-topics/hardening); the items most often missed in enterprise rollouts: diff --git a/docs/getting-started/quick-start/tab-docker/DockerCompose.md b/docs/getting-started/quick-start/tab-docker/DockerCompose.md index 8b88d3ac4..b7bd492f3 100644 --- a/docs/getting-started/quick-start/tab-docker/DockerCompose.md +++ b/docs/getting-started/quick-start/tab-docker/DockerCompose.md @@ -56,9 +56,15 @@ To start your services, run the following command: docker compose up -d ``` -## Helper Script +## Helper Scripts -A useful helper script called `run-compose.sh` is included with the codebase. This script assists in choosing which Docker Compose files to include in your deployment, streamlining the setup process. +A set of helper scripts is included with the codebase to streamline common Docker workflows: + +- `docker-compose-launcher.sh` — Interactive Compose launcher with GPU auto-detection, configurable WebUI/API ports, host data mounts, and optional Playwright support. Run `./docker-compose-launcher.sh --help` for the full list of flags. Use `--drop` to tear down the project. +- `docker-cleanup.sh` — Stops the Compose project and **deletes all volumes**, including persistent data. Prompts for confirmation before destroying data. 
+- `docker-run.sh` — Builds the Open WebUI image and runs a single container, exposing it on `OPEN_WEBUI_PORT` (default `3000`). +- `docker-ollama.sh` — Pulls and runs the official Ollama container with optional GPU passthrough, exposing it on `OLLAMA_PORT` (default `11434`). +- `docker-update-models.sh` — Iterates through every model installed in the Ollama container and pulls the latest version. --- diff --git a/docs/getting-started/quick-start/tab-docker/ManualDocker.md b/docs/getting-started/quick-start/tab-docker/ManualDocker.md index b944625d4..8825dedf2 100644 --- a/docs/getting-started/quick-start/tab-docker/ManualDocker.md +++ b/docs/getting-started/quick-start/tab-docker/ManualDocker.md @@ -49,9 +49,9 @@ Visit [http://localhost:3000](http://localhost:3000). For production environments, pin a specific version instead of using floating tags: ```bash -docker pull ghcr.io/open-webui/open-webui:v0.9.5 -docker pull ghcr.io/open-webui/open-webui:v0.9.5-cuda -docker pull ghcr.io/open-webui/open-webui:v0.9.5-ollama +docker pull ghcr.io/open-webui/open-webui:v0.9.6 +docker pull ghcr.io/open-webui/open-webui:v0.9.6-cuda +docker pull ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` --- diff --git a/docs/getting-started/updating.mdx b/docs/getting-started/updating.mdx index 68a118ccd..7b9000e04 100644 --- a/docs/getting-started/updating.mdx +++ b/docs/getting-started/updating.mdx @@ -31,9 +31,9 @@ The `:main` tag always points to the **latest build**. It's convenient but can i For stability, pin a specific release tag: ``` -ghcr.io/open-webui/open-webui:v0.9.5 -ghcr.io/open-webui/open-webui:v0.9.5-cuda -ghcr.io/open-webui/open-webui:v0.9.5-ollama +ghcr.io/open-webui/open-webui:v0.9.6 +ghcr.io/open-webui/open-webui:v0.9.6-cuda +ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` Browse all available tags on the [GitHub releases page](https://github.com/open-webui/open-webui/releases). 
diff --git a/docs/reference/database-schema.md b/docs/reference/database-schema.md index 8b5ab256e..464ba831a 100644 --- a/docs/reference/database-schema.md +++ b/docs/reference/database-schema.md @@ -10,7 +10,7 @@ This tutorial is a community contribution and is not supported by the Open WebUI ::: > [!WARNING] -> This documentation reflects schema changes up to Open WebUI v0.9.5. +> This documentation reflects schema changes up to Open WebUI v0.9.6. ## Open-WebUI Internal SQLite Database diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx index ed6ad2fc9..43c924440 100644 --- a/docs/reference/env-configuration.mdx +++ b/docs/reference/env-configuration.mdx @@ -12,23 +12,23 @@ As new variables are introduced, this page will be updated to reflect the growin :::info -This page is up-to-date with Open WebUI release version [v0.9.5](https://github.com/open-webui/open-webui/releases/tag/v0.9.5), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. +This page is up-to-date with Open WebUI release version [v0.9.6](https://github.com/open-webui/open-webui/releases/tag/v0.9.6), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. ::: -### Important Note on `PersistentConfig` Environment Variables +### Important Note on `ConfigVar` Environment Variables :::note -When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `PersistentConfig`, their values are persisted and stored internally. +When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. 
However, for environment variables marked as `ConfigVar`, their values are persisted and stored internally. -After the initial launch, if you restart the container, `PersistentConfig` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. +After the initial launch, if you restart the container, `ConfigVar` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. In contrast, regular environment variables will continue to be updated and applied on each subsequent restart. -You can update the values of `PersistentConfig` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. +You can update the values of `ConfigVar` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. -Please note that `PersistentConfig` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. +Please note that `ConfigVar` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. To disable this behavior and force Open WebUI to always use your environment variables (ignoring the database), set `ENABLE_PERSISTENT_CONFIG` to `False`. @@ -44,7 +44,7 @@ If you change an environment variable (like `ENABLE_SIGNUP=True`) but don't see Set `ENABLE_PERSISTENT_CONFIG=False` in your environment. This forces Open WebUI to read your variables directly. Note that UI-based settings changes will not persist across restarts in this mode. 
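
The precedence rules described in this section can be summarized with a toy model. It is not Open WebUI's implementation: `effective_value` and the `stored` dictionary (standing in for the internal database) are hypothetical names, and the model covers only persisted variables.

```python
from typing import Dict, Optional


def effective_value(
    name: str,
    env: Dict[str, str],
    stored: Dict[str, str],
    persistent_enabled: bool = True,
    default: Optional[str] = None,
) -> Optional[str]:
    """Toy model of how a persisted setting resolves at startup.

    `stored` stands in for the internal database; all names here are illustrative.
    """
    if not persistent_enabled:
        # ENABLE_PERSISTENT_CONFIG=False: the environment always wins.
        return env.get(name, default)
    if name in stored:
        # After the first launch the stored value shadows the environment.
        return stored[name]
    value = env.get(name, default)
    if value is not None:
        stored[name] = value  # first launch: persist what the environment supplied
    return value


if __name__ == "__main__":
    database: Dict[str, str] = {}
    print(effective_value("ENABLE_SIGNUP", {"ENABLE_SIGNUP": "True"}, database))   # True
    # The environment changes later, but the stored value is still used.
    print(effective_value("ENABLE_SIGNUP", {"ENABLE_SIGNUP": "False"}, database))  # True
    print(effective_value("ENABLE_SIGNUP", {"ENABLE_SIGNUP": "False"}, database, persistent_enabled=False))  # False
```

The second call shows the situation that usually surprises people: the environment variable was edited, yet the previously stored value still applies until it is changed in the UI or persistence is turned off.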
#### Option 2: Update via Admin UI (Recommended) -The simplest and safest way to change `PersistentConfig` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. +The simplest and safest way to change `ConfigVar` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. #### Option 3: Manual Database Update (Last Resort / Lock-out Recovery) If you are locked out or cannot access the UI, you can manually update the SQLite database via Docker: @@ -78,7 +78,7 @@ environment variables, see our [logging documentation](https://docs.openwebui.co - Type: `str` - Default: `http://localhost:3000` - Description: Specifies the URL where your Open WebUI installation is reachable. Needed for search engine support and OAuth/SSO. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::warning @@ -97,7 +97,7 @@ Failure to set WEBUI_URL before using OAuth/SSO will result in failure to log in - Type: `bool` - Default: `True` - Description: Toggles user account creation. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_SIGNUP_PASSWORD_CONFIRMATION` @@ -148,14 +148,14 @@ After the admin account is created, sign-up is automatically disabled for securi - Type: `bool` - Default: `True` - Description: Toggles email, password, sign-in and "or" (only when `ENABLE_OAUTH_SIGNUP` is set to True) elements. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. 
#### `ENABLE_PASSWORD_CHANGE_FORM` - Type: `bool` - Default: `True` - Description: Controls visibility of the password change UI in **Settings > Account**. When set to `False`, users do not see the password update form, which is useful for SSO-focused deployments where password changes should not be presented in the UI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PASSWORD_AUTH` @@ -181,14 +181,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `str` - Default: `en` - Description: Sets the default locale for the application. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODELS` - Type: `str` - Default: Empty string (' '), since `None`. - Description: Sets a default Language Model. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_PINNED_MODELS` @@ -196,14 +196,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Default: Empty string (' ') - Description: Comma-separated list of model IDs to pin by default for new users who haven't customized their pinned models. This provides a pre-selected set of frequently used models in the model selector for new accounts. - Example: `gpt-4,claude-3-opus,llama-3-70b` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODEL_METADATA` - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default metadata (capabilities and other model info) for all models. These defaults act as a baseline — per-model overrides always take precedence. 
For capabilities, the defaults and per-model values are merged (per-model wins on conflicts). For other metadata fields, the default is only applied if the model has no value set. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_metadata`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_metadata`. :::info @@ -220,7 +220,7 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default parameters (temperature, top_p, max_tokens, seed, etc.) for all models. These defaults are applied as a baseline at chat completion time — per-model parameter overrides always take precedence. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_params`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_params`. :::info @@ -240,14 +240,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - `admin` - New users are automatically activated with administrator permissions. - Default: `pending` - Description: Sets the default role assigned to new users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_ID` - Type: `str` - Default: Empty string (' ') - Description: Sets the default group ID to assign to new users upon registration. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_SHARE_PERMISSION` @@ -261,63 +261,63 @@ is also being used and set to `True`. 
**Never disable this if OAUTH/SSO is not b - Type: `str` - Default: Empty string (' ') - Description: Sets a custom title for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `PENDING_USER_OVERLAY_CONTENT` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text content for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CALENDAR` - Type: `bool` - Default: `True` - Description: Enables or disables the Calendar feature. When enabled, users can create calendars, manage events, and share calendars with other users or groups via access grants. Active automations are automatically surfaced as virtual events on a dedicated "Scheduled Tasks" calendar. Requires the `features.calendar` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CHANNELS` - Type: `bool` - Default: `False` - Description: Enables or disables channel support. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_FOLDERS` - Type: `bool` - Default: `True` - Description: Enables or disables the folders feature, allowing users to organize their chats into folders in the sidebar. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `FOLDER_MAX_FILE_COUNT` - Type: `int` - Default: `("") empty string` - Description: Sets the maximum number of files processing allowed per folder. -- Persistence: This environment variable is a `PersistentConfig` variable. 
It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. +- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. #### `ENABLE_AUTOMATIONS` - Type: `bool` - Default: `True` - Description: Enables or disables the Automations feature globally. When disabled, the scheduler skips automation processing, the automation API endpoints return `403 Forbidden`, automation builtin tools are not injected, and the Automations entry is hidden from the sidebar. Requires the `features.automations` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MAX_COUNT` - Type: `int` - Default: `("") empty string` (unlimited) - Description: Sets the maximum number of automations a non-admin user can create. When set to a positive integer, users who reach this limit will receive a `403 Forbidden` error when attempting to create additional automations. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MIN_INTERVAL` - Type: `int` (seconds) - Default: `("") empty string` (no minimum) - Description: Sets the minimum allowed interval in seconds between automation recurrences for non-admin users. When set, any automation schedule that recurs more frequently than this value will be rejected with a `400 Bad Request` error. One-time automations (`COUNT=1`) are exempt from this check. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. 
:::tip Common values for AUTOMATION_MIN_INTERVAL @@ -347,20 +347,20 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `bool` - Default: `True` - Description: Enables or disables the notes feature, allowing users to create and manage personal notes within Open WebUI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_MEMORIES` - Type: `bool` - Default: `True` - Description: Enables or disables the [memory feature](/features/chat-conversations/memory), allowing models to store and retrieve long-term information about users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `WEBHOOK_URL` - Type: `str` - Description: Sets a webhook for integration with Discord/Slack/Microsoft Teams. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::note Admin posture toggles vs. security boundaries @@ -416,14 +416,14 @@ Treat anything in this cluster as *what the admin sees and does in the product U - Type: `bool` - Default: `False` - Description: Enables or disables user webhooks. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `RESPONSE_WATERMARK` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text that will be included when you copy a message in the chat. e.g., `"This text is AI generated"` -> will add "This text is AI generated" to every message, when copied. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. 
#### `IFRAME_CSP` @@ -434,12 +434,15 @@ Treat anything in this cluster as *what the admin sees and does in the product U #### `THREAD_POOL_SIZE` - Type: `int` -- Default: `0` -- Description: Sets the thread pool size for FastAPI/AnyIO blocking calls. By default (when set to `0`) FastAPI/AnyIO use `40` threads. In case of large instances and many concurrent users, it may be needed to increase `THREAD_POOL_SIZE` to prevent blocking. +- Default: `0` (unset — the AnyIO default limit of `40` applies) +- Description: Sets the maximum number of **concurrent** blocking operations that may run in the AnyIO worker thread pool at once. Open WebUI offloads synchronous/blocking work (many DB calls, file I/O, sync route handlers, some library calls) to this pool via `run_in_threadpool`. The value is a **concurrency ceiling (a token limit), not a fixed pool of pre-spawned OS threads and not a CPU-core/thread count**: worker threads are created lazily only when needed and reused, so a high value does **not** by itself create that many threads, consume CPU, or cause CPU contention while idle. It only raises how many blocking operations can be in flight simultaneously before the rest must queue. -:::info +:::warning Set this high on any real server (2000+); never lower it +The AnyIO default of `40` is far too low for production. When more than `THREAD_POOL_SIZE` blocking operations are needed at once (many users acting at the same time, or a few users each triggering several blocking calls), every further request **waits** for a free slot. The symptom is the whole app appearing to **hang / freeze / stop responding** under load, even though CPU and memory look fine — it is pool starvation, not resource exhaustion. -If you are running larger instances, you WILL NEED to set this to a higher value like multiple hundreds if not thousands (e.g. `1000`) otherwise your app may get stuck the default pool size (which is 40 threads) is full and will not react anymore. 
+- **Normal servers / production:** `2000` or higher. `2000` is a *lower* bound for very large multi-user instances — going higher is fine and is **not** a CPU or contention risk (it is a ceiling, not a preallocation). +- **Never decrease below the default.** An idle high ceiling costs effectively nothing; a low ceiling causes freezes. +- **Exception — weak hardware (Raspberry Pi, tiny VPS, containers capped at ~250m CPU / very low RAM):** do **not** set `2000` here. Each *genuinely concurrent* blocking op still uses a real OS thread (stack memory), so on a tiny device an enormous ceiling lets a traffic burst spawn enough threads to exhaust RAM. Leave it at the default, or set a modest value (e.g. a few hundred) matched to what the device can actually absorb. This caveat applies only to constrained single-board / micro deployments — any normal server should use `2000+`. ::: @@ -454,21 +457,21 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `bool` - Default: `True` - Description: Toggles whether to show admin user details in the interface. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PUBLIC_ACTIVE_USERS_COUNT` - Type: `bool` - Default: `True` - Description: Controls whether the active user count is visible to all users or restricted to administrators only. When set to `False`, only admin users can see how many users are currently active, reducing backend load and addressing privacy concerns in large deployments. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_USER_STATUS` - Type: `bool` - Default: `True` - Description: Globally enables or disables user status functionality. 
When disabled, the status UI (including blinking active/away indicators and status messages) is hidden across the application, and user status API endpoints are restricted. -- Persistence: This environment variable is a `PersistentConfig` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. +- Persistence: This environment variable is a `ConfigVar` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. #### `ENABLE_EASTER_EGGS` @@ -480,7 +483,7 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `str` - Description: Sets the admin email shown by `SHOW_ADMIN_DETAILS` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENV` @@ -566,13 +569,13 @@ Enabling `ENABLE_REALTIME_CHAT_SAVE` causes every single token generated by the - Type: `bool` - Default: `True` -- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `PersistentConfig` and cannot be changed from the Admin UI.** +- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. 
When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `ConfigVar` and cannot be changed from the Admin UI.** #### `PROFILE_IMAGE_ALLOWED_MIME_TYPES` - Type: `str` (comma-separated MIME types) - Default: `image/png,image/jpeg,image/gif,image/webp` -- Description: Allowlist of MIME types accepted when serving a base64 `data:` URI as a profile image. The MIME type is parsed from the data URI prefix and checked against this list before the response is streamed; non-allowlisted types fall through to the bundled default image. Responses also set `X-Content-Type-Options: nosniff` to prevent the browser from sniffing the body into an executable type. SVG is intentionally not in the default list because it can carry inline `