12 changes: 12 additions & 0 deletions .gitmodules
@@ -0,0 +1,12 @@
[submodule "dev/rag_api"]
path = dev/rag_api
url = https://github.com/kalaspuffar/rag_api.git
branch = reranker
[submodule "dev/librechat"]
path = dev/librechat
url = https://github.com/kalaspuffar/LibreChat.git
branch = new/feature/simple_reranker
[submodule "dev/agents"]
path = dev/agents
url = https://github.com/kalaspuffar/agents.git
branch = simple_reranker
66 changes: 66 additions & 0 deletions dev/README.md
@@ -0,0 +1,66 @@
# Development Submodules

Git submodules used for local development and for testing pull requests.

## Setup

### 1. Initialize Submodules

```bash
git submodule update --init --remote
```

This checks out the branches specified in `.gitmodules`.

### 1.1. Build Agents Package

Since `agents` is an npm package used by LibreChat, build it before starting:

```bash
cd dev/agents
npm install
npm run build
cd ../..
```

### 2. Build and Start

To use local builds from submodules, include the override file:

```bash
docker compose -f docker-compose.librechat.yml -f docker-compose.librechat.override.yml build
docker compose -f docker-compose.librechat.yml -f docker-compose.librechat.override.yml up -d
```

To use published images, omit the override file:

```bash
docker compose -f docker-compose.librechat.yml build
docker compose -f docker-compose.librechat.yml up -d
```

## Testing the Reranker

```bash
curl -s http://localhost:8000/rerank \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $JWT_TOKEN" \
-d '{
"query": "I love you",
"docs": ["I hate you", "I really like you"],
"k": 5
}'
```
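
For scripting against the endpoint, the JSON body from the curl call can be assembled with a small helper. The payload fields (`query`, `docs`, `k`) come from this PR; the helper itself is only an illustration:

```bash
# Illustrative helper (not part of this PR): build the rerank payload
# that the curl example above sends.
rerank_payload() {
  query=$1; shift
  k=$1; shift
  docs=$(printf '"%s",' "$@"); docs="[${docs%,}]"
  printf '{"query": "%s", "docs": %s, "k": %s}' "$query" "$docs" "$k"
}

rerank_payload "I love you" 5 "I hate you" "I really like you"
```

Pipe the result into the same `curl` invocation with `-d @-` once the stack is running.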

## Update Submodules

```bash
git submodule update --remote
```

## Switch Between Local and Published Images

Since the override file is not automatically loaded, simply include or omit it in your commands:

- **Local builds**: Include `-f docker-compose.librechat.override.yml`
- **Published images**: Omit the override file flag
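
A small wrapper function (hypothetical, not part of the repo) makes the switch less error-prone. It only prints the command so you can confirm which files would be passed:

```bash
# Hypothetical convenience wrapper: pass "local" to include the override
# file, anything else to use published images. Prints the resulting
# compose command instead of running it.
lc() {
  files="-f docker-compose.librechat.yml"
  if [ "$1" = "local" ]; then
    files="$files -f docker-compose.librechat.override.yml"
  fi
  shift
  echo docker compose $files "$@"
}

lc local up -d      # local builds: override file included
lc published up -d  # published images: override file omitted
```

Replace `echo docker compose` with `docker compose` once the output looks right.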
1 change: 1 addition & 0 deletions dev/agents
Submodule agents added at 42d90e
1 change: 1 addition & 0 deletions dev/librechat
Submodule librechat added at 2b8578
1 change: 1 addition & 0 deletions dev/rag_api
Submodule rag_api added at 563790
20 changes: 20 additions & 0 deletions docker-compose.librechat.override.yml
@@ -0,0 +1,20 @@
# Override file enabling local builds from the dev/ submodules
# Not loaded automatically (custom file name): pass it explicitly with -f alongside docker-compose.librechat.yml

services:
api:
image: librechat:local
build:
context: ./dev/librechat
dockerfile: Dockerfile
target: node
volumes:
- ./dev/agents/dist:/app/node_modules/@librechat/agents/dist

rag_api:
image: rag_api:local
build:
context: ./dev/rag_api
dockerfile: Dockerfile


4 changes: 4 additions & 0 deletions docker-compose.librechat.yml
@@ -79,6 +79,8 @@ services:
rag_api:
container_name: rag_api
image: ghcr.io/danny-avila/librechat-rag-api-dev-lite:latest
ports:
- "${LIBRECHAT_RAG_PORT:-8000}:8000"
env_file:
- .env
environment:
@@ -87,6 +89,8 @@
- RAG_OPENAI_API_KEY=${RAG_OPENAI_API_KEY:-${OPENROUTER_API_KEY:-}}
- RAG_OPENAI_BASEURL=${RAG_OPENAI_BASEURL:-${OPENROUTER_BASE_URL:-}}
- OPENAI_API_KEY=${OPENAI_API_KEY:-${OPENROUTER_API_KEY:-}}
- SIMPLE_RERANKER_MODEL_NAME=${SIMPLE_RERANKER_MODEL_NAME:-mixedbread-ai/mxbai-rerank-large-v2}
- SIMPLE_RERANKER_MODEL_TYPE=${SIMPLE_RERANKER_MODEL_TYPE:-cross-encoder}
restart: always
depends_on:
- vectordb
42 changes: 39 additions & 3 deletions env.example
@@ -38,13 +38,49 @@ LIBRECHAT_MEILI_MASTER_KEY=change-me-meili-master-key
LIBRECHAT_RAG_PORT=8000
LIBRECHAT_RAG_API_URL=http://rag_api:8000


# RAG API - Embeddings (uses OpenRouter by default via docker-compose)
# Uncomment and set if you want to use different credentials than OpenRouter
#RAG_OPENAI_API_KEY=sk-...
#RAG_OPENAI_BASEURL=https://openrouter.ai/api/v1
#OPENAI_API_KEY=sk-...

# RAG API - Simple Reranker (local models, not via OpenRouter)
# Reranking models run locally inside the rag_api service, not through the OpenRouter API
# See https://github.com/AnswerDotAI/rerankers for available models
#
# IMPORTANT: The reranker is called ONCE PER SCRAPED URL during web search.
# If 5 URLs are scraped, the reranker runs 5 times. Choose models accordingly.
#
# Option 1 (high quality but CPU-intensive without GPU):
SIMPLE_RERANKER_MODEL_NAME=mixedbread-ai/mxbai-rerank-large-v2
SIMPLE_RERANKER_MODEL_TYPE=cross-encoder
#
# RECOMMENDED FOR CPU-ONLY (no GPU): FlashRank models are ONNX-optimized
# and 5-10x faster on CPU with lower memory footprint. Good multilingual support.
# Option 2 - FlashRank (fastest, recommended for CPU):
#SIMPLE_RERANKER_MODEL_NAME=ms-marco-MiniLM-L-12-v2
#SIMPLE_RERANKER_MODEL_TYPE=flashrank
#
# Option 3 - FlashRank default (auto-selects best model):
#SIMPLE_RERANKER_MODEL_NAME=flashrank
#SIMPLE_RERANKER_MODEL_TYPE=flashrank
#
# Option 4 - Smaller cross-encoder (if you prefer cross-encoder architecture):
#SIMPLE_RERANKER_MODEL_NAME=cross-encoder/ms-marco-MiniLM-L-6-v2
#SIMPLE_RERANKER_MODEL_TYPE=cross-encoder
# Note: L-6-v2 has 22M parameters (very small), L-12-v2 has 33M (small)
#
# Option 5 - Multilingual cross-encoder (moderate size, good for German):
#SIMPLE_RERANKER_MODEL_NAME=BAAI/bge-reranker-base
#SIMPLE_RERANKER_MODEL_TYPE=cross-encoder
#
# PERFORMANCE NOTES:
# - FlashRank models: Best for CPU-only, ONNX optimized, fast inference
# - Cross-encoder models: Higher quality but slower on CPU, better with GPU
# - Large models (like mxbai-rerank-large-v2): Require significant CPU/GPU resources
# - For German/multilingual: All above models support multiple languages
# - Model is loaded once at startup and reused for all requests
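#
# The compose file applies the same fallbacks via ${VAR:-default}. In shell
# terms, the resolution these defaults rely on looks like this (an
# illustration, not the actual rag_api startup code):

```bash
# Illustrative only: resolve the two reranker variables with the same
# defaults that docker-compose.librechat.yml declares.
model_name=${SIMPLE_RERANKER_MODEL_NAME:-mixedbread-ai/mxbai-rerank-large-v2}
model_type=${SIMPLE_RERANKER_MODEL_TYPE:-cross-encoder}
echo "reranker: $model_name ($model_type)"
```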

# LibreChat - Vector Database (PostgreSQL)
LIBRECHAT_VECTORDB_DB=mydatabase
LIBRECHAT_VECTORDB_USER=myuser
@@ -58,7 +94,7 @@ GID=1000
SEARXNG_SECRET_KEY=change-me-searxng-secret
SEARXNG_BASE_URL=http://localhost:8080/
SEARXNG_INSTANCE_URL=http://searxng:8080
SEARXNG_API_KEY=
SEARXNG_API_KEY=local-selfhost

# Firecrawl - Self-hosted stack
FIRECRAWL_PORT=3003
@@ -77,7 +113,7 @@ FIRECRAWL_API_KEY=local-selfhost
FIRECRAWL_VERSION=v2
FIRECRAWL_NUM_WORKERS=8
FIRECRAWL_BULL_AUTH_KEY=my-secret-key
FIRECRAWL_LLM_MODEL=gpt-4o
FIRECRAWL_LLM_MODEL=mistralai/ministral-8b-2512

# Jina Reranker (temporary - will be replaced with RAG API reranker)
JINA_API_KEY=
4 changes: 1 addition & 3 deletions librechat.yaml
@@ -85,8 +85,6 @@ webSearch:
firecrawlApiKey: "${FIRECRAWL_API_KEY}"
firecrawlApiUrl: "${FIRECRAWL_API_URL}"
firecrawlVersion: "${FIRECRAWL_VERSION}"
jinaApiKey: "${JINA_API_KEY}"
jinaApiUrl: "${JINA_API_URL}"
rerankerType: "jina"
rerankerType: "simple"
scraperTimeout: 7500
safeSearch: 1