Feat: Add crawl4ai as an option for local scrape#51

Open
relic664 wants to merge 5 commits into danny-avila:main from relic664:feature/crawl4ai-with-fit

relic664 commented Jan 28, 2026

This PR adds crawl4ai as an option for local scraping, to supplement the existing firecrawl option. There will be a complementary PR in the main LibreChat repo, which I've drafted here

This PR builds on @lukolszewski's original work and adds fit as the default option for crawl4ai's /md endpoint. With fit, crawl4ai applies adaptive filtering to the markdown; setting fitStrategy to raw returns the unfiltered markdown instead.

I've tested this locally and it works fine. I have a Docker image for testing if anybody is interested (ghcr.io/relic664/librechat:latest).

It's worth noting that this is a very basic implementation, meant to provide a simple self-hosted scrape option. It doesn't expose any scrape options beyond fit (filtered markdown) or raw (raw markdown). Given that there's currently only one self-hosted option for scrape, I thought it was prudent to make an MVP PR for crawl4ai before a full-featured implementation with all the configuration knobs.

For a quick start, set the env var CRAWL4AI_API_URL to your instance's URL and add the following to your librechat.config:

webSearch:
  crawl4aiApiUrl: "${CRAWL4AI_API_URL}"
  scraperProvider: "crawl4ai"
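
To illustrate how the two keys above might be consumed, here is a minimal sketch of resolving the crawl4ai base URL from config and environment. It is not the code in this PR: the `WebSearchConfig` shape, `resolveEnvPlaceholders`, and `resolveCrawl4aiUrl` are hypothetical names, and the `${VAR}` substitution is only assumed to behave this way.

```typescript
// Hypothetical shape: only crawl4aiApiUrl and scraperProvider come from the
// PR description; everything else in this sketch is an assumption.
interface WebSearchConfig {
  crawl4aiApiUrl?: string;
  scraperProvider?: string;
}

type Env = Record<string, string | undefined>;

// Replace ${VAR} placeholders with values from the given environment map.
// Placeholders without a matching variable are left untouched.
function resolveEnvPlaceholders(value: string, env: Env): string {
  return value.replace(/\$\{(\w+)\}/g, (match: string, name: string) => env[name] ?? match);
}

// Return the crawl4ai base URL, or undefined when crawl4ai is not selected
// or the URL cannot be resolved.
function resolveCrawl4aiUrl(config: WebSearchConfig, env: Env): string | undefined {
  if (config.scraperProvider !== 'crawl4ai') {
    return undefined;
  }
  const raw = config.crawl4aiApiUrl ?? env['CRAWL4AI_API_URL'];
  if (!raw) {
    return undefined;
  }
  const resolved = resolveEnvPlaceholders(raw, env);
  // A leftover placeholder means the referenced env var was not set.
  if (resolved.includes('${')) {
    return undefined;
  }
  // Strip trailing slashes so endpoint paths can be appended safely.
  return resolved.replace(/\/+$/, '');
}
```

In this sketch, a missing CRAWL4AI_API_URL yields `undefined` rather than a literal `${CRAWL4AI_API_URL}` string, so a caller could fall back to another scraper or report a clear configuration error.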

lukolszewski and others added 5 commits November 11, 2025 13:21
Implements Crawl4AI scraper alongside existing Firecrawl and Serper scrapers
for web content extraction in the search tool.

Changes:
- Add 'crawl4ai' to ScraperProvider type
- Add Crawl4AIScraperConfig and Crawl4AIScrapeResponse interfaces
- Create crawl4ai-scraper.ts implementing BaseScraper interface
- Update tool.ts to support Crawl4AI configuration and selection
- Support extraction and chunking strategies
- Use Bearer token authentication
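
As a rough illustration of the last two points combined with the fit/raw choice from the description, a request to the /md endpoint could be assembled as sketched below. This is not the contents of crawl4ai-scraper.ts: `Crawl4AIRequestOptions` and `buildMdRequest` are hypothetical, and the `url` and `f` body fields are an assumption about the crawl4ai server's /md payload that should be checked against the crawl4ai documentation.

```typescript
type FitStrategy = 'fit' | 'raw';

// Hypothetical options object; the real Crawl4AIScraperConfig may differ.
interface Crawl4AIRequestOptions {
  apiUrl: string;
  apiKey?: string;
  fitStrategy?: FitStrategy;
}

interface MdRequest {
  endpoint: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Build (but do not send) a POST request for crawl4ai's /md endpoint.
function buildMdRequest(targetUrl: string, options: Crawl4AIRequestOptions): MdRequest {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  if (options.apiKey) {
    // Bearer token authentication, as listed in the commit message.
    headers['Authorization'] = `Bearer ${options.apiKey}`;
  }
  return {
    endpoint: `${options.apiUrl.replace(/\/+$/, '')}/md`,
    init: {
      method: 'POST',
      headers,
      // Assumed payload: `f` selects filtered ("fit") or unfiltered ("raw")
      // markdown; "fit" is the default, matching the PR description.
      body: JSON.stringify({ url: targetUrl, f: options.fitStrategy ?? 'fit' }),
    },
  };
}
```

The result can be passed straight to `fetch(request.endpoint, request.init)`. Keeping request construction separate from the network call makes the default-to-fit behavior easy to unit test without a running crawl4ai instance.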