Added document parser OCR strategy & enable it by dlew · Pull Request #107 · newjersey/LibreChat

dlew · 2026-02-11T16:55:53Z

The document parser uses libraries to extract text from known document types. This lets LibreChat handle some complex document types without relying on a secondary service (such as Mistral or a separately deployed RAG API server).

To enable the document parser, set the OCR strategy to "document_parser" in librechat.yaml.
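As a rough sketch of that setting, assuming the strategy sits under a top-level `ocr` key as it does for LibreChat's other OCR strategies (the exact key layout is not shown in this PR description, so check it against your own librechat.yaml):

```yaml
# librechat.yaml (fragment)
# Assumption: the OCR strategy is configured under a top-level `ocr` key.
ocr:
  strategy: "document_parser"
```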

We now support:

- PDFs using pdfjs
- DOCX using mammoth
- XLS/XLSX using SheetJS

(The associated packages were also added to the project.)
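To illustrate the format-to-library pairing listed above, here is a minimal sketch of picking a parser from a file name. This is not the code from the PR: `ParserName`, `PARSER_BY_EXTENSION`, and `pickParser` are hypothetical names, and choosing by file extension (rather than, say, MIME type) is an assumption made for the example.

```typescript
// Hypothetical sketch; names and extension-based matching are assumptions,
// not the PR's implementation.
type ParserName = "pdfjs" | "mammoth" | "sheetjs";

// Lowercase file extension -> library named in the PR description.
const PARSER_BY_EXTENSION = new Map<string, ParserName>([
  ["pdf", "pdfjs"],
  ["docx", "mammoth"],
  ["xls", "sheetjs"],
  ["xlsx", "sheetjs"],
]);

// Returns the parser for a file name, or undefined for unsupported types
// (including names with no extension or a trailing dot).
function pickParser(filename: string): ParserName | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0 || dot === filename.length - 1) {
    return undefined;
  }
  const extension = filename.slice(dot + 1).toLowerCase();
  return PARSER_BY_EXTENSION.get(extension);
}
```

A caller could fall back to another strategy, or skip text extraction, whenever `pickParser` returns `undefined`.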

(This is the same as the upstream PR danny-avila#11519, but we're merging it in now so we can release soon.)

Part of https://github.com/newjersey/innovation-platform-pm/issues/982


entropic489 · 2026-02-11T20:03:07Z

looks good and works

dlew added 2 commits February 11, 2026 10:51

Enable document parser for NJ

9feccd5

dlew self-assigned this Feb 11, 2026

dlew added the pr-review label Feb 11, 2026

ooi-pull-request-app Bot removed the pr-review label Feb 11, 2026

ooi-pull-request-app Bot requested a review from entropic489 February 11, 2026 16:56

dlew changed the title ~~Added "document parser" OCR strategy & enable it~~ Added document parser OCR strategy & enable it Feb 11, 2026

entropic489 approved these changes Feb 11, 2026


dlew merged commit 916a417 into newjersey Feb 11, 2026
12 of 14 checks passed

dlew deleted the dlew/document-parser-ocr-nj branch February 11, 2026 20:12
