Added document parser OCR strategy & enable it #107

Merged: dlew merged 2 commits into newjersey from dlew/document-parser-ocr-nj on Feb 11, 2026

dlew (Collaborator) commented Feb 11, 2026

The document parser uses libraries to extract the text from known document types. This lets LibreChat handle some complex document types without relying on a secondary service (such as Mistral or a separately hosted RAG API server).

To enable the document parser, set the ocr strategy to "document_parser" in librechat.yaml.
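
As a rough sketch of that setting: the fragment below assumes the strategy sits under a top-level `ocr` key in librechat.yaml. Only the value "document_parser" comes from this description, so treat the surrounding layout as an assumption and check it against the configuration reference for your version.

```yaml
# librechat.yaml (fragment; key layout is an assumption, see note above)
ocr:
  strategy: "document_parser"  # value named in this PR description
```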

We now support:

  • PDFs using pdfjs
  • DOCX using mammoth
  • XLS/XLSX using SheetJS

(The associated packages were also added to the project.)
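
To illustrate the mapping in the list above, here is a minimal TypeScript sketch of choosing a parsing library from a file extension. It is not taken from the PR's code: `ParserLibrary`, `PARSER_BY_EXTENSION`, and `pickParser` are hypothetical names, and the real implementation may well key on MIME type or something else entirely.

```typescript
// Hypothetical sketch; none of these names come from the LibreChat codebase.
type ParserLibrary = "pdfjs" | "mammoth" | "sheetjs";

// Extension-to-library table mirroring the supported formats listed above.
// A Map is used so that lookups never hit inherited object properties.
const PARSER_BY_EXTENSION = new Map<string, ParserLibrary>([
  ["pdf", "pdfjs"],
  ["docx", "mammoth"],
  ["xls", "sheetjs"],
  ["xlsx", "sheetjs"],
]);

/** Returns the library that would handle the file, or null if unsupported. */
function pickParser(filename: string): ParserLibrary | null {
  const dot = filename.lastIndexOf(".");
  // No extension, or a trailing dot with nothing after it.
  if (dot < 0 || dot === filename.length - 1) {
    return null;
  }
  const extension = filename.slice(dot + 1).toLowerCase();
  return PARSER_BY_EXTENSION.get(extension) ?? null;
}
```

With this sketch, `pickParser("report.PDF")` returns `"pdfjs"` and `pickParser("notes.txt")` returns `null`, so unsupported files can fall through to whatever handling exists for them.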

(This is the same as the upstream PR danny-avila#11519, but we're merging it now so we can release soon.)

Part of https://github.com/newjersey/innovation-platform-pm/issues/982

dlew added 2 commits February 11, 2026 10:51
dlew self-assigned this Feb 11, 2026
dlew added the pr-review label Feb 11, 2026
dlew changed the title from 'Added "document parser" OCR strategy & enable it' to 'Added document parser OCR strategy & enable it' Feb 11, 2026
@entropic489

looks good and works

dlew merged commit 916a417 into newjersey Feb 11, 2026
12 of 14 checks passed
dlew deleted the dlew/document-parser-ocr-nj branch February 11, 2026 20:12