Clam 2502 pdf first page rendering #1694
Open
jhumlick wants to merge 5 commits into CLAM-2502-pdf-first-page-rendering-feature from
Add clamscan/clamd options to enable or disable PDF render fuzzy hashing, support DPI or WIDTHxHEIGHT canvas render settings, preserve rendered output with leave-temps, and add a clamscan unit test for canvas parsing. CLAM-2817
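The canvas setting accepts either a single DPI value or a WIDTHxHEIGHT pair. A minimal sketch of how such a value could be parsed is shown here. The names `canvas_t`, `parse_positive`, and `parse_canvas` are hypothetical and are not taken from this PR, and the rule of rejecting zero and signed values is an assumption about what counts as valid.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical types and names, for illustration only. */
typedef enum { CANVAS_DPI, CANVAS_SIZE } canvas_kind_t;

typedef struct {
    canvas_kind_t kind;
    unsigned long dpi;    /* meaningful when kind == CANVAS_DPI  */
    unsigned long width;  /* meaningful when kind == CANVAS_SIZE */
    unsigned long height; /* meaningful when kind == CANVAS_SIZE */
} canvas_t;

/* Parse a positive decimal integer at *s and advance *s past it.
 * Returns 0 on success, -1 on failure. */
static int parse_positive(const char **s, unsigned long *out)
{
    char *end = NULL;

    /* Require a leading digit so signs, spaces, and empty input fail. */
    if (**s < '0' || **s > '9') return -1;

    errno = 0;
    unsigned long v = strtoul(*s, &end, 10);
    if (errno != 0 || v == 0) return -1; /* overflow or zero (assumed invalid) */

    *out = v;
    *s = end;
    return 0;
}

/* Accepts "150" (DPI) or "800x600" (WIDTHxHEIGHT).
 * Returns 0 on success, -1 on malformed input. */
int parse_canvas(const char *arg, canvas_t *out)
{
    assert(arg != NULL && out != NULL);

    unsigned long a = 0, b = 0;
    const char *p = arg;

    if (parse_positive(&p, &a) != 0) return -1;

    if (*p == '\0') { /* a lone number is treated as DPI */
        out->kind = CANVAS_DPI;
        out->dpi = a;
        out->width = out->height = 0;
        return 0;
    }

    if (*p != 'x' && *p != 'X') return -1;
    p++;

    /* The height must be the last thing in the string. */
    if (parse_positive(&p, &b) != 0 || *p != '\0') return -1;

    out->kind = CANVAS_SIZE;
    out->dpi = 0;
    out->width = a;
    out->height = b;
    return 0;
}
```

With this sketch, `parse_canvas("150", &c)` yields a DPI canvas and `parse_canvas("800x600", &c)` yields a fixed-size canvas, while inputs such as `800x` or `x600` are rejected.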
Log the rendered PDF image temp path when leave-temps is enabled, and fix stale callers after the Rust FFI signature changes for cli_magic_scan_buff() and fuzzy_hash_calculate_image(). CLAM-2817
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 82dbc5a28c
- Update PDF rendered-image extraction in scanners.c to name temp outputs pdf-render-<source-basename>.<ext> instead of the generic pdf-render.<ext>.
- Fall back to the old generic naming only if the source basename cannot be derived, and free the newly allocated filename/basename buffers on cleanup.
- Change rendered-image fuzzy hash calculation to use should_calculate_image_fuzzy_hash(ctx) so PDF-rendered images honor the PDF-specific fuzzy hash option.
CLAM-2817
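The naming rule described in this commit (basename-derived name, generic fallback, caller frees the buffer) can be sketched as follows. `render_output_name` is a hypothetical helper, not the code in scanners.c, and the assumption that the basename is everything after the last `/` is mine.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "pdf-render-<source-basename>.<ext>", or
 * "pdf-render.<ext>" when no basename can be derived.
 * Returns a heap-allocated string that the caller must free(), or NULL
 * if allocation fails. */
char *render_output_name(const char *source_path, const char *ext)
{
    assert(ext != NULL);

    const char *base = NULL;

    if (source_path != NULL) {
        /* Assumption: the basename is whatever follows the last '/'. */
        const char *slash = strrchr(source_path, '/');
        base = slash ? slash + 1 : source_path;
        if (*base == '\0') base = NULL; /* empty path or trailing slash */
    }

    /* Prefix + optional basename + '.' + extension + terminating NUL. */
    size_t len = strlen("pdf-render-") + (base ? strlen(base) : 0)
               + 1 + strlen(ext) + 1;
    char *name = malloc(len);
    if (name == NULL) return NULL;

    if (base != NULL)
        snprintf(name, len, "pdf-render-%s.%s", base, ext);
    else
        snprintf(name, len, "pdf-render.%s", ext); /* generic fallback */

    return name;
}
```

A caller would pair each successful call with `free()` on its cleanup path, which mirrors the buffer cleanup the commit message mentions.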
- Define HAVE_PDFIUM during configure when PDFium support is actually enabled and found in CMakeLists.txt.
- Compile-gate the rendered PDF image scan helper and its PDF scan callsite in scanners.c so PDFium-specific code is not used in non-PDFium builds.
- Reject PDFium-only options at runtime with clear errors in clamscan/manager.c and clamd/server-th.c when built without PDFium.
- Hide PDFium-only CLI help text in clamscan.c for non-PDFium builds.
- Export the actual HAVE_PDFIUM feature state to Python tests from unit_tests/CMakeLists.txt.
- Skip the PDFium-specific clamscan test class in pdf_render_canvas_test.py unless HAVE_PDFIUM=1.
CLAM-2817
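The combination of compile-time gating and runtime rejection could look roughly like the sketch below. Only the `HAVE_PDFIUM` macro comes from this PR. `PDFIUM_AVAILABLE`, `check_pdfium_option`, the error wording, and the placeholder option name are hypothetical; the real checks live in clamscan/manager.c and clamd/server-th.c and may be structured differently.

```c
#include <assert.h>
#include <stdio.h>

/* HAVE_PDFIUM is defined by the build system when PDFium is enabled and
 * found. PDFIUM_AVAILABLE is a hypothetical convenience macro. */
#ifdef HAVE_PDFIUM
#define PDFIUM_AVAILABLE 1
#else
#define PDFIUM_AVAILABLE 0
#endif

/* Hypothetical runtime check: reject a PDFium-only option when the build
 * lacks PDFium. The availability flag is a parameter so both outcomes can
 * be exercised; real callers would pass PDFIUM_AVAILABLE.
 * Returns 0 if the option is acceptable, -1 (with a message in err) if not. */
int check_pdfium_option(const char *opt_name, int opt_is_set,
                        int pdfium_available, char *err, size_t err_len)
{
    assert(opt_name != NULL && err != NULL && err_len > 0);

    err[0] = '\0';
    if (opt_is_set && !pdfium_available) {
        snprintf(err, err_len,
                 "%s requires a build with PDFium support", opt_name);
        return -1;
    }
    return 0;
}

#ifdef HAVE_PDFIUM
/* PDFium-specific code, such as the rendered-image scan helper and its
 * call site, would be compiled only inside a guard like this one. */
#endif
```

A caller might run `check_pdfium_option("--example-pdfium-option", is_set, PDFIUM_AVAILABLE, err, sizeof err)` during option processing and abort with the message in `err` on failure. The option name there is a placeholder, not a real clamscan flag.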
Add runtime PDF image fuzzy hash rendering feature using PDFium