Add support for scanning Zstandard (zstd) compressed files #1700
burakemir wants to merge 5 commits into Cisco-Talos:main
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 023dc88b12
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
@codex review
💡 Codex Review
Reviewed commit: 8d12099f4c
ClamAV did not previously detect or decompress zstd-compressed files. This adds full support using the streaming ZSTD_decompressStream() API, following the same pattern as gzip/bzip2/xz decompression.
Changes:
- Magic byte detection (28 b5 2f fd) and CL_TYPE_ZSTD file type
- ARCH_CONF_ZSTD dconf flag for enable/disable control
- cli_scanzstd() streaming decompression scanner
- FindZSTD.cmake module and libzstd linkage
- clamscan test with zstd-compressed test fixture
- Document libzstd as a required dependency in INSTALL.md
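For readers unfamiliar with the format, the magic byte check listed above can be sketched in a few lines of C. This is only an illustration: `looks_like_zstd` and `ZSTD_MAGIC_BYTES` are hypothetical names, not identifiers from this PR, and ClamAV's real file typing may well be driven by its signature tables rather than a helper like this.

```c
#include <stddef.h>

/* Zstandard frame magic 0xFD2FB528 as it appears on disk (little-endian). */
static const unsigned char ZSTD_MAGIC_BYTES[4] = { 0x28, 0xb5, 0x2f, 0xfd };

/* Hypothetical helper: returns 1 if buf starts with the zstd frame magic. */
static int looks_like_zstd(const unsigned char *buf, size_t len)
{
    size_t i;

    if (buf == NULL || len < sizeof(ZSTD_MAGIC_BYTES))
        return 0;
    for (i = 0; i < sizeof(ZSTD_MAGIC_BYTES); i++) {
        if (buf[i] != ZSTD_MAGIC_BYTES[i])
            return 0;
    }
    return 1;
}
```

Declarations are kept at the top of the function so the sketch also compiles as C90.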
The initial implementation exited after the first frame, so a file with a benign first frame and malicious payload in a later frame would be reported clean. Reset the decompressor on frame boundary and continue processing, matching how gzip handles concatenated streams with inflateReset(). Adds a test with a two-frame zstd file where the signature payload is in the second frame.
Two fixes:
- A decompression error after valid frames were already written to the temp file would skip scanning entirely, allowing evasion by appending junk bytes after valid compressed content. Now jump to scanning on error instead of skipping it, matching gzip's behavior.
- Valid zstd frames can decompress to zero bytes (e.g. compressed empty files). Remove the size==0 early-exit that incorrectly classified these as malformed.
Addressed the comments. @codex review
💡 Codex Review
Reviewed commit: e81340f06a
- Drain decompressor internal buffer when output is full, preventing truncated scanning of highly compressible payloads.
- Scan partial output on stream errors instead of skipping scan.
- Move all variable declarations to top of function for C90 compliance (project uses CMAKE_C_STANDARD 90).
@codex review
💡 Codex Review
Reviewed commit: 68e8000420
The pkg-config directories were passed positionally to find_library instead of via HINTS, so non-standard zstd installs discoverable only through pkg-config could fail at configure time.
zstd (Zstandard) is a fast compression library and its associated format: https://en.wikipedia.org/wiki/Zstd
ClamAV did not previously detect or decompress zstd-compressed files. This adds full support using the streaming ZSTD_decompressStream() API, following the same pattern as gzip/bzip2/xz decompression.
Changes:
This whole PR was done by Claude; I asked it to add a test and ran it. Take it or leave it!
I ran with MAINTAINER_MODE=ON to check that bindgen produces the same renumbering in "sys.rs".