Add support for scanning Zstandard (zstd) compressed files #1700

Open
burakemir wants to merge 5 commits into Cisco-Talos:main from burakemir:zstd

@burakemir

zstd is a fast compression library and its associated format: https://en.wikipedia.org/wiki/Zstd

ClamAV did not previously detect or decompress zstd-compressed files. This adds full support using the streaming ZSTD_decompressStream() API, following the same pattern as gzip/bzip2/xz decompression.

Changes:

  • Magic byte detection (28 b5 2f fd) and CL_TYPE_ZSTD file type
  • ARCH_CONF_ZSTD dconf flag for enable/disable control
  • cli_scanzstd() streaming decompression scanner
  • FindZSTD.cmake module and libzstd linkage
  • clamscan test with zstd-compressed test fixture
  • Document libzstd as a required dependency in INSTALL.md
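The magic byte check in the first item can be sketched as follows. This is only an illustration, not the code from the PR: the names `zstd_magic` and `looks_like_zstd` are hypothetical, and ClamAV's real file-type detection is driven by its own signature tables rather than a helper like this. The only fact taken from the description is the byte sequence 28 b5 2f fd.

```c
#include <stddef.h>
#include <string.h>

/* Zstandard frame magic number 0xFD2FB528, stored little-endian on disk. */
static const unsigned char zstd_magic[4] = {0x28, 0xB5, 0x2F, 0xFD};

/* Hypothetical helper: returns 1 if buf starts with the zstd frame magic,
 * 0 otherwise (including when buf is NULL or shorter than 4 bytes). */
static int looks_like_zstd(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < sizeof(zstd_magic)) {
        return 0;
    }
    return memcmp(buf, zstd_magic, sizeof(zstd_magic)) == 0;
}
```

A caller would pass the first bytes read from the file. Note that this sketch only recognizes regular zstd frames; a file that begins with a skippable frame uses a different magic range and would not match.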

This whole PR was done by Claude; I asked it to add a test and ran it. Take it or leave it!
I ran with MAINTAINER_MODE=ON to check that bindgen produces the same renumbering in "sys.rs".

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 023dc88b12

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread libclamav/scanners.c Outdated
@burakemir
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d12099f4c

Comment thread libclamav/scanners.c Outdated
Comment thread libclamav/scanners.c Outdated

The initial implementation exited after the first frame, so a file
with a benign first frame and a malicious payload in a later frame
would be reported clean. Reset the decompressor on frame boundary
and continue processing, matching how gzip handles concatenated
streams with inflateReset().

Adds a test with a two-frame zstd file where the signature payload
is in the second frame.

Two fixes:
- A decompression error after valid frames were already written to the
  temp file would skip scanning entirely, allowing evasion by appending
  junk bytes after valid compressed content. Now jump to scanning on
  error instead of skipping it, matching gzip's behavior.
- Valid zstd frames can decompress to zero bytes (e.g. compressed empty
  files). Remove the size==0 early-exit that incorrectly classified
  these as malformed.
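
The control flow behind these commits (keep decoding past the first frame, accept empty frames, and still scan whatever was decoded when trailing junk causes an error) can be sketched with a toy container. Everything here is an assumption made for illustration: the format (one length byte followed by that many payload bytes) is not zstd, and `toy_decode_all` and `toy_scan` are hypothetical names, not functions from the PR or from libzstd.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a compressed stream (NOT zstd): each frame is one
 * length byte N followed by N payload bytes. */

/* Decodes every complete frame from in[] into out[].
 * Returns 0 if the whole input was consumed cleanly, -1 if a truncated
 * frame or a full output buffer stopped decoding early. *out_len always
 * receives the number of bytes decoded so far, so the caller can still
 * scan the partial output. */
static int toy_decode_all(const unsigned char *in, size_t in_len,
                          unsigned char *out, size_t out_cap,
                          size_t *out_len)
{
    size_t pos = 0;
    size_t written = 0;
    size_t n;
    int status = 0;

    while (pos < in_len) {          /* keep going past the first frame */
        n = in[pos];
        pos++;
        if (n > in_len - pos || n > out_cap - written) {
            status = -1;            /* junk or truncation: stop, keep output */
            break;
        }
        if (n > 0) {                /* n == 0 is a valid empty frame */
            memcpy(out + written, in + pos, n);
            written += n;
            pos += n;
        }
    }
    *out_len = written;
    return status;
}

/* Hypothetical scan step: returns 1 if needle occurs in buf, else 0. */
static int toy_scan(const unsigned char *buf, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    size_t i;

    if (nlen == 0 || nlen > len) {
        return 0;
    }
    for (i = 0; i + nlen <= len; i++) {
        if (memcmp(buf + i, needle, nlen) == 0) {
            return 1;
        }
    }
    return 0;
}
```

The point of the design is that the caller runs `toy_scan` on `out[0..*out_len)` regardless of the return value of `toy_decode_all`, so appending junk after valid frames cannot suppress the scan.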
@burakemir
Author

Addressed the comments.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e81340f06a

Comment thread libclamav/scanners.c Outdated

- Drain decompressor internal buffer when output is full, preventing
  truncated scanning of highly compressible payloads.
- Scan partial output on stream errors instead of skipping scan.
- Move all variable declarations to top of function for C90 compliance
  (project uses CMAKE_C_STANDARD 90).
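
The draining issue from the first item can be illustrated with a toy run-length decoder. This is a sketch under stated assumptions, not the PR's code: the (count, byte) format is not zstd, and `toy_rle`, `toy_rle_step`, and `toy_rle_total` are hypothetical names. What it shows is the loop condition: a loop that stops as soon as the input is consumed would miss output the decoder is still holding, so the loop keeps calling the decoder while the last call filled the output buffer. With the real streaming API the analogous signal would be the output position reaching the output size. Declarations sit at the top of each function to stay within C90.

```c
#include <stddef.h>

/* Toy run-length decoder (NOT zstd): input is (count, byte) pairs. The
 * unfinished run lives in the decoder state, much like a streaming
 * decompressor that holds output it could not deliver yet. */
typedef struct {
    unsigned pending;       /* bytes of the current run still to emit */
    unsigned char value;    /* byte repeated by the current run */
} toy_rle;

/* One streaming step: fills out[] up to out_cap, consuming input pairs as
 * needed. Advances *in_pos and returns the number of bytes written. */
static size_t toy_rle_step(toy_rle *st, const unsigned char *in, size_t in_len,
                           size_t *in_pos, unsigned char *out, size_t out_cap)
{
    size_t written = 0;

    while (written < out_cap) {
        if (st->pending == 0) {
            if (in_len - *in_pos < 2) {
                break;              /* no complete pair left */
            }
            st->pending = in[*in_pos];
            st->value = in[*in_pos + 1];
            *in_pos += 2;
            continue;
        }
        out[written++] = st->value;
        st->pending--;
    }
    return written;
}

/* Decodes the whole input through a 64-byte buffer and returns the total
 * decoded size. The loop repeats while the last call filled the buffer,
 * because the decoder may still hold output after all input is consumed. */
static size_t toy_rle_total(const unsigned char *in, size_t in_len)
{
    toy_rle st = {0, 0};
    unsigned char buf[64];
    size_t in_pos = 0;
    size_t total = 0;
    size_t got;

    do {
        got = toy_rle_step(&st, in, in_len, &in_pos, buf, sizeof(buf));
        total += got;   /* a real scanner would append buf[0..got) to its temp file */
    } while (got == sizeof(buf));
    return total;
}
```

For a single pair with count 200, the input is fully consumed by the first call while 136 bytes are still pending, which is the truncation case the commit describes for highly compressible payloads.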
@burakemir
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68e8000420

Comment thread cmake/FindZSTD.cmake Outdated

The pkg-config directories were passed positionally to find_library
instead of via HINTS, so non-standard zstd installs discoverable only
through pkg-config could fail at configure time.