Fix api documentation inconsistencies#1699
Conversation
cl_fmap_set_hash declared the hash input as a single char, but the implementation passes that value to fmap_set_hash() as if it were a full hash buffer. That causes the API contract to be wrong and can lead to invalid memory reads. Change the public API and implementation to accept a hash buffer pointer, validate that the pointer is not NULL, and update NEWS.md to match the corrected signature.
Fix cl_hash_file_fd_ex so a length of zero hashes the rest of the file as documented. The old loop reduced the read size to zero immediately and returned the digest of an empty input. Keep the existing bounded-read path for explicit lengths, but leave the block size unchanged when length is zero so the function reads until EOF.
Fix the cl_cvdverify_ex documentation to match the current implementation. The header said CL_DB_UNSIGNED disables CVD signature verification, but .cvd files are still verified. Clarify that .cvd verification occurs for .cvd files and drop the unsupported CL_DB_UNSIGNED note from this API comment.
Fix the legacy cl_scanfile wrappers to preserve their original scanned count contract. They should report scanned / CL_COUNT_PRECISION, but after cl_scanfile_ex was added they started returning exact byte counts. Apply the same conversion used by cl_scanmap_callback() in cl_scanfile() and cl_scanfile_callback(), including the 32-bit saturation check on the scaled value.
Fix the cl_fmap_get_name header comment to match the public API. The comment claimed the function returns a const char *, but the prototype and implementation return cl_error_t and write the name through an output parameter. Update the return documentation to describe the actual status code contract.
Fix incorrect cross-references in the scan API header comments. The cl_scandesc_ex() and cl_scanfile_ex() documentation described them as extensions of cl_scanmap_callback(), which is the wrong legacy API. Update each comment to refer to the matching legacy wrapper so the public API relationships are documented correctly.
Fix stale hash API references in the scanmap header comments. The comments pointed users to cl_fmap_get_file_hash(), but that public function does not exist. Update both references to the actual public API, cl_fmap_get_hash(), so the post-scan guidance matches the header.
Fix the cl_scandesc_ex parameter documentation to match the actual API. The comment said callers must provide a file descriptor or a map, but this function only accepts a file descriptor. Remove the stale map reference so the parameter description matches the public prototype.
Fix the cl_fmap_get_path return documentation to match the implementation. The comment said CL_EACCES means the map has no file descriptor, but this getter actually returns CL_EACCES when the map has no stored path. Update the error description so callers see the correct failure condition.
There was a problem hiding this comment.
Pull request overview
This PR audits libclamav/clamav.h against implementations and fixes several API/documentation mismatches, including a couple of behavior/API corrections in libclamav itself.
Changes:
- Fix
cl_fmap_set_hash()to accept a hash buffer pointer (instead of a singlechar) and update related docs/NEWS. - Fix
cl_hash_file_fd_ex()behavior solength == 0hashes to EOF as documented. - Fix legacy
cl_scanfile()/cl_scanfile_callback()wrappers to returnscanned / CL_COUNT_PRECISIONagain, and correct multiple header comment inconsistencies/cross-references.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
NEWS.md |
Updates the published API signature listing for cl_fmap_set_hash(). |
libclamav/scanners.c |
Restores legacy wrapper semantics for scanned (divide by CL_COUNT_PRECISION) and adjusts overflow logic accordingly. |
libclamav/fmap.c |
Implements the cl_fmap_set_hash() signature change and null-checks the new pointer parameter. |
libclamav/crypto.c |
Fixes cl_hash_file_fd_ex() to read to EOF when length == 0. |
libclamav/clamav.h |
Corrects multiple API docs/cross-references and updates the cl_fmap_set_hash() prototype/docs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| extern cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, const char *hash) | ||
| { | ||
| cl_error_t status = CL_ERROR; | ||
| cli_hash_type_t type; | ||
|
|
||
| if (!map || !hash_alg) { | ||
| if (!map || !hash_alg || !hash) { | ||
| status = CL_ENULLARG; |
There was a problem hiding this comment.
cl_fmap_set_hash() now takes const char *hash, but fmap_set_hash() treats it as a binary digest buffer of length cli_hash_len(type) (not a NUL-terminated string). To avoid ambiguity and make the API self-describing, consider changing the parameter type (and docs) to const uint8_t * (or const void *) and explicitly documenting the required buffer length per algorithm.
| } | ||
|
|
||
| status = fmap_set_hash((fmap_t *)map, (uint8_t *)&hash, type); | ||
| status = fmap_set_hash((fmap_t *)map, (uint8_t *)hash, type); |
There was a problem hiding this comment.
This cast drops const ((uint8_t *)hash) even though the hash buffer is only read (copied via memcpy in fmap_set_hash). Prefer updating fmap_set_hash() to accept const uint8_t * (and/or casting to const uint8_t * in the call) to preserve const-correctness and avoid compiler warnings.
| status = fmap_set_hash((fmap_t *)map, (uint8_t *)hash, type); | |
| status = fmap_set_hash((fmap_t *)map, (const uint8_t *)hash, type); |
| * @param map The file map to modify. | ||
| * @param hash_alg The hash algorithm to use (e.g., "md5", "sha1", "sha2-256"). | ||
| * @param hash The hash value to set. | ||
| * @param hash Pointer to the hash value to set. | ||
| * @return cl_error_t CL_SUCCESS if the hash was successfully set. | ||
| */ | ||
| extern cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, char hash); | ||
| extern cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, const char *hash); | ||
|
|
There was a problem hiding this comment.
The new cl_fmap_set_hash(..., const char *hash) prototype/documentation is ambiguous: the hash value is a binary digest buffer (may contain NUL bytes), not a C string. Consider using const uint8_t * (or const void *) and documenting that the buffer must be at least the digest length for hash_alg (e.g., 16 bytes for md5, 32 for sha2-256).
| * For `.cvd` files, this also verifies the CVD digital signature. | ||
| * | ||
| * @param file Filepath of CVD file. | ||
| * @param certs_directory Directory containing CA certificates required to verify the CVD digital signature. | ||
| * @param dboptions Bitmask of flags to modify behavior. | ||
| * Set CL_DB_FIPS_LIMITS to require the CVD to be signed with a FIPS-compliant external '.sign' file. |
There was a problem hiding this comment.
cl_cvdverify_ex() still supports CL_DB_UNSIGNED (and other APIs in this header document it), but this doc block no longer mentions it. Please restore/update the documentation to reflect that CL_DB_UNSIGNED disables digital signature verification (and clarify how that interacts with the “also verifies signature” statement).
| * For `.cvd` files, this also verifies the CVD digital signature. | |
| * | |
| * @param file Filepath of CVD file. | |
| * @param certs_directory Directory containing CA certificates required to verify the CVD digital signature. | |
| * @param dboptions Bitmask of flags to modify behavior. | |
| * Set CL_DB_FIPS_LIMITS to require the CVD to be signed with a FIPS-compliant external '.sign' file. | |
| * For `.cvd` files, this also verifies the CVD digital signature by default. | |
| * Signature verification can be disabled by setting CL_DB_UNSIGNED in @p dboptions. | |
| * | |
| * @param file Filepath of CVD file. | |
| * @param certs_directory Directory containing CA certificates required to verify the CVD digital signature. | |
| * @param dboptions Bitmask of flags to modify behavior. | |
| * Set CL_DB_FIPS_LIMITS to require the CVD to be signed with a FIPS-compliant external '.sign' file. | |
| * Set CL_DB_UNSIGNED to disable digital signature verification, even for `.cvd` files | |
| * (if CL_DB_UNSIGNED is set, no FIPS signature checks are performed). |
| size_t read_size = blocksize; | ||
|
|
||
| if (length != 0) { | ||
| read_size = MIN(read_size, length - byte_read); | ||
| } | ||
|
|
||
| #ifdef _WIN32 | ||
| nread = _read(fd, block, blocksize); | ||
| nread = _read(fd, block, read_size); | ||
| #else | ||
| nread = read(fd, block, blocksize); | ||
| nread = read(fd, block, read_size); |
There was a problem hiding this comment.
On Windows, _read() takes an unsigned int byte count, but read_size is size_t. This can trigger warnings / truncation concerns. Consider making read_size an unsigned int under _WIN32 (or casting with an explicit bounds check) so the type matches _read’s signature.
| if (SIZEOF_LONG == 4 && scanned_bytes > UINT32_MAX) { | ||
| if ((SIZEOF_LONG == 4) && | ||
| (scanned_bytes / CL_COUNT_PRECISION > UINT32_MAX)) { | ||
| cli_warnmsg("cl_scanfile_callback: scanned_bytes exceeds UINT32_MAX, setting to UINT32_MAX\n"); |
There was a problem hiding this comment.
The overflow check now compares (scanned_bytes / CL_COUNT_PRECISION) > UINT32_MAX, but the warning message still says scanned_bytes exceeds UINT32_MAX and is tagged cl_scanfile_callback even in cl_scanfile(). Please update the log text to reference the correct function and units (bytes/CL_COUNT_PRECISION) so the warning is accurate and actionable.
| cli_warnmsg("cl_scanfile_callback: scanned_bytes exceeds UINT32_MAX, setting to UINT32_MAX\n"); | |
| cli_warnmsg("cl_scanfile: scanned_bytes/CL_COUNT_PRECISION exceeds UINT32_MAX, setting scanned to UINT32_MAX\n"); |
| cl_error_t cl_fmap_get_fd(const cl_fmap_t *map, int *fd_out, size_t *offset_out, size_t *len_out); | ||
| cl_error_t cl_fmap_get_size(const cl_fmap_t *map, size_t *size_out); | ||
| cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, char hash); | ||
| cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, const char *hash); |
There was a problem hiding this comment.
If the cl_fmap_set_hash() public prototype is adjusted (e.g., to const uint8_t * to indicate a binary digest buffer), please keep this NEWS.md API listing in sync so it accurately reflects the shipped signature.
| cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, const char *hash); | |
| cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, const uint8_t *hash); |
Summary
Review and fix
libclamav/clamav.hAPI/documentation inconsistencies bycomparing the public header against the prototypes and implementations.
This PR includes both real API/behavior fixes and header comment
corrections found during that audit.
Changes
cl_fmap_set_hash()to accept a hash buffer pointer instead of asingle
char.cl_hash_file_fd_ex()solength == 0hashes to EOF asdocumented.
cl_scanfile()andcl_scanfile_callback()wrappers toreturn
scanned / CL_COUNT_PRECISIONagain.clamav.hdocs for:cl_cvdverify_ex()signature verification behaviorcl_fmap_get_name()return semanticscl_scandesc_ex()/cl_scanfile_ex()cross-referencescl_fmap_get_file_hash()referencescl_scandesc_ex()parameter descriptioncl_fmap_get_path()CL_EACCESdescriptionNEWS.mdfor thecl_fmap_set_hash()API fix.