Hardening: input validation and bounds tightening across 26 subsystems#31129
Conversation
|
Updated 6:22 PM PT - May 20th, 2026
❌ @Jarred-Sumner, your commit 941e4b4 has 1 failures in
🧪 To try this PR locally: bunx bun-pr 31129That installs a local version of the PR into your bun-31129 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR applies security hardening and protocol fixes: adds MySQL allowPublicKeyRetrieval control, caps parser depths and buffer sizes, tightens HTTP/WS/H2 validation, improves path and archive safety, fixes crypto/buffer indexing, and wires option propagation and tests. ChangesMySQL allowPublicKeyRetrieval Feature
Path Sanitization and Input Validation
Protocol Depth and Size Limits
WebSocket and HTTP Protocol Validation
HTTP/2 Frame Parser Refactoring
Cryptographic and Buffer Safety
Blob Multipart and SQL Helpers
Dev Server and Node API Security
Supporting Fixes and Type Safety
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/sql_jsc/postgres/PostgresRequest.rs (1)
468-482:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStart plaintext startup after an
SSLRequestrefusal.When this branch accepts
b'N'for a non-failing mode likeSslMode::Prefer, it only recordsSslNotAvailableandcontinues. At that point the server is waiting for the client startup packet, butstart()is only reached fromon_open()or aftersetup_tls(), so the connection never progresses and eventually times out instead of falling back to plain TCP. Kick off the normal startup path here and return after doing so.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/sql_jsc/postgres/PostgresRequest.rs` around lines 468 - 482, In the b'N' branch where you set connection.tls_status to TlsStatus::SslNotAvailable, instead of just continuing, initiate the normal plaintext startup path (call the connection/start() entry used by on_open()/setup_tls()) when the SSL mode allows fallback (e.g., SslMode::Prefer) and then return Ok(()) so the connection proceeds; specifically, replace the trailing continue with code that triggers the startup (invoke start() or the same method used to begin plain TCP startup) and return, leaving the existing failure handling for Require/Verify modes unchanged.src/js/internal/sql/mysql.ts (1)
255-289:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClassify
PublicKeyRetrievalNotAllowedas non-retryable auth failure.This new path introduces
ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED, butretry()still treats unknown codes as retryable. That can cause repeated failed reconnect attempts instead of failing fast.Proposed fix
switch (this.storedError?.code) { case "ERR_MYSQL_PASSWORD_REQUIRED": case "ERR_MYSQL_MISSING_AUTH_DATA": case "ERR_MYSQL_FAILED_TO_ENCRYPT_PASSWORD": case "ERR_MYSQL_INVALID_PUBLIC_KEY": + case "ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED": case "ERR_MYSQL_UNSUPPORTED_PROTOCOL_VERSION": case "ERR_MYSQL_UNSUPPORTED_AUTH_PLUGIN": case "ERR_MYSQL_AUTHENTICATION_FAILED": // we can't retry these are authentication errors return false;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/js/internal/sql/mysql.ts` around lines 255 - 289, The new ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED error (exposed as PublicKeyRetrievalNotAllowed) must be treated as a non-retryable authentication failure by the connection retry logic; update the retry()/isTransientError() handling to explicitly mark ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED (and the PublicKeyRetrievalNotAllowed error class/identifier) as non-retryable so the reconnect loop fails fast instead of repeatedly retrying on this auth error.src/jsc/bindings/webcore/SerializedScriptValue.cpp (1)
3985-3991:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd length bounds check and explicit cast before calling
BIO_new_mem_buf.The
lengthparameter isuint64_tbutBIO_new_mem_bufexpects anint(orossl_ssize_tdepending on the cryptography library). Passing values exceedingINT_MAXwill silently truncate, causing the PEM buffer to be misparsed. Clamp the length and explicitly cast to prevent overflow.Suggested fix
bool read(BIO** bio, uint64_t length) { + if (length > std::numeric_limits<int>::max()) + return false; if (static_cast<uint64_t>(m_end - m_ptr) < length) return false; - *bio = BIO_new_mem_buf(m_ptr, length); + *bio = BIO_new_mem_buf(m_ptr, static_cast<int>(length)); if (!*bio) return false; m_ptr += length; return true; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/jsc/bindings/webcore/SerializedScriptValue.cpp` around lines 3985 - 3991, The code calls BIO_new_mem_buf(m_ptr, length) where length is uint64_t; add an explicit bounds check against INT_MAX and the available buffer size (using m_end and m_ptr) before calling BIO_new_mem_buf, clamp length to int if necessary, and pass a safe explicit cast (e.g., static_cast<int>(clampedLength)) to BIO_new_mem_buf; update the existing early-return condition to fail if length > static_cast<uint64_t>(INT_MAX) or if static_cast<uint64_t>(m_end - m_ptr) < length, then advance m_ptr by the original length on success and ensure *bio is created with the int-casted/clamped length.src/runtime/node/net/BlockList.rs (1)
429-447:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winTake the ref while the registry lock is still held.
Line 429 validates the pointer, but the guard is dropped before Line 446 bumps the refcount. A concurrent
BlockList__onStructuredCloneDestroycan remove the last entry and freethisin that gap, turning the laterref_()into a use-after-free. Keep lookup and ref acquisition in the same critical section.Suggested fix
- if !SERIALIZED_REFS.lock().contains(&int) { + let refs = SERIALIZED_REFS.lock(); + if !refs.contains(&int) { return Err(global.throw(format_args!( "BlockList.onStructuredCloneDeserialize failed" ))); } let this: *mut Self = int as *mut Self; @@ - unsafe { - (*this).ref_(); - Ok(Self::to_js_ptr(this, global)) - } + unsafe { (*this).ref_(); } + drop(refs); + unsafe { Ok(Self::to_js_ptr(this, global)) }Also applies to: 458-468
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/node/net/BlockList.rs` around lines 429 - 447, The pointer validation currently checks SERIALIZED_REFS.lock().contains(&int) but releases the lock before calling (*this).ref_(), which allows a concurrent BlockList__onStructuredCloneDestroy to free the object; fix by acquiring the SERIALIZED_REFS lock and performing the contains check and the ref bump (call to (*this).ref_() or equivalent) while the lock is still held, then drop the lock and return Self::to_js_ptr(this, global); apply the same change to the analogous block later (lines 458-468) so lookup and ref acquisition occur in the same critical section and avoid a use-after-free.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/install/repository.rs`:
- Around line 314-321: is_safe_resolved_tag currently accepts the
single-component names "." and ".." which can bypass path-checking; update the
function is_safe_resolved_tag to explicitly reject those two values (e.g.,
return false if resolved == b"." or resolved == b"..") in addition to the
existing checks so they cannot pass the checkout guard later.
In `@src/js/bun/sql.ts`:
- Line 747: The thrown error in onTransactionConnected uses the wrong variable
name in the message: update the $ERR_INVALID_ARG_VALUE call so it references
savepoint_callback (the actual variable being validated) instead of callback;
locate the validation in onTransactionConnected where savepoint_callback is
checked and change the argument name in the error constructor to
"savepoint_callback" so the error message reflects the correct variable being
validated.
In `@src/js/internal/sql/shared.ts`:
- Line 868: The property assignment uses truthy coercion
(allowPublicKeyRetrieval: !!options.allowPublicKeyRetrieval) which accepts
strings/numbers; change it to only accept an actual boolean by setting
allowPublicKeyRetrieval to true only when options.allowPublicKeyRetrieval is the
boolean true and otherwise set it to false (or use a typeof check to accept only
boolean values); update the assignment in shared.ts where
allowPublicKeyRetrieval is constructed so non-boolean inputs do not enable the
option.
In `@src/jsc/bindings/JSBuffer.cpp`:
- Around line 2431-2437: After recomputing currentByteLength in JSBuffer::...
(use castedThis->byteLength(), currentByteLength, currentRemaining, offset,
length), add an early check that if offset >= currentByteLength you
short-circuit and return (do not proceed to call parseEncoding() /
writeToBuffer()), because a post-parse resize can make offset out-of-range even
for a zero-length write; this prevents building vector() + offset with an
invalid pointer.
In `@src/libarchive/lib.rs`:
- Around line 1349-1356: The code currently applies case-insensitive
symlink-prefix matching for all Unix by calling
strings::eql_case_insensitive_ascii_check_length on prefix vs created_symlinks,
which causes incorrect skipping on case-sensitive filesystems; change the logic
in the block that references prefix and created_symlinks so the case-insensitive
comparison is only used when running on a case-insensitive filesystem (e.g.,
guard with a platform/filesystem check such as cfg!(target_os = "macos") or an
existing is_case_insensitive_fs() helper) and otherwise perform an exact
byte-wise match (use direct equality) instead of
strings::eql_case_insensitive_ascii_check_length.
In `@src/runtime/api/bun/h2_frame_parser.rs`:
- Around line 4064-4080: The code double-finalizes END_STREAM on the last
CONTINUATION: when stream.end_after_headers was already handled by
handle_headers_frame, this block should only clear
stream.is_waiting_more_headers and not re-run the END_STREAM transition. Modify
the block that checks stream.is_waiting_more_headers and
stream.end_after_headers so that if stream.state ==
StreamState::HALF_CLOSED_REMOTE you simply clear stream.is_waiting_more_headers
(and call identifier.ensure_still_alive()), but do NOT set stream.state =
StreamState::CLOSED, do NOT call stream.free_resources::<false>(self), and do
NOT dispatch JSH2FrameParser::Gc::onStreamEnd; only perform the state
transition/free/dispatch when the stream hasn't already been moved to
HALF_CLOSED_REMOTE by handle_headers_frame.
- Around line 4041-4054: The code checks inbound CONTINUATION frame size against
remote_settings (via the `settings` variable) but should use the local limit;
change the comparison to use `self.local_settings.get().max_frame_size` (or set
`settings` from `local_settings.get()` instead of
`remote_settings.get().unwrap_or_else(|| self.local_settings.get())`) so the `if
frame.length > ...` uses the local max; keep the existing `send_go_away` call
(`send_go_away`, `ErrorCode::FRAME_SIZE_ERROR`, `last_stream_id.get()`)
unchanged.
- Around line 4488-4505: lookup_inbound_stream is currently allocating a new
Stream for unknown inbound HEADERS by calling handle_received_stream_id even
when this parser is a client; change the logic so unknown HEADERS only allocate
when is_server is true. Specifically, in lookup_inbound_stream (and before
calling handle_received_stream_id) ensure you check self.is_server.get() and
return None for client parsers instead of creating a new stream (leave server
behavior unchanged); this prevents clients from accepting peer-initiated HEADERS
that should be protocol errors given no PUSH_PROMISE handling in dispatch_frame.
In `@src/runtime/bake/DevServer.rs`:
- Around line 1421-1477: The HMR websocket upgrade path isn't checking
is_allowed_dev_host, so modify the websocket upgrade handler (where
on_websocket_upgrade / the app.ws(...) HMR registration handles incoming upgrade
requests) to call is_allowed_dev_host(&dev, req) and reject disallowed hosts by
invoking host_forbidden(resp) (or sending a 403 and closing the upgrade) before
proceeding with the existing HMR handshake; ensure the same DevServer instance
used by dev_route_tramp is passed/accessible so the hostname comparison uses the
server config.
In `@src/runtime/node/net/BlockList.rs`:
- Around line 38-43: The current SERIALIZED_REFS guard storing raw pointer
addresses is vulnerable to allocator-reuse replay; replace the pointer-based
registry with a per-serialization capability (unique token): add a unique
id/nonce field to BlockList instances (or generate a fresh random/monotonic
token at serialize time), record that token in the serialized payload instead of
the raw usize, and change SERIALIZED_REFS to map tokens → weak references (or
ownership markers); update the lookup/contains logic (uses of
SERIALIZED_REFS.contains(&int) and the code in
BlockList__onStructuredCloneDestroy) to validate the token against the registry
during deserialize and to remove the token on destroy. Apply the same
token-based change to the other referenced spots (lines ~390-391, ~400,
~429-433, ~458-463) so all serialization/deserialization paths use the
capability token rather than a raw pointer address.
In `@src/runtime/node/path.rs`:
- Line 3298: Change the assert! guarding tmp_buf length to an explicit bounds
check that returns an error instead of panicking: validate that buf_size <=
tmp_buf.len() (using first_part_len and slice_len / MAX_PATH_SIZE_UPPER inputs)
before any unsafe pointer writes in the UNC device string construction code that
manipulates tmp_buf and buf_size; if the check fails, return an appropriate Err
(matching the function's Result error type, e.g., io::Error or the crate's
PathError) with a clear message about the path being too long. Apply the same
replacement for the second assertion site that checks tmp_buf length (the other
place using slice_len) so both potential panics become graceful errors.
In `@src/runtime/webcore/Blob.rs`:
- Around line 2667-2673: The early-return path in to_string_with_bytes that
handles a UTF-16LE BOM skips applying Lifetime::Transfer, so the Blob's backing
store isn't detached; before constructing OwnedString/returning (the block that
does OwnedString::new(BunString::clone_utf16(...)) and out.to_js(global)),
invoke the same detachment logic used by other transfer paths (i.e. apply
Lifetime::Transfer to the source Blob/backing store — reuse the existing helper
or code block in this file that sets Lifetime::Transfer) so the buffer is
detached from the original Blob prior to creating and returning the JS string.
In `@src/semver/lib.rs`:
- Around line 530-533: The current comparison uses unwrap_or_default() on both
this_buf.get(...) and that_buf.get(...), which makes two invalid/out-of-bounds
ranges compare equal; change it to first check whether both ranges are valid
(this_buf.get(a_off..a_off + a_len) and that_buf.get(b_off..b_off + b_len)) and
if either returns None immediately return false, otherwise pass the two valid
slices into strings::eql; reference the variables and call site: this_buf,
that_buf, a_off, a_len, b_off, b_len and strings::eql.
In `@test/js/sql/sql-mysql.auth.test.ts`:
- Around line 40-43: Add a negative-path test that constructs SQL without the
allowPublicKeyRetrieval flag to assert the secure default denies public key
retrieval: call new SQL(...) with the same MySQL URL but omit
allowPublicKeyRetrieval inside the test (mirroring the positive test using the
same container.host/container.port), attempt a query or connection inside an
await/throws expectation, and assert it rejects with error code
"ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED"; refer to the SQL constructor used
in the positive test and the test helper/container variables so the new test
mirrors setup but expects the specified error.
---
Outside diff comments:
In `@src/js/internal/sql/mysql.ts`:
- Around line 255-289: The new ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED error
(exposed as PublicKeyRetrievalNotAllowed) must be treated as a non-retryable
authentication failure by the connection retry logic; update the
retry()/isTransientError() handling to explicitly mark
ERR_MYSQL_PUBLIC_KEY_RETRIEVAL_NOT_ALLOWED (and the PublicKeyRetrievalNotAllowed
error class/identifier) as non-retryable so the reconnect loop fails fast
instead of repeatedly retrying on this auth error.
In `@src/jsc/bindings/webcore/SerializedScriptValue.cpp`:
- Around line 3985-3991: The code calls BIO_new_mem_buf(m_ptr, length) where
length is uint64_t; add an explicit bounds check against INT_MAX and the
available buffer size (using m_end and m_ptr) before calling BIO_new_mem_buf,
clamp length to int if necessary, and pass a safe explicit cast (e.g.,
static_cast<int>(clampedLength)) to BIO_new_mem_buf; update the existing
early-return condition to fail if length > static_cast<uint64_t>(INT_MAX) or if
static_cast<uint64_t>(m_end - m_ptr) < length, then advance m_ptr by the
original length on success and ensure *bio is created with the
int-casted/clamped length.
In `@src/runtime/node/net/BlockList.rs`:
- Around line 429-447: The pointer validation currently checks
SERIALIZED_REFS.lock().contains(&int) but releases the lock before calling
(*this).ref_(), which allows a concurrent BlockList__onStructuredCloneDestroy to
free the object; fix by acquiring the SERIALIZED_REFS lock and performing the
contains check and the ref bump (call to (*this).ref_() or equivalent) while the
lock is still held, then drop the lock and return Self::to_js_ptr(this, global);
apply the same change to the analogous block later (lines 458-468) so lookup and
ref acquisition occur in the same critical section and avoid a use-after-free.
In `@src/sql_jsc/postgres/PostgresRequest.rs`:
- Around line 468-482: In the b'N' branch where you set connection.tls_status to
TlsStatus::SslNotAvailable, instead of just continuing, initiate the normal
plaintext startup path (call the connection/start() entry used by
on_open()/setup_tls()) when the SSL mode allows fallback (e.g., SslMode::Prefer)
and then return Ok(()) so the connection proceeds; specifically, replace the
trailing continue with code that triggers the startup (invoke start() or the
same method used to begin plain TCP startup) and return, leaving the existing
failure handling for Require/Verify modes unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6a7b5e83-7476-4c09-bff3-359611852442
📒 Files selected for processing (49)
packages/bun-types/sql.d.tspackages/bun-uws/src/HttpParser.hpackages/bun-uws/src/HttpResponse.hsrc/bundler/options.rssrc/css/css_parser.rssrc/glob/matcher.rssrc/http/lib.rssrc/http_jsc/websocket_client.rssrc/http_jsc/websocket_client/WebSocketUpgradeClient.rssrc/http_types/URLPath.rssrc/install/bin.rssrc/install/lockfile/bun.lock.rssrc/install/repository.rssrc/js/bun/sql.tssrc/js/internal/debugger.tssrc/js/internal/sql/mysql.tssrc/js/internal/sql/postgres.tssrc/js/internal/sql/shared.tssrc/js/node/child_process.tssrc/jsc/bindings/JSBuffer.cppsrc/jsc/bindings/NodeHTTP.cppsrc/jsc/bindings/node/http/JSConnectionsList.cppsrc/jsc/bindings/sqlite/JSSQLStatement.cppsrc/jsc/bindings/webcore/SerializedScriptValue.cppsrc/jsc/bindings/webcrypto/CryptoKeyOKPOpenSSL.cppsrc/libarchive/lib.rssrc/patch/lib.rssrc/resolver/package_json.rssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/api/bun/h2_frame_parser.zigsrc/runtime/bake/DevServer.rssrc/runtime/node/net/BlockList.rssrc/runtime/node/path.rssrc/runtime/server/RequestContext.rssrc/runtime/server/server_body.rssrc/runtime/webcore/Blob.rssrc/runtime/webcore/s3/multipart.rssrc/semver/lib.rssrc/sql/mysql/protocol/AnyMySQLError.rssrc/sql_jsc/mysql/JSMySQLConnection.rssrc/sql_jsc/mysql/MySQLConnection.rssrc/sql_jsc/mysql/protocol/any_mysql_error_jsc.rssrc/sql_jsc/postgres/PostgresRequest.rssrc/sql_jsc/postgres/PostgresSQLConnection.rssrc/sys/tmp.rssrc/valkey/valkey_protocol.rstest/cli/inspect/inspect.test.tstest/js/sql/sql-mysql.auth.test.tstest/js/sql/sql-mysql.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/runtime/node/path.rs (1)
3298-3323:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd a bounds guard before the separator write at Line 3314.
The new checks protect the two copy operations, but there is still a panic edge: if Line 3298 leaves
buf_size == tmp_buf.len(), the write at Line 3314 (tmp_buf[buf_offset]) is out of bounds before Line 3318 is reached.🛠️ Minimal fix
buf_offset = buf_size; - buf_size += 1; + if buf_size >= tmp_buf.len() { + return Err(bun_sys::Error::from_code( + bun_sys::E::ENAMETOOLONG, + bun_sys::Tag::TODO, + )); + } + buf_size += 1; tmp_buf[buf_offset] = T::from_u8(CHAR_BACKWARD_SLASH);As per coding guidelines: “Don't
.unwrap()a fallible path that user input or the OS can hit at runtime — return the error.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/node/path.rs` around lines 3298 - 3323, The write to tmp_buf at tmp_buf[buf_offset] (around the separator insertion) can panic when buf_size == tmp_buf.len(); add a bounds guard before writing the separator: check whether buf_size < tmp_buf.len() (using the same error path as other checks) and return bun_sys::Error::from_code(bun_sys::E::ENAMETOOLONG, bun_sys::Tag::TODO) if not, then update buf_offset/buf_size and write CHAR_BACKWARD_SLASH; ensure the check uses the same tmp_buf, buf_size, buf_offset variables and error construction as the surrounding copy-bound checks to keep behavior consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/runtime/bake/DevServer.rs`:
- Around line 1604-1606: Remove the dead inherent method named
on_web_socket_upgrade from the DevServer impl: delete the entire inherent method
body so only the trait-based implementation WebSocketUpgradeServer<SSL> (which
already applies is_allowed_dev_host) remains; verify there are no call sites
referencing the inherent on_web_socket_upgrade and run tests/build to ensure
compilation succeeds after removing the duplicate upgrade logic.
---
Outside diff comments:
In `@src/runtime/node/path.rs`:
- Around line 3298-3323: The write to tmp_buf at tmp_buf[buf_offset] (around the
separator insertion) can panic when buf_size == tmp_buf.len(); add a bounds
guard before writing the separator: check whether buf_size < tmp_buf.len()
(using the same error path as other checks) and return
bun_sys::Error::from_code(bun_sys::E::ENAMETOOLONG, bun_sys::Tag::TODO) if not,
then update buf_offset/buf_size and write CHAR_BACKWARD_SLASH; ensure the check
uses the same tmp_buf, buf_size, buf_offset variables and error construction as
the surrounding copy-bound checks to keep behavior consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b192d293-15d7-4e9a-b9f1-2601b4c91614
📒 Files selected for processing (8)
src/install/repository.rssrc/js/bun/sql.tssrc/js/internal/sql/shared.tssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/bake/DevServer.rssrc/runtime/node/path.rssrc/semver/lib.rstest/js/sql/sql-mysql.auth.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/install/lockfile/bun.lock.rs`:
- Around line 2490-2501: The code currently assigns pkg.meta.integrity =
Integrity::parse(integrity_str) even when the parsed integrity tag is
unsupported, then only emits a warning; change this so unsupported/malformed
integrity is not kept: either only assign pkg.meta.integrity when integrity_str
is non-empty and Integrity::parse(...).tag.is_supported(), or after detecting an
unsupported tag (the branch that calls log.add_warning with integrity_expr.loc)
reset pkg.meta.integrity = Integrity::default() so the malformed value is not
serialized back out; update the logic around Integrity::parse, integrity_str,
pkg.meta.integrity, and the warning branch to implement this.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b42ecf63-4848-4448-b38f-4a485f528671
📒 Files selected for processing (3)
src/http/lib.rssrc/install/lockfile/bun.lock.rstest/regression/issue/malformed-integrity-base64.test.ts
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🔴
src/runtime/api/bun/h2_frame_parser.rs:4498-4506— On the client side (is_server == false), this still falls through tohandle_received_stream_id()for a HEADERS frame on an unknown stream id, so a server can allocate arbitraryStreamobjects and fireonStreamStart/onStreamHeaderson the client. Client request streams are always created locally before the response arrives anddispatch_framedoesn't implementPUSH_PROMISE, so an unknown HEADERS id on the client is always a protocol error — add|| !self.is_server.get()to the early return at line 4498.Extended reasoning...
What the gap is
lookup_inbound_stream()is the new helper this PR adds to gate per-frame stream allocation. Its doc comment says "Only a HEADERS frame may allocate new stream state (RFC 9113 §5.1)", and at line 4498 it correctly returnsNonefor any non-HEADERS frame on an unknown id. But the only role-specific check is the parity guard at line 4502, which fires only whenself.is_server.get()is true (rejecting even-numbered ids that a client cannot legally open). Whenis_server == false, an inbound HEADERS frame on an unknown stream id falls straight through tohandle_received_stream_id(stream_identifier)at line 4505.Why that's wrong on the client
On the client side, every request stream is created locally — the JS
request()path callshandle_received_stream_iddirectly and inserts the stream intoself.streamsbefore the request is sent — so by the time the server's response HEADERS arrives, the lookup at line 4495 hits and returns the existing entry. Per RFC 9113 §5.1.1, the only way a server may initiate a stream is via PUSH_PROMISE on an even-numbered id, never via a bare HEADERS frame, anddispatch_framein this file does not implementPUSH_PROMISEat all. Therefore a client receiving HEADERS on a stream id that is not already inself.streamsis always a protocol error from a malicious or buggy server.What
handle_received_stream_iddoes with ithandle_received_stream_id(lines 4434–4485) has nois_servercheck of its own. For an unknown id it unconditionally:- bumps
last_stream_id, - heap-allocates a new
Streamand inserts it intoself.streams, and - dispatches the JS
onStreamStartcallback.
Control then returns to
dispatch_frame→handle_headers_frame, which will HPACK-decode the header block and dispatchonStreamHeadersfor a stream the client never asked for.Step-by-step proof
- Client creates an
H2FrameParserwithis_server = falseand connects to a server. - Client sends a request on stream 1;
self.streamsnow contains{1}. - The server sends a HEADERS frame with
stream_identifier = 7(or any odd id the client never opened). read_bytesparses the frame header and callslookup_inbound_stream(7, HTTP_FRAME_HEADERS).- Line 4492:
7 != 0→ continue. Line 4495:7 ∉ self.streams→ no early return. Line 4498: frame type is HEADERS → continue. Line 4502:is_server.get()is false, so the parity check is skipped entirely. Line 4505:handle_received_stream_id(7)runs. - A new
Streamis allocated, inserted intoself.streams, andonStreamStartfires on the client for a server-initiated stream.handle_headers_framethen decodes the block and callsonStreamHeaders.
A malicious server can repeat this with arbitrary ids to allocate unbounded
Streamobjects on the client (each is a heap-boxed struct plus a hash-map entry) and to surface unsolicitedonStreamStartevents into JS, instead of the connection being torn down with a PROTOCOL_ERROR.Not a regression, but an incomplete fix
Before this PR,
read_bytescalledhandle_received_stream_iddirectly for every inbound frame type (and the.zigsibling still does), so the new function is a strict improvement — DATA, RST_STREAM, WINDOW_UPDATE, etc. on unknown ids no longer allocate. But the PR's stated purpose is exactly this hardening, the function's own RFC-citing doc comment promises it, and the client-side HEADERS path is the obvious remaining gap. Since the parity guard already encodes "who is allowed to open a stream from the wire" for the server role, the symmetric client-role check belongs in the same place.Fix
if frame_type != FrameType::HTTP_FRAME_HEADERS as u8 || !self.is_server.get() { return None; }
(after which the now-redundant
self.is_server.get() &&on the parity check at line 4502 can be dropped). ReturningNoneroutes the frame intohandle_headers_framewithstream_ = None, which already sends GOAWAYPROTOCOL_ERROR("Headers frame on connection stream") — the right outcome for a server-initiated HEADERS.🔬 also observed by coderabbitai
- bumps
-
🔴
src/runtime/api/bun/h2_frame_parser.rs:4064-4081— When a HEADERS frame carriesEND_STREAMbut notEND_HEADERS,handle_headers_framealready moves the stream toHALF_CLOSED_REMOTEand dispatchesonStreamEnd; the new CONTINUATION block then seesend_after_headersstill true, findsstate == HALF_CLOSED_REMOTE, transitions it toCLOSED+free_resources(), and dispatchesonStreamEnda second time. That double-fires the callback and tears the stream down before the local side can send its response. The state check at L4069 should mirrorhandle_headers_frame'sHALF_CLOSED_LOCAL → CLOSED(L4204), notHALF_CLOSED_REMOTE → CLOSED— or the whole block should be skipped since the HEADERS handler already performed the transition.Extended reasoning...
What the bug is
The new
END_STREAMfinalization block inhandle_continuation_frame(h2_frame_parser.rs:4064-4081) runs in addition to the existing finalization inhandle_headers_frame(4196-4215), and uses an inverted state check. The result is that for any HEADERS frame split across CONTINUATION frames withEND_STREAMset,onStreamEndfires twice and the stream is freed while the local side still owes a response.How it manifests — step-by-step trace
Consider a server receiving a request whose headers span two frames:
Frame 1: HEADERS, flags = END_STREAM (no END_HEADERS), stream 1
handle_headers_frameruns:- L4186:
stream.end_after_headers = true(END_STREAM is set) - L4194-4195:
stream.is_waiting_more_headers = true(END_HEADERS is not set) - L4196:
if stream.end_after_headers→ enters the block - L4200-4201:
is_waiting_more_headersis true →stream.state = HALF_CLOSED_REMOTE - L4211: dispatches
onStreamEnd(first fire)
Frame 2: CONTINUATION, flags = END_HEADERS, stream 1
handle_continuation_frameruns:- L4064-4065:
stream.is_waiting_more_headers = false(END_HEADERS is set) - L4066:
!is_waiting_more_headers && end_after_headers→ both true (the PR removed the line that overwroteend_after_headersfrom the CONTINUATION frame's flags, so it staystruefrom step 1 — which is correct per spec, since CONTINUATION has no END_STREAM flag) - L4069:
stream.state == HALF_CLOSED_REMOTE→ true (just set in step 1) - L4070-4072:
stream.state = CLOSED;stream.free_resources::<false>(self) - L4076: dispatches
onStreamEnd(second fire)
Why this is a regression introduced by this PR
Before this PR, the CONTINUATION handler did
stream.end_after_headers = frame.flags & END_STREAM != 0before the finalization block. Per RFC 9113 §6.10, CONTINUATION frames define only theEND_HEADERSflag, so a well-formed peer never sets bit 0 on a CONTINUATION frame, andend_after_headerswas always overwritten tofalse. The finalization block was therefore dead code in practice. This PR (correctly) removed that overwrite soend_after_headersis preserved from the HEADERS frame — but that makes the previously-dead block live, and exposes its inverted state check.Why nothing else prevents it
handle_headers_frameis the only other place that touches this state, and it has already run by the time the CONTINUATION arrives. There is no "already-dispatched" guard; the CONTINUATION block re-evaluates from scratch. And because the HEADERS handler put the stream into exactlyHALF_CLOSED_REMOTE, the CONTINUATION handler'sHALF_CLOSED_REMOTE → CLOSEDarm is guaranteed to match.Impact
- Double
onStreamEnd: JS-side stream listeners see two end events for one stream. - Premature close:
HALF_CLOSED_REMOTEmeans "the peer is done sending; we still may send". Transitioning it toCLOSEDand callingfree_resources()tears the stream down before the local endpoint (e.g. an HTTP/2 server) has sent its response headers/body. Compare withhandle_headers_frameL4204, which only goesCLOSEDfromHALF_CLOSED_LOCAL(i.e. when we had already finished sending and the peer's END_STREAM completes the pair). - This affects the common case of any request whose header block exceeds one frame and has no body (e.g. a large-cookie GET request), so it is not an edge case.
Fix
Either:
- Drop the finalization block at L4066-4081 entirely —
handle_headers_framealready performed the END_STREAM transition and dispatchedonStreamEndwhen it setis_waiting_more_headers = true; the CONTINUATION handler only needs to clearis_waiting_more_headers. Or, - Move the END_STREAM transition out of
handle_headers_frame's split-header path so it happens exactly once whenEND_HEADERSfinally arrives, and fix the CONTINUATION state check to match L4204 (HALF_CLOSED_LOCAL → CLOSED, else→ HALF_CLOSED_REMOTE).
Option 1 is the minimal fix; option 2 is the spec-cleaner one (RFC 9113 §8.1: a header block is processed as a unit when END_HEADERS is received).
🔬 also observed by coderabbitai
- L4186:
…ogical is_server check
08338d7 to
941e4b4
Compare
There was a problem hiding this comment.
All of my earlier inline findings have been addressed and this revision passes clean, but given the breadth (50 files, 26 subsystems) and the security-sensitive surfaces touched — auth handshakes, webcrypto DER parsing, structured-clone deserialization, HTTP/2 framing, archive extraction — this still warrants a human sign-off.
Extended reasoning...
Overview
This is a 37-commit hardening sweep across 50 files: MySQL allowPublicKeyRetrieval gating, Postgres SASL nonce/SSL-mode checks, HTTP/2 per-stream header accounting and inbound-stream allocation rules, WebSocket key/version validation, uWS empty-header-name rejection, DNS-rebinding guard for the bake DevServer (HTTP + HMR upgrade), archive symlink-traversal normalization, patch/install path-component validation, package.json exports percent-decode segment check, CSS nesting / glob brace budgets, Valkey RESP blob caps, Buffer.write post-reentrancy revalidation, SerializedScriptValue RSA prime indexing + BIO length check, OKP DER bounds, BlockList structured-clone pointer registry + nonce, multipart name escaping, S3 UploadId slice fix, child_process .bat/.cmd rejection, and assorted bounds tightening.
Security risks
Nearly every change here is security-relevant by design. The highest-leverage areas are: (1) the BlockList structured-clone path, which gates dereference of an attacker-supplied pointer behind a registry + per-instance nonce; (2) MySQL public-key retrieval now default-deny over plain TCP (behavioral change with new error code); (3) the H2 inbound-stream lookup refactor, which changes which frames can allocate stream state; (4) DevServer Host-header allowlisting, which now also covers the HMR WebSocket upgrade; (5) archive symlink normalization switching from a fake-root join to relative ..-preserving normalization with case-insensitive prefix match. Each of these went through at least one review round and the fixes look right, but they are exactly the kind of changes that benefit from a second pair of human eyes.
Level of scrutiny
High. This PR directly modifies auth flows, crypto key parsing, protocol state machines, and path-traversal defenses in production code paths. It also introduces user-visible behavioral changes (MySQL default-deny, Windows .bat/.cmd EINVAL, Postgres array-type allowlist) that carry compatibility risk. None of this fits the "simple/mechanical/obvious" bar for bot approval.
Other factors
The PR has been through extensive iterative review: CodeRabbit raised ~14 findings and I raised 8 inline issues across four passes; the author addressed all of them (CONTINUATION frame-size source, spawnSync error-delivery contract, HMR upgrade host check, OKP DER bounds in both importPkcs8 and importSpki, redundant is_server guard, BlockList nonce, postgres array-type allowlist widening, etc.), and every thread is resolved. cargo check and rust:check-all pass per the description, and the CI build is running. The current bug-hunting pass surfaced nothing new. That's a good signal, but the sheer surface area and the fact that several fixes required multiple iterations argue for human review before merge.
General hardening pass tightening input validation, buffer bounds, and protocol handling across the runtime.
Areas touched (37 commits):
cargo check -p bun_binandbun run rust:check-all(all 10 targets) pass.