-
Notifications
You must be signed in to change notification settings - Fork 4.6k
ws client: cap close-reason transcode buffer at 125 to remove unsafe cast #30778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
883d520
c615fa7
02b389a
36a7850
2450260
1790bd0
3e26e65
4ee16c1
bbf49b2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1658,7 +1658,21 @@ | |
| if !this.has_tcp() { | ||
| return; | ||
| } | ||
| let mut close_reason_buf = [0u8; 128]; | ||
| // `send_close_with_body` takes `&mut [u8; 125]`; keep the array at | ||
| // 125 bytes so the borrow covers the full provenance and matches | ||
| // the other caller (line 1006, server-initiated close echo). | ||
| let mut close_reason_buf = [0u8; 125]; | ||
|
robobun marked this conversation as resolved.
|
||
| // RFC 6455 §5.5.1: the close-frame payload is ≤ 125 bytes total and | ||
| // the 7-bit header length field is the payload length — there is no | ||
| // extended-length encoding for control frames. `send_close_with_body` | ||
| // prepends the 2-byte status code separately and computes the header | ||
| // length as `(body_len + 2) & 0x7F`. A reason of 124 or 125 UTF-8 | ||
| // bytes therefore writes a header length of 126 or 127, which are | ||
| // the 16/64-bit extended-length sentinels per §5.2 and malform the | ||
| // frame. Cap the transcode cursor at 123 bytes so the overflow arms | ||
| // below (`WriteZero` / `NoSpaceLeft`) bail via `break 'inner` at the | ||
| // correct boundary. | ||
| const MAX_REASON_BYTES: usize = 123; | ||
|
Check warning on line 1675 in src/http_jsc/websocket_client.rs
|
||
|
Comment on lines
+1665
to
+1675
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 The new comment justifying Extended reasoning...What the issue isThe comment block added at
That description of fn send_close_with_body(&mut self, code: u16, body: Option<&mut [u8; 125]>, body_len: usize) {
// ... reason text is limited to 123 bytes.
let body_len = body_len.min(123); // ← line 1346
...
header.set_len(((body_len + 2) & 0x7F) as u8); // ← line 1367The How the comment went staleThe rationale was correct when written. Per the PR timeline, the original 2026-05-15 review identified the 124/125-byte malformed-frame edge against the then-current Step-by-step proofTake the comment's own example — a reason that transcodes to 124 UTF-8 bytes — and suppose the cursor cap were absent so 124 reaches
So the comment's central claim ("writes a header length of 126 or 127 … and malform the frame") is unreachable on the post-rebase code. Why the cursor cap still has value (the correct rationale)Without ImpactZero runtime impact — this is purely comment accuracy. But the comment is brand-new in this PR (not pre-existing prose), it makes a concrete present-tense claim about a specific function's behavior, and that claim is verifiably false against the file as it stands. A future reader auditing FixEither trim the comment to the load-bearing fact: // RFC 6455 §5.5.1: close-frame payload is 2-byte status + ≤123-byte reason.
// Cap the cursor at 123 so overflow falls through to the no-reason close
// rather than reaching send_close_with_body's .min(123) truncation (which
// could split a UTF-8 sequence and trip terminate(InvalidUtf8)).
const MAX_REASON_BYTES: usize = 123;or drop the "computes the header length as |
||
| // SAFETY: reason is null or a valid *const ZigString from C++ | ||
| if let Some(str) = unsafe { reason.as_ref() } { | ||
| 'inner: { | ||
|
|
@@ -1668,9 +1682,9 @@ | |
| // replicate the encoding switch directly: 8-bit copies bytes, | ||
| // 16-bit transcodes via `to_owned_slice()` (UTF-16 → UTF-8). | ||
| use std::io::Write; | ||
| let mut cursor = std::io::Cursor::new(&mut close_reason_buf[..]); | ||
| let mut cursor = std::io::Cursor::new(&mut close_reason_buf[..MAX_REASON_BYTES]); | ||
| if str.is_16bit() { | ||
| // Allocates; close-reason is bounded ≤125 bytes and this | ||
| // Allocates; close-reason is bounded ≤123 bytes and this | ||
| // path is cold (close handshake). | ||
| let utf8 = str.to_owned_slice(); | ||
| if cursor.write_all(&utf8).is_err() { | ||
|
|
@@ -1697,16 +1711,9 @@ | |
| cursor.set_position((pos + result.written as usize) as u64); | ||
| } | ||
| let wrote_len = cursor.position() as usize; | ||
| // 125-byte close-frame payload budget minus the 2-byte status code. | ||
| if wrote_len > 123 { | ||
| break 'inner; | ||
| } | ||
| // SAFETY: `close_reason_buf` is a 128-byte stack array, so its | ||
| // first 125 bytes form a valid `[u8; 125]` (align 1); `cursor` | ||
| // (the only other borrow) ended at `wrote_len` above, so this | ||
| // `&mut` is unique. | ||
| let buf = unsafe { &mut *close_reason_buf.as_mut_ptr().cast::<[u8; 125]>() }; | ||
| this.send_close_with_body(code, Some(buf), wrote_len); | ||
| // Cursor is over a 123-byte view; overflow takes `break 'inner` above. | ||
| debug_assert!(wrote_len <= MAX_REASON_BYTES); | ||
| this.send_close_with_body(code, Some(&mut close_reason_buf), wrote_len); | ||
| return; | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,5 +1,6 @@ | ||||||||||||||||||
| import { TCPSocketListener } from "bun"; | ||||||||||||||||||
| import { describe, expect, test } from "bun:test"; | ||||||||||||||||||
| import { bunEnv, bunExe } from "harness"; | ||||||||||||||||||
|
|
||||||||||||||||||
| const hostname = "127.0.0.1"; | ||||||||||||||||||
| const port = 0; | ||||||||||||||||||
|
|
@@ -124,4 +125,107 @@ describe("WebSocket", () => { | |||||||||||||||||
| server?.stop(true); | ||||||||||||||||||
| } | ||||||||||||||||||
| }); | ||||||||||||||||||
|
|
||||||||||||||||||
| // Regression: the close() reason was transcoded into a fixed 128-byte | ||||||||||||||||||
| // stack buffer and then pointer-cast to `&mut [u8; 125]` before being | ||||||||||||||||||
| // passed on with `body_len = cursor.position()`. A UTF-16 reason of | ||||||||||||||||||
| // 42 code units of U+0800 passes the C++ 123-char limit (which bounds | ||||||||||||||||||
| // code units, not UTF-8 bytes) but transcodes to 126 UTF-8 bytes, | ||||||||||||||||||
| // overrunning the 125-byte reference, panicking the Rust side (`range | ||||||||||||||||||
| // end index 126 out of range for slice of length 125`) and aborting | ||||||||||||||||||
| // the process across `extern "C"`. | ||||||||||||||||||
| // | ||||||||||||||||||
| // The subprocess wrapper is deliberate: a panic in `WebSocket::close` | ||||||||||||||||||
| // terminates the WHOLE bun process, which would crash the test runner | ||||||||||||||||||
| // itself and leave no JUnit output. Spawning a child isolates the | ||||||||||||||||||
| // expected crash so the parent test can assert on the child's exit | ||||||||||||||||||
| // code. | ||||||||||||||||||
| // The 30s timeout (vs the 5s default) covers the pre-fix failure arm: | ||||||||||||||||||
| // the Rust panic in `WebSocket::close` fires SIGILL, which the debug | ||||||||||||||||||
| // crash handler catches and runs `llvm-symbolizer` on — ~6s in ASAN | ||||||||||||||||||
| // builds. The happy path is ~1s. | ||||||||||||||||||
| test("close() with reason that transcodes beyond 123 UTF-8 bytes does not crash", async () => { | ||||||||||||||||||
| await using proc = Bun.spawn({ | ||||||||||||||||||
| cmd: [bunExe(), "-e", CLOSE_LONG_REASON_FIXTURE], | ||||||||||||||||||
| env: bunEnv, | ||||||||||||||||||
| stdout: "pipe", | ||||||||||||||||||
| stderr: "ignore", | ||||||||||||||||||
| }); | ||||||||||||||||||
| const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]); | ||||||||||||||||||
| // With the fix: the fixture reaches the close listener, prints | ||||||||||||||||||
| // `close:1000`, and exits cleanly. Without the fix: the child aborts | ||||||||||||||||||
| // inside `WebSocket::close` before the listener fires — stdout is | ||||||||||||||||||
| // empty and exitCode is non-zero (SIGILL from the Rust panic). | ||||||||||||||||||
| expect({ stdout: stdout.trim(), exitCode }).toEqual({ | ||||||||||||||||||
| stdout: "close:1000", | ||||||||||||||||||
|
Comment on lines
+148
to
+160
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win Pipe the child This test is guarding a crash path; with ♻️ Suggested change await using proc = Bun.spawn({
cmd: [bunExe(), "-e", CLOSE_LONG_REASON_FIXTURE],
env: bunEnv,
stdout: "pipe",
- stderr: "ignore",
+ stderr: "pipe",
});
- const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);
+ const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+ const cleanStderr = stderr
+ .split("\n")
+ .filter(line => !line.startsWith("WARNING: ASAN interferes"))
+ .join("\n")
+ .trim();
// With the fix: the fixture reaches the close listener, prints
// `close:1000`, and exits cleanly. Without the fix: the child aborts
// inside `WebSocket::close` before the listener fires — stdout is
// empty and exitCode is non-zero (SIGILL from the Rust panic).
- expect({ stdout: stdout.trim(), exitCode }).toEqual({
+ expect({ stdout: stdout.trim(), stderr: cleanStderr, exitCode }).toMatchObject({
stdout: "close:1000",
+ stderr: "",
exitCode: 0,
});Based on learnings, crash-prone spawned-process tests should keep subprocess output in the failure diff, and 🤖 Prompt for AI Agents |
||||||||||||||||||
| exitCode: 0, | ||||||||||||||||||
| }); | ||||||||||||||||||
| }, 30_000); | ||||||||||||||||||
| }); | ||||||||||||||||||
|
|
||||||||||||||||||
| // Raw-socket WebSocket handshake + close(1000, reason-transcoding-to-126-UTF-8-bytes). | ||||||||||||||||||
| // Runs in a child process so a panic in the native close path aborts the | ||||||||||||||||||
| // child, not the test runner. | ||||||||||||||||||
| const CLOSE_LONG_REASON_FIXTURE = /* js */ ` | ||||||||||||||||||
| const MAX_HEADER_SIZE = 16 * 1024; | ||||||||||||||||||
| const server = Bun.listen({ | ||||||||||||||||||
| hostname: "127.0.0.1", | ||||||||||||||||||
| port: 0, | ||||||||||||||||||
| socket: { | ||||||||||||||||||
| data(socket, data) { | ||||||||||||||||||
| if (socket.data?.handshakeComplete) { | ||||||||||||||||||
| if (!socket.data.receivedCloseFrame && data.length > 0 && data[0] === 0x88) { | ||||||||||||||||||
| socket.data.receivedCloseFrame = true; | ||||||||||||||||||
| socket.write(new Uint8Array([0x88, 0x00])); | ||||||||||||||||||
| socket.flush(); | ||||||||||||||||||
| socket.end(); | ||||||||||||||||||
| } | ||||||||||||||||||
| return; | ||||||||||||||||||
| } | ||||||||||||||||||
| socket.data ||= { handshakeBuffer: new Uint8Array(0), handshakeComplete: false, receivedCloseFrame: false }; | ||||||||||||||||||
| const prev = socket.data.handshakeBuffer; | ||||||||||||||||||
| const merged = new Uint8Array(prev.length + data.length); | ||||||||||||||||||
| merged.set(prev); | ||||||||||||||||||
| merged.set(data, prev.length); | ||||||||||||||||||
| socket.data.handshakeBuffer = merged; | ||||||||||||||||||
| if (merged.length > MAX_HEADER_SIZE) { socket.end(); return; } | ||||||||||||||||||
| const text = new TextDecoder("utf-8").decode(merged); | ||||||||||||||||||
| if (text.indexOf("\\r\\n\\r\\n") === -1) return; | ||||||||||||||||||
| const magic = /Sec-WebSocket-Key:\\s*(.*)\\r\\n/i.exec(text); | ||||||||||||||||||
| if (!magic) { socket.end(); return; } | ||||||||||||||||||
| const hasher = new Bun.CryptoHasher("sha1"); | ||||||||||||||||||
| hasher.update(magic[1].trim()); | ||||||||||||||||||
| hasher.update("258EAFA5-E914-47DA-95CA-C5AB0DC85B11"); | ||||||||||||||||||
| const accept = hasher.digest("base64"); | ||||||||||||||||||
| socket.write( | ||||||||||||||||||
| "HTTP/1.1 101 Switching Protocols\\r\\n" + | ||||||||||||||||||
| "Upgrade: websocket\\r\\n" + | ||||||||||||||||||
| "Connection: Upgrade\\r\\n" + | ||||||||||||||||||
| "Sec-WebSocket-Accept: " + accept + "\\r\\n" + | ||||||||||||||||||
| "\\r\\n", | ||||||||||||||||||
| ); | ||||||||||||||||||
| socket.flush(); | ||||||||||||||||||
| socket.data.handshakeComplete = true; | ||||||||||||||||||
| }, | ||||||||||||||||||
| }, | ||||||||||||||||||
| }); | ||||||||||||||||||
|
|
||||||||||||||||||
| const { promise, resolve } = Promise.withResolvers(); | ||||||||||||||||||
| const ws = new WebSocket("ws://" + server.hostname + ":" + server.port); | ||||||||||||||||||
| ws.addEventListener("open", () => { | ||||||||||||||||||
| // 42 code units × 3 UTF-8 bytes = 126 bytes — one byte past the | ||||||||||||||||||
| // 125-byte close-frame payload cap. C++ spec check bounds on UTF-16 | ||||||||||||||||||
| // code-unit count (42 < 123), so this reaches the native close path. | ||||||||||||||||||
| ws.close(1000, "\\u0800".repeat(42)); | ||||||||||||||||||
|
Comment on lines
+216
to
+219
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replace This fixture is building a repetitive string in a test, and this repo explicitly avoids ♻️ Suggested change- // 42 code units × 3 UTF-8 bytes = 126 bytes — one byte past the
+ // 126 UTF-8 bytes total — one byte past the
// 125-byte close-frame payload cap. C++ spec check bounds on UTF-16
// code-unit count (42 < 123), so this reaches the native close path.
- ws.close(1000, "\\u0800".repeat(42));
+ ws.close(1000, Buffer.alloc(126, "\\u0800").toString());As per coding guidelines, "To create repetitive strings in tests, use 📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||||||
| }); | ||||||||||||||||||
| ws.addEventListener("close", event => { | ||||||||||||||||||
| console.log("close:" + event.code); | ||||||||||||||||||
| resolve(); | ||||||||||||||||||
| }); | ||||||||||||||||||
| ws.addEventListener("error", () => { | ||||||||||||||||||
| console.log("error"); | ||||||||||||||||||
| resolve(); | ||||||||||||||||||
| }); | ||||||||||||||||||
| await promise; | ||||||||||||||||||
| server.stop(true); | ||||||||||||||||||
| `; | ||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 nit: the new comment references "the other caller (line 1006, server-initiated close echo)" — hardcoded line numbers in comments rot as soon as anything above them is edited in this ~2000-line file. The semantic anchor ("server-initiated close echo") is already enough to find the call site; suggest dropping "line 1006, ".
Extended reasoning...
What the issue is
The comment added at
src/http_jsc/websocket_client.rs:1635reads:This embeds a hardcoded source line number as a cross-reference. In a ~2000-line file, any edit above line 1006 — a new import, a doc comment, a refactor of an earlier method — shifts the target, and the comment silently goes stale. The comment is being added fresh in this PR, so it's the right time to fix it before it lands.
Why the line number adds no durable value
The comment already includes a stable semantic anchor: "server-initiated close echo". A reader can grep for
send_close_with_body(there are exactly two callers) or for "close echo" and find the other call site regardless of where it has drifted to. The line number is supplementary, not the primary anchor — which means when it drifts it becomes pure noise: a reader who jumps to line 1006 and finds unrelated code has to fall back on the semantic description anyway.Step-by-step proof of rot
send_close_with_bodycall is at line 1006 (verified).impl<const SSL: bool> WebSocket<SSL>.send_close_with_bodyanyway — exactly what they'd have done if the line number weren't there.Addressing the "too trivial" objection
One reviewer noted the semantic anchor mitigates the rot, so the line number is merely belt-and-suspenders. That's true, and is why this is a nit, not a blocker. But the comment is brand-new in this PR — the marginal cost of dropping four characters now is zero, vs. leaving a known-to-rot reference in a freshly-authored comment. Since the PR is already touching this exact line, it's the cheapest possible time to address it.
Impact
Zero runtime impact. Pure comment maintainability.
Fix
Drop the line number, keep the semantic referent:
// the other caller (server-initiated close echo).