Skip to content
2 changes: 1 addition & 1 deletion src/codegen/bake-codegen.ts
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ async function run() {
side: JSON.stringify(side),
IS_ERROR_RUNTIME: String(file === "error"),
IS_BUN_DEVELOPMENT: String(!!debug),
OVERLAY_CSS: css("../runtime/bake/client/overlay.css", !!debug),
OVERLAY_CSS: JSON.stringify(css("../runtime/bake/client/overlay.css", !!debug)),
},
minify: {
syntax: !debug,
Expand Down
33 changes: 20 additions & 13 deletions src/http_jsc/websocket_client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1658,7 +1658,21 @@
if !this.has_tcp() {
return;
}
let mut close_reason_buf = [0u8; 128];
// `send_close_with_body` takes `&mut [u8; 125]`; keep the array at
// 125 bytes so the borrow covers the full provenance and matches
// the other caller (line 1006, server-initiated close echo).
Comment on lines +1661 to +1663
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 nit: the new comment references "the other caller (line 1006, server-initiated close echo)" — hardcoded line numbers in comments rot as soon as anything above them is edited in this ~2000-line file. The semantic anchor ("server-initiated close echo") is already enough to find the call site; suggest dropping "line 1006, ".

Extended reasoning...

What the issue is

The comment added at src/http_jsc/websocket_client.rs:1635 reads:

// the other caller (line 1006, server-initiated close echo).

This embeds a hardcoded source line number as a cross-reference. In a ~2000-line file, any edit above line 1006 — a new import, a doc comment, a refactor of an earlier method — shifts the target, and the comment silently goes stale. The comment is being added fresh in this PR, so it's the right time to fix it before it lands.

Why the line number adds no durable value

The comment already includes a stable semantic anchor: "server-initiated close echo". A reader can grep for send_close_with_body (there are exactly two callers) or for "close echo" and find the other call site regardless of where it has drifted to. The line number is supplementary, not the primary anchor — which means when it drifts it becomes pure noise: a reader who jumps to line 1006 and finds unrelated code has to fall back on the semantic description anyway.

Step-by-step proof of rot

  1. Today: the other send_close_with_body call is at line 1006 (verified).
  2. A future PR adds a 10-line helper near the top of impl<const SSL: bool> WebSocket<SSL>.
  3. The server-initiated close echo is now at line 1016; the comment still says "line 1006".
  4. A reader following the comment lands in the middle of an unrelated function body.
  5. The reader has to grep for "server-initiated close echo" / send_close_with_body anyway — exactly what they'd have done if the line number weren't there.

Addressing the "too trivial" objection

One reviewer noted the semantic anchor mitigates the rot, so the line number is merely belt-and-suspenders. That's true, and is why this is a nit, not a blocker. But the comment is brand-new in this PR — the marginal cost of dropping four characters now is zero, vs. leaving a known-to-rot reference in a freshly-authored comment. Since the PR is already touching this exact line, it's the cheapest possible time to address it.

Impact

Zero runtime impact. Pure comment maintainability.

Fix

Drop the line number, keep the semantic referent:

// the other caller (server-initiated close echo).

let mut close_reason_buf = [0u8; 125];
Comment thread
robobun marked this conversation as resolved.
// RFC 6455 §5.5.1: the close-frame payload is ≤ 125 bytes total and
// the 7-bit header length field is the payload length — there is no
// extended-length encoding for control frames. `send_close_with_body`
// prepends the 2-byte status code separately and computes the header
// length as `(body_len + 2) & 0x7F`. A reason of 124 or 125 UTF-8
// bytes therefore writes a header length of 126 or 127, which are
// the 16/64-bit extended-length sentinels per §5.2 and malform the
// frame. Cap the transcode cursor at 123 bytes so the overflow arms
// below (`WriteZero` / `NoSpaceLeft`) bail via `break 'inner` at the
// correct boundary.
const MAX_REASON_BYTES: usize = 123;

Check warning on line 1675 in src/http_jsc/websocket_client.rs

View check run for this annotation

Claude / Claude Code Review

Stale comment: malformed-frame rationale no longer reflects send_close_with_body's .min(123) clamp

The new comment justifying `MAX_REASON_BYTES = 123` says `send_close_with_body` "computes the header length as `(body_len + 2) & 0x7F`" and that a 124/125-byte reason "writes a header length of 126 or 127 … and malform[s] the frame" — but after the 2026-05-21 rebase, `send_close_with_body` does `let body_len = body_len.min(123);` (line 1346) *before* computing the header length, so that scenario can no longer occur. The 123-byte cursor cap is still worthwhile — it routes overflow to a clean no-r
Comment on lines +1665 to +1675
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The new comment justifying MAX_REASON_BYTES = 123 says send_close_with_body "computes the header length as (body_len + 2) & 0x7F" and that a 124/125-byte reason "writes a header length of 126 or 127 … and malform[s] the frame" — but after the 2026-05-21 rebase, send_close_with_body does let body_len = body_len.min(123); (line 1346) before computing the header length, so that scenario can no longer occur. The 123-byte cursor cap is still worthwhile — it routes overflow to a clean no-reason close instead of letting .min(123) truncate mid-UTF-8-sequence and trip terminate(InvalidUtf8) at line 1379 — but the comment's stated rationale is now factually wrong and should be updated (or trimmed, per src/CLAUDE.md's "avoid adding comments where not necessary").

Extended reasoning...

What the issue is

The comment block added at src/http_jsc/websocket_client.rs:1665–1675 justifies MAX_REASON_BYTES = 123 like this:

send_close_with_body prepends the 2-byte status code separately and computes the header length as (body_len + 2) & 0x7F. A reason of 124 or 125 UTF-8 bytes therefore writes a header length of 126 or 127, which are the 16/64-bit extended-length sentinels per §5.2 and malform the frame.

That description of send_close_with_body is no longer accurate. The current implementation (lines 1342–1367, brought in via #31129 on main and untouched by this PR) reads:

fn send_close_with_body(&mut self, code: u16, body: Option<&mut [u8; 125]>, body_len: usize) {
    // ... reason text is limited to 123 bytes.
    let body_len = body_len.min(123);          // ← line 1346
    ...
    header.set_len(((body_len + 2) & 0x7F) as u8); // ← line 1367

The .min(123) clamp runs before the header length is computed, so body_len + 2 is at most 125 and the 126/127 extended-length sentinels can never be emitted regardless of what the caller passes.

How the comment went stale

The rationale was correct when written. Per the PR timeline, the original 2026-05-15 review identified the 124/125-byte malformed-frame edge against the then-current send_close_with_body, which had no clamp; commit 0754956 added the 123-byte cursor cap and this comment to fix it. On 2026-05-21 the branch was rebased onto a main that now includes #31129's independent .min(123) defense inside send_close_with_body (the rebase note even says main has "another variation of the overflow fix"). The comment was carried forward unchanged and now describes callee behavior that no longer exists.

Step-by-step proof

Take the comment's own example — a reason that transcodes to 124 UTF-8 bytes — and suppose the cursor cap were absent so 124 reaches send_close_with_body:

  1. send_close_with_body(code, Some(&mut close_reason_buf), 124) is called.
  2. Line 1346: let body_len = 124.min(123)body_len = 123.
  3. Line 1367: header.set_len(((123 + 2) & 0x7F) as u8)set_len(125).
  4. 125 is a valid 7-bit literal length per RFC 6455 §5.2 — not 126 or 127, not an extended-length sentinel. The frame is well-formed.

So the comment's central claim ("writes a header length of 126 or 127 … and malform the frame") is unreachable on the post-rebase code.

Why the cursor cap still has value (the correct rationale)

Without MAX_REASON_BYTES, a 124-byte UTF-8 reason would reach send_close_with_body, get hard-truncated to 123 bytes by .min(123), and — if byte 123 falls mid-multibyte-sequence — fail is_valid_utf8 at line 1379, calling self.terminate(ErrorCode::InvalidUtf8) instead of sending any close frame. The cursor cap makes overflow take break 'inner and fall through to the clean send_close_with_body(code, None, 0) path. That is a real, useful behavioral difference — it's just not the malformed-frame story the comment tells.

Impact

Zero runtime impact — this is purely comment accuracy. But the comment is brand-new in this PR (not pre-existing prose), it makes a concrete present-tense claim about a specific function's behavior, and that claim is verifiably false against the file as it stands. A future reader auditing send_close_with_body will find the .min(123) clamp, see this comment asserting it doesn't exist, and have to spend time reconciling the two. src/CLAUDE.md says "avoid adding comments where not necessary"; an inaccurate 10-line comment is worse than none.

Fix

Either trim the comment to the load-bearing fact:

// RFC 6455 §5.5.1: close-frame payload is 2-byte status + ≤123-byte reason.
// Cap the cursor at 123 so overflow falls through to the no-reason close
// rather than reaching send_close_with_body's .min(123) truncation (which
// could split a UTF-8 sequence and trip terminate(InvalidUtf8)).
const MAX_REASON_BYTES: usize = 123;

or drop the "computes the header length as (body_len + 2) & 0x7F … 126 or 127 … malform the frame" sentences entirely, since send_close_with_body now defends that boundary itself.

// SAFETY: reason is null or a valid *const ZigString from C++
if let Some(str) = unsafe { reason.as_ref() } {
'inner: {
Expand All @@ -1668,9 +1682,9 @@
// replicate the encoding switch directly: 8-bit copies bytes,
// 16-bit transcodes via `to_owned_slice()` (UTF-16 → UTF-8).
use std::io::Write;
let mut cursor = std::io::Cursor::new(&mut close_reason_buf[..]);
let mut cursor = std::io::Cursor::new(&mut close_reason_buf[..MAX_REASON_BYTES]);
if str.is_16bit() {
// Allocates; close-reason is bounded ≤125 bytes and this
// Allocates; close-reason is bounded ≤123 bytes and this
// path is cold (close handshake).
let utf8 = str.to_owned_slice();
if cursor.write_all(&utf8).is_err() {
Expand All @@ -1697,16 +1711,9 @@
cursor.set_position((pos + result.written as usize) as u64);
}
let wrote_len = cursor.position() as usize;
// 125-byte close-frame payload budget minus the 2-byte status code.
if wrote_len > 123 {
break 'inner;
}
// SAFETY: `close_reason_buf` is a 128-byte stack array, so its
// first 125 bytes form a valid `[u8; 125]` (align 1); `cursor`
// (the only other borrow) ended at `wrote_len` above, so this
// `&mut` is unique.
let buf = unsafe { &mut *close_reason_buf.as_mut_ptr().cast::<[u8; 125]>() };
this.send_close_with_body(code, Some(buf), wrote_len);
// Cursor is over a 123-byte view; overflow takes `break 'inner` above.
debug_assert!(wrote_len <= MAX_REASON_BYTES);
this.send_close_with_body(code, Some(&mut close_reason_buf), wrote_len);
return;
}
}
Expand Down
104 changes: 104 additions & 0 deletions test/js/web/websocket/websocket-close-fragmented.test.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import { TCPSocketListener } from "bun";
import { describe, expect, test } from "bun:test";
import { bunEnv, bunExe } from "harness";

const hostname = "127.0.0.1";
const port = 0;
Expand Down Expand Up @@ -124,4 +125,107 @@ describe("WebSocket", () => {
server?.stop(true);
}
});

// Regression: the close() reason was transcoded into a fixed 128-byte
// stack buffer and then pointer-cast to `&mut [u8; 125]` before being
// passed on with `body_len = cursor.position()`. A UTF-16 reason of
// 42 code units of U+0800 passes the C++ 123-char limit (which bounds
// code units, not UTF-8 bytes) but transcodes to 126 UTF-8 bytes,
// overrunning the 125-byte reference, panicking the Rust side (`range
// end index 126 out of range for slice of length 125`) and aborting
// the process across `extern "C"`.
//
// The subprocess wrapper is deliberate: a panic in `WebSocket::close`
// terminates the WHOLE bun process, which would crash the test runner
// itself and leave no JUnit output. Spawning a child isolates the
// expected crash so the parent test can assert on the child's exit
// code.
// The 30s timeout (vs the 5s default) covers the pre-fix failure arm:
// the Rust panic in `WebSocket::close` fires SIGILL, which the debug
// crash handler catches and runs `llvm-symbolizer` on — ~6s in ASAN
// builds. The happy path is ~1s.
test("close() with reason that transcodes beyond 123 UTF-8 bytes does not crash", async () => {
await using proc = Bun.spawn({
cmd: [bunExe(), "-e", CLOSE_LONG_REASON_FIXTURE],
env: bunEnv,
stdout: "pipe",
stderr: "ignore",
});
const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);
// With the fix: the fixture reaches the close listener, prints
// `close:1000`, and exits cleanly. Without the fix: the child aborts
// inside `WebSocket::close` before the listener fires — stdout is
// empty and exitCode is non-zero (SIGILL from the Rust panic).
expect({ stdout: stdout.trim(), exitCode }).toEqual({
stdout: "close:1000",
Comment on lines +148 to +160
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Pipe the child stderr instead of discarding it.

This test is guarding a crash path; with stderr: "ignore", a regression only gives you stdout/exitCode and drops the most useful crash diagnostics.

♻️ Suggested change
     await using proc = Bun.spawn({
       cmd: [bunExe(), "-e", CLOSE_LONG_REASON_FIXTURE],
       env: bunEnv,
       stdout: "pipe",
-      stderr: "ignore",
+      stderr: "pipe",
     });
-    const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);
+    const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+    const cleanStderr = stderr
+      .split("\n")
+      .filter(line => !line.startsWith("WARNING: ASAN interferes"))
+      .join("\n")
+      .trim();
     // With the fix: the fixture reaches the close listener, prints
     // `close:1000`, and exits cleanly. Without the fix: the child aborts
     // inside `WebSocket::close` before the listener fires — stdout is
     // empty and exitCode is non-zero (SIGILL from the Rust panic).
-    expect({ stdout: stdout.trim(), exitCode }).toEqual({
+    expect({ stdout: stdout.trim(), stderr: cleanStderr, exitCode }).toMatchObject({
       stdout: "close:1000",
+      stderr: "",
       exitCode: 0,
     });

Based on learnings, crash-prone spawned-process tests should keep subprocess output in the failure diff, and bunEnv-based tests should filter the known ASAN startup warning before asserting stderr is empty.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/js/web/websocket/websocket-close-fragmented.test.ts` around lines 148 -
160, The test currently discards child stderr ("stderr: \"ignore\"") which drops
crash diagnostics; change Bun.spawn so proc is created with stderr: "pipe"
(alongside stdout: "pipe"), await proc.stderr.text() together with
proc.stdout.text() and proc.exited, and update the assertion to include stderr
in the failure output; when asserting stderr is empty, filter out the known ASAN
startup warning coming from bunEnv before comparing so benign ASAN messages
don't fail the test.

exitCode: 0,
});
}, 30_000);
});

// Raw-socket WebSocket handshake + close(1000, reason-transcoding-to-126-UTF-8-bytes).
// Runs in a child process so a panic in the native close path aborts the
// child, not the test runner.
const CLOSE_LONG_REASON_FIXTURE = /* js */ `
const MAX_HEADER_SIZE = 16 * 1024;
const server = Bun.listen({
hostname: "127.0.0.1",
port: 0,
socket: {
data(socket, data) {
if (socket.data?.handshakeComplete) {
if (!socket.data.receivedCloseFrame && data.length > 0 && data[0] === 0x88) {
socket.data.receivedCloseFrame = true;
socket.write(new Uint8Array([0x88, 0x00]));
socket.flush();
socket.end();
}
return;
}
socket.data ||= { handshakeBuffer: new Uint8Array(0), handshakeComplete: false, receivedCloseFrame: false };
const prev = socket.data.handshakeBuffer;
const merged = new Uint8Array(prev.length + data.length);
merged.set(prev);
merged.set(data, prev.length);
socket.data.handshakeBuffer = merged;
if (merged.length > MAX_HEADER_SIZE) { socket.end(); return; }
const text = new TextDecoder("utf-8").decode(merged);
if (text.indexOf("\\r\\n\\r\\n") === -1) return;
const magic = /Sec-WebSocket-Key:\\s*(.*)\\r\\n/i.exec(text);
if (!magic) { socket.end(); return; }
const hasher = new Bun.CryptoHasher("sha1");
hasher.update(magic[1].trim());
hasher.update("258EAFA5-E914-47DA-95CA-C5AB0DC85B11");
const accept = hasher.digest("base64");
socket.write(
"HTTP/1.1 101 Switching Protocols\\r\\n" +
"Upgrade: websocket\\r\\n" +
"Connection: Upgrade\\r\\n" +
"Sec-WebSocket-Accept: " + accept + "\\r\\n" +
"\\r\\n",
);
socket.flush();
socket.data.handshakeComplete = true;
},
},
});

const { promise, resolve } = Promise.withResolvers();
const ws = new WebSocket("ws://" + server.hostname + ":" + server.port);
ws.addEventListener("open", () => {
// 42 code units × 3 UTF-8 bytes = 126 bytes — one byte past the
// 125-byte close-frame payload cap. C++ spec check bounds on UTF-16
// code-unit count (42 < 123), so this reaches the native close path.
ws.close(1000, "\\u0800".repeat(42));
Comment on lines +216 to +219
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace .repeat(42) with the repo’s Buffer.alloc(...).toString() pattern.

This fixture is building a repetitive string in a test, and this repo explicitly avoids .repeat() for that case.

♻️ Suggested change
-  // 42 code units × 3 UTF-8 bytes = 126 bytes — one byte past the
+  // 126 UTF-8 bytes total — one byte past the
   // 125-byte close-frame payload cap. C++ spec check bounds on UTF-16
   // code-unit count (42 < 123), so this reaches the native close path.
-  ws.close(1000, "\\u0800".repeat(42));
+  ws.close(1000, Buffer.alloc(126, "\\u0800").toString());

As per coding guidelines, "To create repetitive strings in tests, use Buffer.alloc(count, fill).toString() instead of "string".repeat(count) for better performance in debug builds."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// 42 code units × 3 UTF-8 bytes = 126 bytes — one byte past the
// 125-byte close-frame payload cap. C++ spec check bounds on UTF-16
// code-unit count (42 < 123), so this reaches the native close path.
ws.close(1000, "\\u0800".repeat(42));
// 126 UTF-8 bytes total — one byte past the
// 125-byte close-frame payload cap. C++ spec check bounds on UTF-16
// code-unit count (42 < 123), so this reaches the native close path.
ws.close(1000, Buffer.alloc(126, "\\u0800").toString());
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/js/web/websocket/websocket-close-fragmented.test.ts` around lines 216 -
219, Replace the use of the string .repeat call in the ws.close call: locate the
ws.close(1000, "\\u0800".repeat(42)) invocation and replace the
repetitive-string construction with the repository pattern using
Buffer.alloc(...).toString() (i.e., build the repeated "\u0800" payload via
Buffer.alloc(count, fill).toString() rather than String.prototype.repeat) so the
test uses Buffer.alloc for creating the repetitive payload.

});
ws.addEventListener("close", event => {
console.log("close:" + event.code);
resolve();
});
ws.addEventListener("error", () => {
console.log("error");
resolve();
});
await promise;
server.stop(true);
`;
Loading