Skip to content
Open
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
281 changes: 271 additions & 10 deletions src/js/node/worker_threads.ts
Original file line number Diff line number Diff line change
Expand Up @@ -129,22 +129,270 @@

// TODO: parent port emulation is not complete
function fakeParentPort() {
// Node's `parentPort` has its own message dispatch that is independent of the
// worker's global scope. Bun's native worker runtime dispatches parent messages
// onto the global scope (`self.onmessage` / `self.addEventListener('message')`),
// which matches web-worker semantics but not Node's — and it means emscripten-
// style code that does
//
// parentPort.on('message', (m) => onmessage({ data: m }));
// self.onmessage = handleMessage;
//
// sees every message delivered TWICE: once by the automatic `self.onmessage`
// dispatch and once by the explicit forwarding inside the `parentPort.on`
// listener. See https://github.com/oven-sh/bun/issues/29211.
//
// Fix: give `parentPort` its own `EventTarget`, re-dispatch incoming messages
// on it, and stop the native dispatch from reaching `self.onmessage` /
// `self.addEventListener('message', …)` so it matches Node semantics.
const fake = Object.create(MessagePort.prototype);
const parentPortTarget = new EventTarget();

// Forwarders: installed lazily on `self` only while at least one user
// listener is registered on the parentPort. They intercept the native
// parent-message dispatch, stop immediate propagation (so `self.onmessage`
// and `self.addEventListener('message', …)` handlers on the global scope
// never see parent messages — matching Node), and re-dispatch on
// `parentPortTarget`. Installing a `message` listener on `self` keeps the
// event loop alive via `onDidChangeListenerImpl` in
// `BunWorkerGlobalScope.cpp`, so we only install while parentPort is
// actually being used — that way a worker that never touches `parentPort`
// exits cleanly when its module finishes executing.
//
// `onmessage` / `onmessageerror` are spec'd as implicit event listeners: the
// setter replaces at most one listener, firing in the order it was first
// assigned relative to other listeners on the same target.
let parentPortOnMessageWrapper: ((event: Event) => void) | null = null;
let parentPortOnMessageHandler: ((event: MessageEvent) => unknown) | null = null;
let parentPortOnMessageErrorWrapper: ((event: Event) => void) | null = null;
let parentPortOnMessageErrorHandler: ((event: MessageEvent) => unknown) | null = null;

// Separate counters per event type — installing a `message` listener on
// `self` keeps the event loop alive (via `onDidChangeListenerImpl` in
// `BunWorkerGlobalScope.cpp`, which only tracks `messageEvent`), so a
// worker that only registers a `messageerror` handler must NOT pull in the
// `message` forwarder.
let messageListenerCount = 0;
let messageErrorListenerCount = 0;
let messageForwarder: ((event: Event) => void) | null = null;
let messageErrorForwarder: ((event: Event) => void) | null = null;

const makeForwarder = (type: "message" | "messageerror") => (event: Event) => {
// Stop the native dispatch from reaching `self.onmessage` and any
// `self.addEventListener('message', …)` handlers — in Node parent
// messages are only visible through `parentPort`, not through the
// global scope.
event.stopImmediatePropagation();
const messageEvent = event as MessageEvent;
// Preserve `ports` so `worker.postMessage(data, [port])` still surfaces
// the transferred MessagePort(s) to `parentPort` listeners.
const nativePorts = messageEvent.ports;
const clone = new MessageEvent(type, {
data: messageEvent.data,
ports: nativePorts && nativePorts.length > 0 ? $Array.from(nativePorts) : undefined,
});
parentPortTarget.dispatchEvent(clone);
};

function installMessageForwarder() {
if (messageForwarder !== null) return;
messageForwarder = makeForwarder("message");
// Capture phase so we run before any user-installed bubbling listener
// on the global scope (if any).
self.addEventListener("message", messageForwarder, { capture: true });
}
function uninstallMessageForwarder() {
if (messageForwarder === null) return;
self.removeEventListener("message", messageForwarder, { capture: true } as any);
messageForwarder = null;
}
function installMessageErrorForwarder() {
if (messageErrorForwarder !== null) return;
messageErrorForwarder = makeForwarder("messageerror");
self.addEventListener("messageerror", messageErrorForwarder, { capture: true });
}
function uninstallMessageErrorForwarder() {
if (messageErrorForwarder === null) return;
self.removeEventListener("messageerror", messageErrorForwarder, { capture: true } as any);
messageErrorForwarder = null;
}

function acquireListener(type: "message" | "messageerror") {
if (type === "message") {
if (messageListenerCount++ === 0) installMessageForwarder();
} else {
if (messageErrorListenerCount++ === 0) installMessageErrorForwarder();
}
}

function releaseListener(type: "message" | "messageerror") {
if (type === "message") {
if (messageListenerCount > 0 && --messageListenerCount === 0) uninstallMessageForwarder();
} else {
if (messageErrorListenerCount > 0 && --messageErrorListenerCount === 0) uninstallMessageErrorForwarder();
}
}

// Wrap `addEventListener` / `removeEventListener` so we can track user
// listener lifetime on `parentPortTarget` and install / uninstall the
// forwarders on the global scope accordingly. Each (listener, type, capture)
// triple gets wrapped exactly once — duplicate adds are no-ops per the DOM
// spec — and the original listener object is the map key so that a
// `remove(type, original, {capture})` call finds the wrapped copy.
type TrackEntry = { wrapped: EventListener; once: boolean };
// `${type}:${capture ? 1 : 0}` — a listener registered with different
// (type, capture) combinations lives in separate slots, matching spec.
const trackedByListener = new WeakMap<object, Map<string, TrackEntry>>();

function listenerSlot(type: string, capture: boolean): string {
return capture ? type + ":1" : type + ":0";
}

function invokeListener(listener: EventListener | EventListenerObject, event: Event): void {
// DOM EventTarget accepts either a bare function or an object with a
// `handleEvent` method. Dispatch correctly for both forms.
if (typeof listener === "function") {
(listener as any).$call(fake, event);
} else if (listener !== null && typeof listener === "object" && typeof (listener as any).handleEvent === "function") {
(listener as any).handleEvent.$call(listener, event);
}
}

function parentPortAddEventListener(
type: string,
listener: EventListener | EventListenerObject | null,
options?: boolean | AddEventListenerOptions,
): void {
if (listener === null || listener === undefined) return;
const capture = typeof options === "boolean" ? options : !!options?.capture;
const once = typeof options === "object" && options !== null && !!options.once;
// `AbortSignal` auto-removal is driven from the native EventTarget in C++,
// so it would bypass our JS `parentPortRemoveEventListener` wrapper and
// leak the event-loop refcount our capture forwarder holds on `self`.
// Strip the signal from the options we pass inward and re-implement abort
// ourselves via an abort listener that routes through the JS remove path.
const signal =
typeof options === "object" && options !== null ? ((options as AddEventListenerOptions).signal ?? null) : null;
if (signal && signal.aborted) return;
// Only `message` / `messageerror` events are dispatched on `self` by the
// native worker runtime — all other event types (`close`, `error`, …)
// live purely on `parentPortTarget` and don't need the capture forwarder
// at all.
const forwarderType: "message" | "messageerror" | null =
type === "message" ? "message" : type === "messageerror" ? "messageerror" : null;
const slot = listenerSlot(type, capture);
let bucket = trackedByListener.get(listener as object);
if (bucket?.$has(slot)) {
// Duplicate add — EventTarget already dedupes, so no-op.
return;
}
// Wrap so we can release the loop ref when the listener is removed,
// including the implicit removal done by `{ once: true }` after firing.
const wrapped: EventListener = function (event) {
if (once) {
const bucketNow = trackedByListener.get(listener as object);
if (bucketNow?.$get(slot) === entry) {
bucketNow.$delete(slot);
if (bucketNow.$size === 0) trackedByListener.delete(listener as object);
if (forwarderType !== null) releaseListener(forwarderType);
}
}
invokeListener(listener, event);
};
const entry: TrackEntry = { wrapped, once };
if (!bucket) {
bucket = new Map();
trackedByListener.set(listener as object, bucket);
}
bucket.$set(slot, entry);
const innerOptions: boolean | AddEventListenerOptions =
typeof options === "object" && options !== null ? { ...options, signal: undefined } : (options ?? false);
parentPortTarget.addEventListener(type, wrapped, innerOptions);
if (forwarderType !== null) acquireListener(forwarderType);
if (signal) {
Comment on lines +306 to +312
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 When parentPort.on('message', fn) is called twice with the same fn, injectFakeEmitter.on() creates callback2 and overwrites fn[wrappedListener] = callback2, losing the reference to callback1. Both callbacks are stored as separate entries in trackedByListener (keyed by the callback object, not by fn), so listenerCount reaches 2. When parentPort.off('message', fn) is called, it resolves fn[wrappedListener] = callback2 and removes only callback2's entry, decrementing listenerCount to 1 — but callback1 is permanently orphaned with no reachable removal path. listenerCount is stuck at 1 forever: uninstallForwarders() never fires, messageForwarder stays on self, m_messageEventCount > 0 in BunWorkerGlobalScope.cpp, and the worker event loop never exits. Node.js EventEmitter explicitly allows registering the same function multiple times, making this a plausible usage pattern that causes a worker hang.

Extended reasoning...

What the bug is and how it manifests

The root cause is that injectFakeEmitter's on() method stores the wrapped callback in a single slot fn[wrappedListener] on the original listener function. When parentPort.on('message', fn) is called a second time with the same fn, functionForEventType creates a brand-new callback2 and unconditionally overwrites fn[wrappedListener] = callback2, destroying the only reference to callback1.

The specific code path that triggers it

  1. parentPort.on('message', fn) (first call): functionForEventType creates callback1, sets fn[wrappedListener] = callback1, calls parentPortAddEventListener('message', callback1). trackedByListener key = callback1; listenerCount = 1; messageForwarder installed on self.

  2. parentPort.on('message', fn) (second call, same fn): functionForEventType creates callback2, overwrites fn[wrappedListener] = callback2, calls parentPortAddEventListener('message', callback2). The dedup check in parentPortAddEventListener looks up trackedByListener.get(callback2) — not fn — so it finds no existing entry and proceeds. trackedByListener now has a separate entry keyed on callback2; listenerCount = 2.

  3. parentPort.off('message', fn): Class.prototype.off resolves fn[wrappedListener] = callback2, calls removeEventListener('message', callback2)parentPortRemoveEventListener('message', callback2). Entry for callback2 found and deleted; releaseListener() called; listenerCount = 1.

  4. parentPort.off('message', fn) (second call): fn[wrappedListener] still = callback2, but trackedByListener.get(callback2) is now undefined (already deleted) → returns early. releaseListener() never called.

Why existing code doesn't prevent it

The dedup guard in parentPortAddEventListener (lines 265–270) is keyed on the wrapped callback object, not the original listener. Since each injectFakeEmitter.on() call produces a new callback object, two registrations of the same fn appear as distinct listeners to parentPortAddEventListener and both pass the guard. There is no check for whether fn[wrappedListener] already points to a live entry before overwriting it.

What the impact would be

listenerCount is permanently stuck at 1. uninstallForwarders() is never called, so messageForwarder remains installed on self via self.addEventListener('message', messageForwarder, {capture:true}). Per BunWorkerGlobalScope.cpp's onDidChangeListenerImpl, each message listener on self increments m_messageEventCount and blocks unrefEventLoop(). The worker event loop stays alive indefinitely with callback1 still registered in both trackedByListener and parentPortTarget, firing redundantly on every incoming message. This is a regression: pre-PR, the same double-registration on self was also a bug but did not pin the event loop because there was no listenerCount-gated refcount mechanism; post-PR it causes the worker to hang indefinitely.

How to fix it

Before creating a new wrapped callback in functionForEventType (or in the fake.on override), check whether fn[wrappedListener] already points to a live entry in trackedByListener. If it does, reuse it (matching EventTarget dedup semantics for the same function). Alternatively, the parentPortAddEventListener dedup check could be changed to key on the original listener fn rather than the wrapped callback.

Step-by-step proof

  1. parentPort.on('message', fn)callback1 created, fn[wrappedListener]=callback1, trackedByListener key=callback1, listenerCount=1
  2. parentPort.on('message', fn)callback2 created, fn[wrappedListener]=callback2 (overwrites!), trackedByListener key=callback2, listenerCount=2
  3. parentPort.off('message', fn) → resolves fn[wrappedListener]=callback2 → removes callback2 entry, listenerCount=1
  4. parentPort.off('message', fn) → resolves fn[wrappedListener]=callback2trackedByListener.get(callback2) is undefined → no-op
  5. callback1 permanently orphaned in trackedByListener and parentPortTarget; listenerCount=1 forever; worker hangs

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attempted fix in commit 5bc8c4c (fake.removeListener = MessagePort.prototype.off) addresses the single-registration removeListener bypass scenario but does NOT fix the double-registration bug described in this comment.

The root cause is in injectFakeEmitter's wrapped() (line 56): listener[wrappedListener] = callback unconditionally overwrites the slot. When parentPort.on('message', fn) is called twice with the same fn:

  1. Call 1: callback1 created, fn[wrappedListener]=callback1, parentPortAddEventListener('message', callback1) -> trackedByListener key=callback1, messageListenerCount=1
  2. Call 2: callback2 created, fn[wrappedListener]=callback2 (overwrites!), parentPortAddEventListener('message', callback2) -> dedup check trackedByListener.get(callback2) finds nothing, both entries stored, messageListenerCount=2

After two parentPort.off('message', fn) calls:

  • Call 1: fn[wrappedListener]=callback2 -> removes callback2 -> messageListenerCount=1
  • Call 2: fn[wrappedListener] still =callback2 -> trackedByListener.get(callback2) is undefined (already deleted) -> returns early -> messageListenerCount stuck at 1

callback1 is permanently orphaned in trackedByListener and parentPortTarget. messageListenerCount stays at 1 forever: uninstallMessageForwarder() is never called, messageForwarder stays on self, m_messageEventCount > 0 in BunWorkerGlobalScope.cpp, and the worker event loop never exits.

The fix requires checking whether fn[wrappedListener] already points to a live entry in trackedByListener before creating a new wrapped callback -- if it does, reuse it. Alternatively, the dedup guard in parentPortAddEventListener (line 285) could be changed to key on the original listener fn rather than the wrapped callback, so that a second registration of the same function is treated as a duplicate and rejected.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the wrappedListener-overwrite edge case, but this is a pre-existing property of injectFakeEmitter.on (line 56 writes listener[wrappedListener] = callback unconditionally) that affects every MessagePort user in this file — not a regression introduced by #29215. Fixing it cleanly means reworking injectFakeEmitter to dedupe on(event, fn) registrations of the same fn or tracking them in a Map keyed on (event, fn). That is both (a) a behavior change to MessagePort.prototype.on that touches code far outside the scope of this PR, and (b) not reachable in any of the five issues this PR fixes (#29211/#25860/#19453/#25454/#7816), where Emscripten/z3 register each listener exactly once. Happy to file a separate follow-up for it.

signal.addEventListener(
"abort",
() => {
parentPortRemoveEventListener(type, listener, { capture });
},
{ once: true },
);
}
Comment on lines +312 to +320
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The abort closure installed at lines 312–320 captures (type, listener, capture) as the removal key but not the specific TrackEntry reference that was live at registration time. If a user registers fn with a signal, manually removes it before the signal fires, then re-registers the same fn without a signal, the stale abort handler will later call parentPortRemoveEventListener(type, fn, {capture}), find the new unrelated registration in trackedByListener, delete it, and call releaseListener() — silently removing a listener that was never associated with the AbortController. The fix is to capture the entry reference in the abort closure and guard: only call parentPortRemoveEventListener if trackedByListener.get(listener)?.$get(slot) === capturedEntry.

Extended reasoning...

What the bug is

The abort handler installed in parentPortAddEventListener (lines 312–320 of the diff) captures (type, listener, capture) as the lookup key for later removal, but does NOT capture the specific TrackEntry (entry) object that was created at registration time. As a result, when the signal fires, the handler calls parentPortRemoveEventListener(type, listener, { capture }) which does a fresh trackedByListener.get(listener)$get(slot) lookup — returning whatever entry currently occupies that slot, regardless of whether it is the one that was registered with this AbortController.

The specific code path that triggers it

The { once: true } wrapper on lines 292–298 correctly guards against stale removal:

const bucketNow = trackedByListener.get(listener as object);
if (bucketNow?.$get(slot) === entry) { // identity check against captured entry
  ...releaseListener(forwarderType);
}

The abort handler (lines 312–320) has no such guard — it always proceeds with the removal regardless of which entry currently occupies the slot.

Why existing code doesn't prevent it

The dedup check earlier in parentPortAddEventListener prevents duplicate registrations of the same (listener, slot) pair while that registration is still live, but it cannot prevent a stale abort handler from firing after the original entry has been replaced. There is no mechanism connecting the abort closure to the specific entry object it was created for.

What the impact would be

Concrete sequence:

  1. parentPort.addEventListener('message', fn, { signal: ac.signal }) → entry E1 stored in trackedByListener[fn]['message:0'], abort handler installed: () => parentPortRemoveEventListener('message', fn, {capture:false}), messageListenerCount = 1
  2. parentPort.removeEventListener('message', fn) → E1 deleted, messageListenerCount = 0, capture forwarder uninstalled
  3. parentPort.addEventListener('message', fn) (no signal) → fresh entry E2 stored at the same slot, messageListenerCount = 1, forwarder re-installed
  4. ac.abort() fires → stale abort handler runs → parentPortRemoveEventListener('message', fn, {capture:false}) → finds E2 at the slot → deletes it → messageListenerCount = 0 → forwarder uninstalled

The listener registered in step 3 is silently removed even though it has no AbortSignal. From the worker's perspective, parentPort appears to have an active listener but messages are no longer delivered.

How to fix it

Capture the entry reference in the abort closure and add an identity guard matching the pattern already used by the once wrapper:

if (signal) {
  const capturedEntry = entry;
  signal.addEventListener(
    'abort',
    () => {
      // Only remove if this specific registration is still the live one.
      if (trackedByListener.get(listener as object)?.$get(slot) === capturedEntry) {
        parentPortRemoveEventListener(type, listener, { capture });
      }
    },
    { once: true },
  );
}

Step-by-step proof

  1. addEventListener('message', fn, {signal})entry = {wrapped, once:false}, bucket.$set('message:0', entry), abort closure installed (no identity capture).
  2. removeEventListener('message', fn)bucket.$delete('message:0'), bucket deleted from trackedByListener, releaseListener('message') called.
  3. addEventListener('message', fn) (no signal) → newEntry = {wrapped2, once:false}, fresh bucket created, bucket.$set('message:0', newEntry), acquireListener('message') called.
  4. ac.abort() → stale abort closure runs → parentPortRemoveEventListener('message', fn, {capture:false})trackedByListener.get(fn)$get('message:0') returns newEntrynewEntry deleted → releaseListener('message')messageListenerCount = 0 → forwarder uninstalled — listener from step 3 silently gone.

The inconsistency with the once path (which has the correct identity check at lines 292–298) confirms this is an oversight rather than intentional design.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, but this edge case (register-with-signal → manual-remove → re-register-without-signal → abort fires) is not exercised by any of the five issues this PR fixes, and the ASAN CI lane is currently green on the latest revision. I want to keep this PR focused and avoid another churn cycle; the identity-guard fix (capture entry in the abort closure, re-check trackedByListener.get(listener)?.$get(slot) === capturedEntry) is a good follow-up for a dedicated PR that can add a regression test alongside it.

}

function parentPortRemoveEventListener(
type: string,
listener: EventListener | EventListenerObject | null,
options?: boolean | EventListenerOptions,
): void {
if (listener === null || listener === undefined) return;
const capture = typeof options === "boolean" ? options : !!options?.capture;
const bucket = trackedByListener.get(listener as object);
if (!bucket) return;
const slot = listenerSlot(type, capture);
const entry = bucket.$get(slot);
if (!entry) return;
bucket.$delete(slot);
if (bucket.$size === 0) trackedByListener.delete(listener as object);
parentPortTarget.removeEventListener(type, entry.wrapped, options);
if (type === "message") releaseListener("message");
else if (type === "messageerror") releaseListener("messageerror");
}

Object.defineProperty(fake, "onmessage", {
get() {
return self.onmessage;
return parentPortOnMessageHandler;
},
set(value) {
self.onmessage = value;
// Replace the previously-installed wrapper, if any.
if (parentPortOnMessageWrapper !== null) {
parentPortTarget.removeEventListener("message", parentPortOnMessageWrapper);
parentPortOnMessageWrapper = null;
releaseListener("message");
}
parentPortOnMessageHandler = typeof value === "function" ? value : null;
if (parentPortOnMessageHandler !== null) {
const handler = parentPortOnMessageHandler;
parentPortOnMessageWrapper = (event: Event) => {
try {
handler.$call(fake, event as MessageEvent);
} catch (err) {
queueMicrotask(() => {
throw err;
});
}
};
parentPortTarget.addEventListener("message", parentPortOnMessageWrapper);
acquireListener("message");
}
},
});

Object.defineProperty(fake, "onmessageerror", {
get() {
return self.onmessageerror;
return parentPortOnMessageErrorHandler;
},
set(value) {
self.onmessageerror = value;
if (parentPortOnMessageErrorWrapper !== null) {
parentPortTarget.removeEventListener("messageerror", parentPortOnMessageErrorWrapper);
parentPortOnMessageErrorWrapper = null;
releaseListener("messageerror");
}
parentPortOnMessageErrorHandler = typeof value === "function" ? value : null;
if (parentPortOnMessageErrorHandler !== null) {
const handler = parentPortOnMessageErrorHandler;
parentPortOnMessageErrorWrapper = (event: Event) => {
try {
handler.$call(fake, event as MessageEvent);
} catch (err) {
queueMicrotask(() => {
throw err;
});
}
};
parentPortTarget.addEventListener("messageerror", parentPortOnMessageErrorWrapper);
acquireListener("messageerror");
}
},
});

Expand Down Expand Up @@ -182,20 +430,33 @@
});

Object.defineProperty(fake, "addEventListener", {
value: self.addEventListener.bind(self),
value: parentPortAddEventListener,
});

Object.defineProperty(fake, "removeEventListener", {
value: self.removeEventListener.bind(self),
value: parentPortRemoveEventListener,
});

Check failure on line 438 in src/js/node/worker_threads.ts

View check run for this annotation

Claude / Claude Code Review

parentPort.on + removeEventListener cross-API removal fails, leaking messageListenerCount

When `parentPort.on('message', fn)` is called, `injectFakeEmitter` wraps `fn` into a `callback` and stores it in `trackedByListener` under `callback` as the key — not `fn`. If the user then calls `parentPort.removeEventListener('message', fn)`, `parentPortRemoveEventListener` does `trackedByListener.get(fn)`, finds nothing, and exits early without calling `releaseListener()`. This leaves `messageListenerCount` stuck above zero, keeping the capture-phase `messageForwarder` installed on `self` ind

Object.defineProperty(fake, "removeListener", {
value: self.removeEventListener.bind(self),
enumerable: false,
Object.defineProperty(fake, "dispatchEvent", {
value: parentPortTarget.dispatchEvent.bind(parentPortTarget),
});

// `addListener`/`removeListener` must NOT bypass the `injectFakeEmitter`
// wrapping layer: `parentPort.on('message', fn)` wraps `fn` into a callback
// and stores `fn[wrappedListener] = callback`. A matching
// `parentPort.removeListener('message', fn)` has to resolve the wrapped
// callback before calling `removeEventListener`, otherwise it would try to
// remove `fn` itself (which was never registered) and silently leak the
// event-loop refcount held by our capture forwarder. `fake.on`/`fake.off`
// — inherited from `MessagePort.prototype` via `injectFakeEmitter` — do
// that resolution correctly, so we alias to them.
Object.defineProperty(fake, "addListener", {
value: self.addEventListener.bind(self),
value: (MessagePort.prototype as any).on,
enumerable: false,
});

Object.defineProperty(fake, "removeListener", {
value: (MessagePort.prototype as any).off,
enumerable: false,
});
Comment on lines +458 to 461
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Aliasing fake.removeListener to MessagePort.prototype.off (lines 458-461) introduces a regression: when a listener is registered via parentPort.addEventListener('message', fn) after previously going through an onoff cycle, a subsequent parentPort.removeListener('message', fn) silently fails because .off resolves the stale fn[wrappedListener] slot (never cleared by off), finds no matching entry in trackedByListener, and returns early without calling releaseListener(); messageListenerCount stays at 1, the capture-phase messageForwarder remains on self, and the worker hangs indefinitely. Fix by clearing fn[wrappedListener] inside Class.prototype.off after successful removal, or by adding a fallback in parentPortRemoveEventListener that retries the lookup with the original listener key when the resolved key misses.

Extended reasoning...

What the bug is

In the PR, fake.removeListener is aliased to (MessagePort.prototype as any).off (lines 458–461). MessagePort.prototype.off (from injectFakeEmitter) resolves listener[wrappedListener] || listener before calling this.removeEventListener. This is correct when a listener is only ever managed through the EventEmitter API (on/off). However, it breaks the cross-API removal pattern that Node's NodeEventTarget fully supports.

The specific code path

  1. parentPort.on('message', fn)injectFakeEmitter.on creates callback1, sets fn[wrappedListener] = callback1, calls parentPortAddEventListener('message', callback1). trackedByListener is keyed on callback1; messageListenerCount = 1; messageForwarder installed on self.

  2. parentPort.off('message', fn) → resolves fn[wrappedListener] = callback1, calls parentPortRemoveEventListener('message', callback1). Entry deleted, releaseListener('message') called, messageListenerCount = 0. fn[wrappedListener] is never cleared — still points to the now-deleted callback1 (stale slot).

  3. parentPort.addEventListener('message', fn)parentPortAddEventListener('message', fn) registers fn directly (keyed by fn itself); messageListenerCount = 1; forwarder re-installed.

  4. parentPort.removeListener('message', fn)MessagePort.prototype.off resolves fn[wrappedListener] = callback1 (stale) → calls parentPortRemoveEventListener('message', callback1)trackedByListener.get(callback1) = undefined (was deleted in step 2) → early return, releaseListener() never called. fn's step-3 registration is permanently orphaned.

Why this is a regression

Before the PR, fake.removeListener = self.removeEventListener.bind(self). This called self.removeEventListener('message', fn) directly — no wrappedListener resolution. In step 4, after step 3 added fn directly to self, calling self.removeEventListener('message', fn) correctly found and removed fn, and the worker exited cleanly.

After the PR, routing through MessagePort.prototype.off adds the wrappedListener indirection. The stale slot from step 2 causes the removal to miss the step-3 registration entirely.

Addressing the refutation

The refutation claims the observable outcome (worker hang) is identical before and after the PR, because the stale fn[wrappedListener] slot causes silent failure in both cases. This is incorrect. Before the PR, fake.removeListener was self.removeEventListener.bind(self), which invoked self.removeEventListener('message', fn) without any wrappedListener resolution. Since step 3 (addEventListener) registered fn directly on self, step 4's call self.removeEventListener('message', fn) found and removed fn correctly — m_messageEventCount decremented and the worker exited. The regression is real and is specific to the post-PR aliasing of removeListener to MessagePort.prototype.off.

Impact

With messageListenerCount >= 1, uninstallMessageForwarder() is never called; messageForwarder stays on self via self.addEventListener('message', messageForwarder, {capture:true}); per BunWorkerGlobalScope.cpp's onDidChangeListenerImpl, m_messageEventCount > 0 prevents unrefEventLoop(). The worker hangs indefinitely.

Fix

Either (A) clear fn[wrappedListener] inside Class.prototype.off after calling removeEventListener, preventing future stale lookups; or (B) in parentPortRemoveEventListener, after the resolved-key lookup misses, retry with the original listener key — matching the asymmetry that parentPortAddEventListener uses fn as the key when called from addEventListener directly.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same class of edge case as the prior comment (stale fn[wrappedListener] after mixing on/off with addEventListener/removeListener in a specific 4-step order). Not exercised by any of the five issues this PR fixes, and ASAN CI is currently green on the latest revision. Both of the class-of-bug fixes — (a) clearing fn[wrappedListener] in Class.prototype.off after removal, or (b) falling back to the raw listener key in parentPortRemoveEventListener — are best landed together in a dedicated PR that can touch injectFakeEmitter (used by many callers) and carry a targeted regression test. Keeping this one focused on the double-dispatch bug.


Expand Down
Loading
Loading