fix: ignore callbacks that arrive after the function stopped waiting#5056
Merged
jcs090218 merged 1 commit intoMay 12, 2026
Merged
Conversation
This was referenced May 3, 2026
The success and error callbacks unconditionally `(throw 'lsp-done '_)' to break the synchronous busy-loop's `(catch 'lsp-done ...)'. If the response arrives after the function has unwound (cancellation is best-effort), the catch is no longer on the stack and the throw escapes to the top level. Capture a closed-over `catch-active' flag in the callbacks; the `unwind-protect' cleanup invalidates the flag before requesting cancel, so any callback fired by an already-queued response after unwind becomes a no-op.
44193be to
7a1bd53
Compare
jcs090218
approved these changes
May 12, 2026
Member
|
Thank you! |
Contributor
Author
|
Thanks to you for the very prompt review!! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
lsp-request-while-no-inputsends an asynchronous request and then waitsfor the response inside a synchronous loop. The success and error callbacks
end with
(throw 'lsp-done '_)to break out of that loop. If the responsearrives after the function has already given up and returned — for
example because the user typed during the wait, or the timeout fired —
there is no longer a
(catch 'lsp-done …)on the call stack to receivethe throw. The throw then escapes all the way to the top level and shows
up as a stray error in some unrelated piece of code.
This PR fixes the race by giving the two callbacks a flag in the function's
local scope. The cleanup code in
unwind-protectclears that flag beforeit asks for cancellation. Any callback that fires later sees the flag as
niland quietly does nothing instead of trying to throw to a destinationthat no longer exists.
No public API changes. The normal path is unchanged. The only difference
a user can observe is that a class of confusing stray errors goes away.
Background
Two pieces of context that the rest of the PR depends on.
catch/throwin Emacs Lispcatchandthroware Emacs Lisp's way of jumping out of a deeply nestedpiece of code without going through the normal return path. They are
not the same as
condition-case/signal, which are used for errorhandling. A
(throw 'TAG VAL)walks up the call stack looking for amatching
(catch 'TAG …)and jumps back to it, discarding all theintermediate function calls along the way.
The property that matters for this bug:
throwonly works if the matchingcatchis currently on the call stack. If no such catch is in progresswhen
throwruns, Emacs raises ano-catcherror in whatever codehappens to be running at that moment. There is no notion of a "future"
or "already-completed" catch — there is only "active right now" or
"nothing there."
What
lsp-request-while-no-inputdoesIt is a synchronous wrapper around an asynchronous request, with one extra
property: if the user types while it is waiting, the wait is given up and
the function returns. The function's shape:
Each callback stores the result in a local variable and then calls
(throw 'lsp-done '_).whileloop that callssit-for,wrapped in
(catch 'lsp-done …). The throw from a callback is whatbreaks out of the loop.
input-pending-pbecomes true (the usertyped something).
unwind-protectcleanup runs on exit and asks the server to cancelthe request by calling
lsp-cancel-request-by-token, in case noresponse arrived in time.
The bug
Cancellation is not guaranteed.
lsp-cancel-request-by-tokenonly asksthe server to cancel; it cannot stop a response that is already on its way
back. Any of the following sequences leaves the callbacks alive after the
function has stopped waiting:
typed. The server will reply regardless.
the server and put it in the queue of messages to be processed.
response gets back to us first.
When the response then arrives, lsp-mode's usual machinery invokes the
registered callback. The callback runs:
resp-resultis a leftover variable (the surroundinglethas alreadyended, so the
setfwrites to a binding that nothing reads any more —harmless). The
throwis the real problem: there is no(catch 'lsp-done …)on the call stack any more. Emacs raises a
no-catcherror in whateverpiece of code happens to be running when the callback fires — an idle
timer, an unrelated command, the next call into lsp-mode, or Emacs's main
event loop. The error usually surfaces as:
with a backtrace that has nothing to do with the original
lsp-request-while-no-inputcall, because the function it came from haslong since returned.
The race shows up in practice because
lsp-request-while-no-inputis ona high-traffic path: it is called from
lsp-completion.elandlsp-inline-completion.el, both of which drive auto-completion UIs likeCorfu — often once per keystroke. Fast typing means many overlapping
requests, each one setting up its own
(catch 'lsp-done …)and thentearing it down again. Sometimes a response arrives in the gap between
teardown and the next setup, and that is when the bug fires.
The fix
Add a flag in the function's local scope that both callbacks check before
throwing, and clear that flag from the
unwind-protectcleanup beforeasking for cancellation:
How it works:
catch-activeistand the callbacksbehave the same as before — they store the result and throw.
unwind-protectcleanup runs whenever the function exits, by anypath. The first thing it does is
(setq catch-active nil). From thatmoment on, any callback that fires sees the flag as
nil, skips thesetfand thethrow, and returns silently.setqon the local variable is enough because the callbacksrefer to the same lexical binding (lsp-mode is compiled with
lexical-binding: t).lsp-cancel-request-by-tokenis called, so even if cancellationsomehow runs a callback synchronously, the flag is already off and the
callback is harmless.
This is the smallest change that closes the race. It does not need to
coordinate with the response queue, and it does not change anything about
how cancellation works.
Why this is preferable to the obvious alternatives
A few alternatives one might reach for, and why they are judged to be worse:
catchinside the callbacks. This wouldadd unused catch blocks to other code paths and silently throw away the
response value; misleading and unnecessary.
lsp-cancel-request-by-tokensynchronous. Far more invasiveand changes how cancellation works elsewhere; the fact that cancellation
is asynchronous and not guaranteed is intentional.
lsp-mode's bookkeeping removes the entry when the response arrives, not
when the request is cancelled. Removing it from this call site would
mean reaching into private bookkeeping that other code expects to own.
(condition-case nil … (no-catch nil))to swallowthe error. This would hide the symptom but leave the real problem in
place: a callback that still thinks it has somewhere to throw to. It
would also lose the return value if the throw happened to land inside
the
condition-case.The flag is local, costs essentially nothing, and correctly captures the
rule: "the destination of the
throwonly exists between sending therequest and the function returning."
Compatibility
lsp-request-while-no-inputkeeps the samesignature, return value, and behaviour on the normal path.
while the function is still waiting (the common case),
catch-activeis still
twhen the callback runs and the throw goes through exactlyas before.
lexical-binding: t), so the local flag captured by the callbacksworks as written.
lsp--throw-on-inputis unaffected. The cleanup still throws'inputwhen the user has typed. Only the'lsp-donethrow comingfrom a late response callback is suppressed.
Reproducing the bug before the patch
The race depends on timing, but it is reliably triggered when the
function is called frequently. A practical recipe:
requests (for example
ltex-ls-plusagainst a moderately sized buffer,or any server that does real work for
textDocument/completion).lsp-completion-enableand use Corfu withcorfu-auto tand ashort
corfu-auto-delay(e.g. 0.1).lsp-request-while-no-inputcall; the next keystroke gives up on theprevious one before it has finished.
*Messages*for entries of the formno-catch: lsp-done '_.They appear from time to time, attributed to whatever command happens
to be running when the late response arrives.
With the patch applied, those messages stop appearing, and the completion
behaviour is otherwise unchanged.
Origin
This issue surfaced while debugging
lsp-ltex-plus(anlsp-modeclientfor the
ltex-ls-plusgrammar/spell server).ltex-ls-plusruns slowchecks alongside frequent server-initiated traffic, which made the bug
easy to trigger from interactive completion. To unblock users while a
proper fix is discussed upstream,
lsp-ltex-plushas been shipping thepatch as a local override of
lsp-request-while-no-inputsince releasev0.3.3(https://github.com/alberti42/emacs-ltex-plus/releases). Thisis mentioned only as context — the bug is in
lsp-moderegardless ofwhich client is connected, and the fix belongs upstream so other clients
benefit without each one having to ship its own override.
Related work
This is the second of three independent fixes to how
lsp-modehandlesmessages when there are many requests waiting for responses at the same
time. The other two are:
lsp--parser-on-messageclassify messages by theirmethodfield before theirid. Today, when a server-initiatedrequest happens to use an id that matches a pending client request,
it is mis-routed as a response. PR 1 fixes that.
lsp--create-filter-functionkeep processing the restof a batch even if one handler throws. Today, a single handler that
throws can abandon the other messages that were parsed in the same
batch. PR 3 catches errors around each message instead of around the
whole batch.
Each PR addresses a separate problem and is independently useful; they
are split apart to keep review focused. PRs 1 and 3 do not depend on this
one — applying any subset of the three is safe.