fix: classify JSON-RPC messages by method before id#5055
Merged
Conversation
Per JSON-RPC 2.0, the presence of a `method' field unambiguously identifies a message as either a request (with id) or a notification (without id). Responses are exclusively id-bearing messages without a method. The previous parser classified messages id-first (via `lsp--get-message-type'), which fails when a server-initiated request's id happens to collide with the id of a pending client request: the parser routes the server's request as a response, looks up the wrong handler, fires it with bogus arguments, and never answers the server. The actual request is silently dropped. The misclassification is amplified by `lsp:json-response-id', the only id-extraction path through `lsp--get-message-type'. It is designed for response-shaped messages and signals an error when given a request (where id is at the top level rather than nested inside `result' / `error'). The error is swallowed by the surrounding `with-demoted-errors', so the whole message disappears with no diagnostic. Symptom: the LTeX+ language server (and any server that initiates `window/workDoneProgress/create' or `workspace/configuration' requests shortly after a client request) hangs waiting indefinitely for an acknowledgment that lsp-mode never sends. Particularly visible with high-traffic clients (Corfu auto-completion at short delay). This commit: - Classifies each message by checking `method' first. If present, the message is a request (with id) or a notification (without id). Only fall back to id-based classification for actual responses. - Replaces `lsp:json-response-id' with a representation-agnostic helper that reads the top-level id directly, so request ids are extracted from request shapes correctly. - Reads `method' and `id' via the same helper, so routing works regardless of `lsp-use-plists'. The `with-demoted-errors' wrap is preserved to keep the original "a single bad message doesn't crash the parser" robustness; the kind-first reclassification removes the most common cause of demoted errors.
This was referenced May 3, 2026
jcs090218
approved these changes
May 10, 2026
Contributor
Author
|
Thanks !! This was the most important fix, without which https://github.com/alberti42/emacs-ltex-plus could not run at all. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
lsp--parser-on-messagecurrently decides what kind of JSON-RPC message it islooking at by examining the
idfield first. This is unsafe: a server-initiatedrequest and a client-initiated response can carry the same
idvalue, inwhich case the server's request is mis-routed as a response, fed to the wrong
handler, and never replied to. The server then waits forever for an answer it
will never receive.
This PR reorders the classification so that the presence of a
methodfield ischecked before the
id. That single change is enough to eliminate thecollision: by the JSON-RPC spec, only requests and notifications carry a
method, and only responses lack one — somethodis an unambiguousdiscriminator, while
idis not.No public APIs change. No dispatch ordering changes. The pcase arm order stays
identical. The only observable difference is that messages which were
previously mis-classified are now classified correctly.
Background: the three shapes of a JSON-RPC message
JSON-RPC 2.0 (spec) defines three
message shapes. They are distinguished by which fields are present:
methodidresult/errorThe key observation:
methodis present iff the message originates a newexchange (request or notification).
idis present in both requests andresponses — so it cannot, on its own, tell the two apart.
A correct dispatcher must therefore look at
methodfirst.The bug
lsp--parser-on-messagecurrently routes vialsp--get-message-type, whichchecks
idfirst. The decision tree is roughly:This works as long as the
idof a server-initiated request never coincideswith the
idof a pending client request. But nothing in the JSON-RPC specforbids that coincidence, and in practice it happens all the time.
A concrete collision
ltex-ls-plus(and other servers built onlsp4j) numbers itsserver-initiated requests with string ids:
"1","2","3", …Meanwhile, lsp-mode numbers its client-initiated requests with integer
ids:
1,2,3, …That sounds safe — different types, no clash — but on the wire both end up
serialized as the same digits, and lsp-mode normalizes string ids to integers
before lookup. So when the client has a pending request with id
4and theserver happens to send a request with id
"4", the dispatcher sees a messagewith
id = 4, finds an entry in its response-handler table, and dispatchesit as a response. The wrong handler runs; the server's actual request is
silently dropped.
What the user sees
The most common symptom is a server stuck "checking" forever. The server is
politely waiting for a reply (to
window/workDoneProgress/create,workspace/configuration, or similar) that the client has already consumedunder the wrong handler. No error appears anywhere — the message simply
disappears.
A reproducible setup:
ltex-ls-plusas the server andlsp-ltex-plusas the client.lsp-completion-enable tand a fast-typing client like Corfu with a shortcorfu-auto-delay.idcollision occurs and the server's nextwindow/workDoneProgress/createis swallowed. The progress UI never closes;the next
textDocument/publishDiagnosticsnever fires.A second, smaller bug amplifies the first one.
lsp--get-message-type'sid-extraction path goes through
lsp:json-response-id, which is shaped forresponse messages (where
idlives at the top level, alongsideresult/error). When it is handed a request shape, it can signal an error. Thaterror is swallowed by the surrounding
with-demoted-errors, so the message isnot just mis-routed — it is dropped entirely, with no log entry. From the
outside it looks like the server simply never sent the request.
The fix
Replace the id-first classification with a
method-first one. The newdecision tree is:
In code:
That is the substantive change. Everything else in the diff is plumbing to
support it cleanly:
A small
json-gethelper reads top-level fields without going through theresponse-shaped accessor. It works on both hash-table and plist
representations, so the parser remains correct under both
lsp-use-plistssettings.
The
idis normalised (string → integer) only on the response branches,where lsp-mode's handler table expects integer keys. Request and
notification handlers are passed the original
json-dataand don't need anormalised id.
The response and response-error branches now guard the handler lookup with
(when handler ...)instead ofcl-assert id. The assert was alreadyunreachable in practice (the old classifier guaranteed
idwas present),but removing it lets the new code degrade gracefully if a stale id ever
arrives — which is preferable to crashing inside
with-demoted-errorsandlosing the message silently. The pcase arm order is unchanged.
The surrounding
with-demoted-errorswrap is preserved: a malformed messageshould still not crash the parser. The new classification simply removes the
most common cause of demoted errors, so in steady state the wrap should
rarely trigger.
Compatibility
lsp--parser-on-messagekeeps the same signature andreturn contract.
batch ordering is the caller's responsibility and is untouched.
cl-labelsandcondare already used throughoutlsp-mode.
json-gethelper, so
lsp-use-plistsusers are unaffected.outside the client's range) see no behavioural change — the new
classification produces the same answer as the old one in the
non-colliding case.
Reproducing the bug before the patch
For maintainers who want to confirm: run lsp-mode against
ltex-ls-pluswithLSP-side completion enabled and watch the JSON-RPC log. Look for a message of
the form
{"jsonrpc":"2.0","id":"<N>","method":"window/workDoneProgress/create", ...}arriving from the server, where
<N>is an integer that also appears as theidof a recent client request. With the current code, no{"jsonrpc":"2.0","id":"<N>","result":null}reply is ever sent back. Withthis patch, the reply is sent.
A small auditing script is included in my working notes
(
docs/parse-ltex-stdin-emacs.py) that scans an LSP wire log and flags theoverlap. I can clean it up and contribute it separately if useful.
Origin
This issue surfaced while debugging
lsp-ltex-plus(anlsp-modeclient forthe
ltex-ls-plusgrammar/spell server).ltex-ls-plusinitiates frequentserver-side requests during interactive editing, which made the collision
easy to hit and easy to reproduce. (LSP 3.17 inherits JSON-RPC 2.0's
message shapes unchanged, so the fix applies to every LSP server, not just
this one.) To unblock users while a proper fix is
discussed upstream,
lsp-ltex-plushas been shipping the patch as a localoverride of
lsp--parser-on-messagesince releasev0.1.0(https://github.com/alberti42/emacs-ltex-plus/releases). It is mentioned
here only as context — the bug exists in
lsp-modeindependently of anyparticular client, and the fix belongs upstream so other clients benefit
without each having to ship its own override.
Related work
This PR is the first of three independent fixes to the same general area
(message dispatch under high request volume / interactive completion). The
other two will follow as separate PRs:
lsp-request-while-no-inputso a stale callback thatarrives after the synchronous wait has unwound does not throw to a tag that
no longer exists.
lsp--create-filter-functionresilient against a singlehandler that exits non-locally: catch per-message instead of per-batch, so
one throwing handler cannot abandon the rest of the parsed batch.
Each PR addresses a distinct invariant and is independently valuable; they
are split out to keep review focused.