
Revert CQ shared store: Delete from index on remove or roll over (#13959)#16142

Merged
lhoguin merged 2 commits into rabbitmq:main from amazon-mq:lukebakken/cq-gc on Apr 20, 2026
Conversation

@lukebakken
Collaborator

Summary

Reverts 0278980 ("CQ shared store: Delete from index on remove or roll over", PR #13959), which introduced a regression in the classic queue message store GC. Three independent improvements from that commit are retained.

Fixes #16141.

Problem

PR #13959 replaced the scan_and_vacuum_message_file call in delete_file with an eager index cleanup mechanism (current_file_removes). As a side effect, messages removed from non-current files now produce not_found index lookups during scan_and_vacuum_message_file instead of previously_valid ones. This was noted in the PR review by @gomoripeti but not addressed before merge.

Under high throughput with many queues, the byte-by-byte scan_next_byte scanning mode triggered by not_found entries causes GC compaction to fall far enough behind the publish rate that disk usage can grow without bound. The stall also causes consumer latency spikes and broker unresponsiveness on established TCP connections.
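
The cost difference can be sketched as follows. This is an illustrative Erlang fragment, not the actual rabbit_msg_store code; the helper functions and message layout are assumptions made for the sketch:

```erlang
%% Illustrative sketch of why a not_found index lookup is expensive
%% during compaction. When the index still knows a removed message's
%% size (previously_valid), the scanner can jump over the whole entry
%% in one step; when the entry was eagerly deleted (not_found), it must
%% fall back to advancing one byte at a time to find the next valid
%% message boundary.
scan_file(_Fd, Offset, EndOffset, _Index) when Offset >= EndOffset ->
    ok;
scan_file(Fd, Offset, EndOffset, Index) ->
    MsgId = read_msg_id_at(Fd, Offset),        %% hypothetical helper
    case index_lookup(MsgId, Index) of         %% hypothetical helper
        {found, Size} ->
            %% Live message: skip its full size.
            scan_file(Fd, Offset + Size, EndOffset, Index);
        {previously_valid, Size} ->
            %% Removed, but the size is still in the index: skip in one step.
            scan_file(Fd, Offset + Size, EndOffset, Index);
        not_found ->
            %% Size unknown: degrade to byte-by-byte scanning
            %% (the scan_next_byte mode described above).
            scan_file(Fd, Offset + 1, EndOffset, Index)
    end.
```

With large messages (120 KB in the reproduction below), the not_found path does on the order of 10^5 more scan steps per removed message than the size-skip path, which is consistent with GC falling behind the publish rate.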

Reproduction: 100 classic queues at 500 msg/s (120 KB messages) with a slow-ack consumer queue in the same vhost (acks held 1-30 min, up to 1000 in flight). On an m7g.large with a 196 GB EBS volume, free disk space fell from 185.4 GB to ~169 GB in ~100 minutes. Consumer latency reached a median of 1.5 s and a maximum of 568 s.

Reproduction scripts: https://github.com/lukebakken/rmq-gc-lag

Changes

Two commits:

  1. Revert 0278980 in full.
  2. Restore three independent improvements from that commit that are unrelated to the broken current_file_removes mechanism:
    • Relax index_update_fields assertion (true= to _=) so a missing key does not crash the process
    • Add prioritise_cast/3 to rabbit_msg_store_gc so delete requests are processed before compaction requests, avoiding unnecessary compaction of files already pending deletion
    • compact_file/2 early-exit guard (file already deleted) was already present after the revert
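
The first two retained changes can be sketched roughly as follows. This is illustrative Erlang; the actual message tuples, priority values, and table operations in rabbitmq-server may differ:

```erlang
%% In rabbit_msg_store_gc (a gen_server2): assign delete requests a
%% higher priority than compaction requests, so a file that is already
%% pending deletion is not needlessly compacted first. The message
%% shapes and the priority value 5 are assumptions for this sketch.
prioritise_cast({delete, _Files}, _Len, _State) -> 5;
prioritise_cast(_Msg, _Len, _State)             -> 0.

%% Relaxed index_update_fields assertion: ignore the result instead of
%% matching on true, so a missing key no longer crashes the process
%% (assuming the index is backed by something like ets:update_element/3,
%% which returns false when the key is absent).
%% Before: true = ets:update_element(Tab, Key, Updates)
%% After:  _    = ets:update_element(Tab, Key, Updates)
```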

Testing

Ran the reproduction workload against this branch for 60 minutes (three consecutive 20-minute monitoring windows) at 500 msg/s with ~1000 unacked messages. Disk held stable in a 0.5 GB oscillation band (184.96-185.47 GB). Ready messages held at 0 throughout. No latency spikes.

For comparison, the same workload against unpatched main lost ~16 GB of disk in ~100 minutes with ready messages growing to 3500-4200.

@michaelklishin
Collaborator

@lukebakken Loïc will be back next week. We will have a day or two to decide before the final 4.3.0 cut-off/freeze.

@lukebakken
Collaborator Author

Well this is something that shouldn't be rushed!

Contributor

@lhoguin lhoguin left a comment


Perhaps I went a bit too far on this. Deletion was too expensive, so the PR fixed that, at the cost of making some compactions more expensive. A plain revert would mean we'd just go back to that state, but since we keep some of the changes, I'm OK with merging this now and seeing how it fares overall in the wild.

@lhoguin
Contributor

lhoguin commented Apr 20, 2026

In particular, perhaps the prioritisation of deletes is good enough to avoid the bigger issues deletes were causing.

@lhoguin lhoguin merged commit 3bffa63 into rabbitmq:main Apr 20, 2026
189 checks passed
@lhoguin
Contributor

lhoguin commented Apr 20, 2026

At some point we should rework the file format to avoid these issues entirely. We need the file itself to contain information about messages existing, or holes existing, and what size they are. The in-memory index should only ever refer to existing messages and act more like a cache over what is in the files, to be used by readers.

@michaelklishin
Collaborator

@lukebakken so that you don't have to guess: the earliest this can ship now is 4.3.1.

@michaelklishin michaelklishin added the backport-pending label ("Use with PRs that are yet to be backported for any reason") on Apr 20, 2026
@michaelklishin
Collaborator

michaelklishin commented Apr 20, 2026

@Mergifyio backport v4.3.x

(the PR will be marked as delayed for 4.3.1)

@mergify

mergify bot commented Apr 20, 2026

backport v4.3.x

✅ Backports have been created


@lukebakken
Collaborator Author

Thanks @lhoguin


Labels

backport-pending (Use with PRs that are yet to be backported for any reason), bug, effort-high, performance


Development

Successfully merging this pull request may close these issues.

Classic queue message store GC cannot keep pace under high throughput after 0278980ba0

3 participants