Fix SVN changelog watermark skipping commits #114
Merged
The SVN change-detection tracked progress with a single watermark (svn_revision_plugin) that markChangedFromSVNLog advanced to currentRev (the repo HEAD scraped from the HTML directory listing) regardless of how far the changelog REPORT actually scanned. The listing and the DAV REPORT are served from different load-balanced mirrors with replication lag, so the REPORT routinely didn't include the newest revisions the listing's HEAD already reflected. Those in-between revisions were never returned and never revisited, because the next run started at currentRev+1. Any plugin whose only changed revision fell in a skipped gap was stranded at its old version until its next release happened to land in a scanned range: a steady ~5% of the corpus.

Advance the watermark only to the highest revision the REPORT actually returned (maxRev), never to the listing HEAD. When the REPORT replica lags, the cursor stops where the REPORT stopped and the gap is rescanned next run instead of being skipped.

Scan in bounded 10k-revision chunks so a large catch-up after downtime can't exceed the server's log-item limit and silently truncate, and persist progress on mid-scan errors so runs are resumable.

This stops new strandings; existing stale packages are recovered separately with update --force.

Ref https://discourse.roots.io/t/wp-packages-didnt-update-a-plugin-already-updated-on-wpackagist/30378

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
A plugin (colissimo-shipping-methods-for-woocommerce) was stuck at 2.9.0 while wp.org had 2.10.0. It wasn't a one-off: sampling the March-frozen corpus against the live wp.org API found ~5% of plugins serving a stale version.
Root cause
SVN change-detection tracks progress with a single watermark, svn_revision_plugin. Each discover run reads HEAD from the HTML directory listing (ParseSVNListing), fetches the changelog for lastRev+1..HEAD via a separate DAV REPORT, then advances the watermark to that HEAD unconditionally.

The listing and the REPORT come from different load-balanced SVN mirrors with replication lag, so the REPORT frequently doesn't include the newest revisions the listing's HEAD already reflects. The code marked the slugs it got and jumped the watermark to HEAD anyway, so revisions in the gap were never scanned and never revisited: the next run started past them. Any package whose only changed revision landed in a skipped gap was stranded until its next release happened to fall in a scanned range. Nothing reconciled it: the HTML listing carries no per-dir dates, so the shell-upsert path couldn't catch it either.
A fetch error was always safe (it returned before persisting the watermark, so the next run retried). Only a successful-but-incomplete REPORT lost data.
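To make the failure mode concrete, here is a minimal Go simulation. It is an illustrative sketch, not the project's code: reportRevs, advanceToHead, advanceToMaxRev, run, and simulate are hypothetical names, and replicaHead is an invented parameter that models how far the REPORT mirror has replicated. It contrasts jumping the watermark to the listing HEAD with advancing only to the highest revision the REPORT returned.

```go
package main

import "fmt"

// reportRevs stands in for the DAV REPORT: it returns the revisions in
// [from, to] that a replica replicated up to replicaHead knows about.
func reportRevs(revs []int, from, to, replicaHead int) []int {
	var out []int
	for _, r := range revs {
		if r >= from && r <= to && r <= replicaHead {
			out = append(out, r)
		}
	}
	return out
}

// advanceToHead models the old rule: jump to the listing HEAD unconditionally.
func advanceToHead(lastRev, listingHead int, returned []int) int {
	return listingHead
}

// advanceToMaxRev moves the watermark only as far as the REPORT reached.
func advanceToMaxRev(lastRev, listingHead int, returned []int) int {
	maxRev := lastRev
	for _, r := range returned {
		if r > maxRev {
			maxRev = r
		}
	}
	return maxRev
}

// run describes one discover run: the HEAD the listing shows and how far the
// REPORT replica has replicated at that moment.
type run struct{ listingHead, replicaHead int }

// simulate replays the runs and reports which revisions were ever scanned.
func simulate(advance func(int, int, []int) int, start int, revs []int, runs []run) (seen []int, watermark int) {
	watermark = start
	for _, r := range runs {
		if r.listingHead <= watermark {
			continue // nothing new according to the listing
		}
		got := reportRevs(revs, watermark+1, r.listingHead, r.replicaHead)
		seen = append(seen, got...)
		watermark = advance(watermark, r.listingHead, got)
	}
	return seen, watermark
}

func main() {
	revs := []int{101, 102, 103, 104, 105}
	// Run 1: listing is at 105 but the REPORT replica lags at 102.
	// Run 2: the replica has caught up.
	runs := []run{{105, 102}, {105, 105}}

	seen, wm := simulate(advanceToHead, 100, revs, runs)
	fmt.Println("jump to HEAD:   ", seen, wm) // [101 102] 105

	seen, wm = simulate(advanceToMaxRev, 100, revs, runs)
	fmt.Println("advance to max: ", seen, wm) // [101 102 103 104 105] 105
}
```

Under the first rule, revisions 103 to 105 are never scanned because the second run sees no distance between the watermark and HEAD. Under the second rule the watermark stays at 102 after the lagging run, so the gap is picked up once the replica catches up.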
Fix
Advance the watermark only to the highest revision the REPORT actually returned (maxRev), never to the listing HEAD. When the REPORT replica lags, the cursor stops where the REPORT stopped and the gap is rescanned next run.
Recovery
This stops new strandings; it doesn't back-fill the existing stale set. Those were recovered separately with update --force --type plugin.

Ref https://discourse.roots.io/t/wp-packages-didnt-update-a-plugin-already-updated-on-wpackagist/30378
🤖 Generated with Claude Code