Add manifest cache to prevent memory explosion on large builds #319
jornmineur wants to merge 5 commits into 11ty:main from …
Conversation
- Replace mtime+size check with content hash for reliable CI cache restoration
- Add `urlFormat` to `#canSkipBuffer()` exclusions
- Style cleanup: use `let` instead of `const` per project conventions
Conversation: #315
Test sequence number was missing
Is there anything I need to do to enable the creation of the manifest with this branch? I replaced my v5 dependency, but the only behaviour I see is the increased memory usage I mentioned elsewhere, even on repeated builds, and I don’t see an …
…iles – should be removed as a condition in #canSkipBuffer.
Not at all, you encountered a bug that I had missed! 😄

Based on what you describe, the manifest process was completely bypassed. The most likely reason is that your build config uses the default output directory, which of course is perfectly valid. The problem is that the code was checking for the presence of an explicitly defined output directory.

If you're indeed using the default output directory (in other words, haven't defined a specific/custom output directory), then the latest commit should solve the problem.

Thanks Aankhen, your input is super helpful!

(*) The idea for the …: turns out it didn't help at all, and even backfired when the default …
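The bug described above can be sketched in miniature (option names here are hypothetical, not the PR's actual code): gating the manifest on an explicitly configured output directory silently bypasses it for builds that rely on the default.

```javascript
// Buggy: evaluates to false when the user keeps the default output directory,
// so the manifest path is never taken even though the config is valid.
function shouldUseManifest(options) {
  return Boolean(options.outputDir);
}

// Fixed: resolve the default directory instead of bailing out.
function shouldUseManifestFixed(options, defaultDir = "img/") {
  let outputDir = options.outputDir ?? defaultDir;
  return outputDir.length > 0;
}

console.log(shouldUseManifest({}));      // false: manifest bypassed
console.log(shouldUseManifestFixed({})); // true
```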
I’m glad to hear my blog’s idiosyncratic build process is helping. I owe you my thanks: all this experimentation helped me finally find the source of the runaway memory usage. I (re-)discovered that, for historical reasons, I wasn’t passing file paths but file contents to ….

Now I could compare your branch to v6.0.4 in a slightly more useful manner. I ran each of them (in dev mode so that the server would remain) in separate Docker containers, recording Eleventy’s reported times and using a handy Python script to plot memory usage via Gnuplot.

First, here’s a cold build (no existing generated images nor manifest): v6’s memory usage climbed inexorably towards 12 GB (presumably because of sharp’s allocations, since Node isn’t exhausting its heap), and the build itself also took 1,360 seconds. In comparison, your branch took 1,197 seconds and memory usage leveled out at 2 GB.

I next let it build a couple of times with the manifest, then measured the third run (83 seconds with your branch, 90 with v6): neither version used nearly as much memory as before, but v6 still used around 300+ MB more than your version.

Finally, I started a Bash terminal for each and let it build, serve, settle, and exit (Ctrl+C) thrice: you can see that, in each instance, your branch finished the build more quickly (the difference is around the same as above) and used 30–60% less memory.

All of which is to say: this is great work, and I think these benchmarks clearly demonstrate the improvement! Thanks for putting the PR together.
These benchmarks are so valuable and insightful! Great to see the improvements. On my end I'm seeing even more dramatic gains, but that's likely because the site has many category pages sharing the same images — lots of manifest cache hits. Really appreciate your help with this! 🙏
Thank you for your work. And yes, that makes sense about reusing the images in your case. By the way, I apologize for the misspellings in the legends! I’ve updated the images so they have correctly-labeled lines.
The results are amazing! And @Aankhen's test methodology is also really nice! I'm just asking out of curiosity: when an image is processed and saved to disk, it's stored as an entry in the manifest cache, right? And when the image is processed the next time, a manifest-cache lookup is made and the already-processed file is returned. But since we have …
The latest commit now uses …. This gives us one source of truth, with proper cache-counter tracking as a bonus. I tested from a cold start and from a full build, and both worked fine, with no noticeable difference in build time, memory, or CPU load.
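The "one source of truth" idea can be sketched as follows (class and counter names are illustrative, not the actual implementation): when every lookup is routed through a single cache object, hit/miss counters stay accurate by construction.

```javascript
// All reads go through get(), so the counters cannot drift out of sync
// with actual cache behavior.
class ManifestCache {
  constructor() {
    this.entries = new Map();
    this.hits = 0;
    this.misses = 0;
  }
  get(key) {
    if (this.entries.has(key)) {
      this.hits++;
      return this.entries.get(key);
    }
    this.misses++;
    return undefined;
  }
  set(key, value) {
    this.entries.set(key, value);
  }
}

let cache = new ManifestCache();
cache.set("./img/a.jpg::hash1", { width: 400 });
cache.get("./img/a.jpg::hash1"); // hit
cache.get("./img/b.jpg::hash2"); // miss
console.log(cache.hits, cache.misses); // 1 1
```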
Many thanks for fixing this, @jornmineur. I've got a big Eleventy site with 1,200 images and 13,000 pages, and your fix halves its build time (with no memory explosion causing swap writes).
PR Summary
This PR addresses issue #302.
An earlier PR which attempted to solve the problem in a different way turned out to be a dead end, hence this new PR.
Problem summary
When processing large numbers of images, memory usage explodes because:
- Buffers are stored in `memCache` and never released

Solution
Two-part fix:
1. Persistent manifest cache (`src/manifest-cache.js`), stored at `.cache/eleventy-img-manifest.json`
2. No buffer loading for production local files (in `queue()`)

When the optimization applies
The new code path is used when ALL of these are true:
- Not in `dryRun` mode
- Not in `statsOnly` mode
- Not using `transformOnRequest`

Otherwise, it falls back to the existing behavior.
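A hypothetical sketch of the gating predicate: the buffer-skipping path is taken only when none of the modes that still need in-memory buffers are active. Option names mirror the list above; per the commit list, the real `#canSkipBuffer()` also excludes `urlFormat`, among other conditions.

```javascript
// Returns true only when every mode that requires a loaded buffer is off.
function canSkipBuffer(options) {
  return !options.dryRun &&            // dryRun keeps results in memory, nothing is written
         !options.statsOnly &&         // statsOnly bypasses normal output handling
         !options.transformOnRequest;  // on-request transforms need the buffer later
}

console.log(canSkipBuffer({ dryRun: false, statsOnly: false, transformOnRequest: false })); // true
console.log(canSkipBuffer({ dryRun: true, statsOnly: false, transformOnRequest: false }));  // false
```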
Files changed
- `src/manifest-cache.js` — persistent cache implementation
- `src/image.js`:
  - `ManifestCache` import and singleton instance
  - `queue()` updated to check the manifest first and skip buffer loading
  - `getHash()` updated to not store file contents in `#contents`
  - `#canSkipBuffer()`, `#outputFilesExist()`, `hasLoadedBuffer` helper methods
- `test/test.js` — added test to verify buffers aren't loaded

Manifest structure
```json
{
  "./src/images/hero.jpg::abc123hash": {
    "mtime": 1705123456789,
    "size": 245678,
    "stats": {
      "avif": [{ "format": "avif", "width": 400, "url": "..." }],
      "jpeg": [{ "format": "jpeg", "width": 400, "url": "..." }]
    }
  }
}
```

The key is `sourcePath::optionsHash`. Buffers are stripped before storing.

Graceful degradation
Performance impact
- Cached builds are cheap (`statSync` + manifest lookup)