Background
The Search API caches Text and AutoComplete objects in IMemoryCache. Text objects can be very large — a 100k-word book is roughly 25–40 MB in memory (Word dictionary + NormalisedFullText/RawFullText strings). The working assumption is few concurrent users, so a handful of large objects can sit in memory comfortably.
Current implementation
TextCache uses IMemoryCache with:
- Sliding expiration (default 30 min, configurable)
- Absolute expiration cap (default 4 h) — prevents LOH objects living indefinitely; forces periodic release and re-load
- Entry-count limit (CacheMaxEntries, default 20) with Size = 1 per entry; LRU eviction when full
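As a rough illustration of the three settings above, the sketch below shows one way they could map onto MemoryCacheOptions and MemoryCacheEntryOptions. It is not the actual TextCache code. The TextCacheSettings type and its SlidingMinutes and AbsoluteHours properties are hypothetical names invented for the example. Only CacheMaxEntries appears in this issue.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical settings type: only CacheMaxEntries is named in this issue,
// the other two property names are assumptions for illustration.
public sealed record TextCacheSettings(
    int CacheMaxEntries = 20,
    int SlidingMinutes = 30,
    int AbsoluteHours = 4);

public static class TextCacheSketch
{
    // SizeLimit is unitless; with Size = 1 per entry it acts as an entry count.
    public static MemoryCache CreateCache(TextCacheSettings s) =>
        new MemoryCache(new MemoryCacheOptions { SizeLimit = s.CacheMaxEntries });

    public static MemoryCacheEntryOptions EntryOptions(TextCacheSettings s) =>
        new MemoryCacheEntryOptions
        {
            // Reset on every read, so hot texts stay cached...
            SlidingExpiration = TimeSpan.FromMinutes(s.SlidingMinutes),
            // ...but never longer than the absolute cap.
            AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(s.AbsoluteHours),
            // Every entry costs 1 unit regardless of how large the text is.
            Size = 1
        };
}
```

A caller would then store a text with something like cache.Set(key, text, TextCacheSketch.EntryOptions(settings)). Because every entry has Size = 1, any single text can be admitted however large it is.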
Entry count was chosen over word-count/byte-count sizing because a size-proportional approach has a fatal flaw: a Text object larger than SizeLimit words can never be admitted to the cache at all, silently degrading to uncached storage reads on every request.
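The admission problem can be reproduced in isolation. The following is a minimal sketch, assuming word-count sizing with an arbitrary SizeLimit of 100,000. The keys, values and sizes are made up for the demonstration.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class OversizedEntryDemo
{
    // Returns whether an entry whose Size exceeds SizeLimit is readable after Set.
    public static bool OversizedEntryIsCached()
    {
        // Word-count sizing: the whole cache may hold at most 100,000 "words".
        using var cache = new MemoryCache(
            new MemoryCacheOptions { SizeLimit = 100_000 });

        // A 150,000-word text is larger than the limit on its own.
        // Set does not throw; the entry is simply not stored.
        cache.Set("huge-book", "placeholder text",
            new MemoryCacheEntryOptions { Size = 150_000 });

        return cache.TryGetValue("huge-book", out _);
    }

    public static void Main() =>
        Console.WriteLine(OversizedEntryIsCached()); // prints "False"
}
```

Since Set neither throws nor logs, the only symptom is that every request for that text goes back to storage.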
See instructions/cache-usage.md for full notes on LOH behaviour and ECS considerations.
Open questions / future considerations
Is entry-count sizing too coarse? 20 small texts don't need eviction; 20 large ones might. IMemoryCache doesn't provide truly dynamic, memory-aware eviction without custom code — the options would be:
- Keep entry count (current): simple, every object cacheable, tune CacheMaxEntries per deployment
- Byte/word count with a per-entry cap so no single object exceeds SizeLimit: more proportional but more complex and still somewhat arbitrary
- A background IHostedService watching Process.WorkingSet64 and calling memoryCache.Compact(percentage) under pressure: genuinely dynamic but significant complexity
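To give a feel for the complexity of the third option, here is a minimal sketch of such a monitor, assuming .NET 6 or later (for PeriodicTimer). The class name, threshold, poll interval and compaction fraction are all hypothetical values chosen for illustration. Compact is defined on the concrete MemoryCache class rather than on the IMemoryCache interface, so the sketch assumes the concrete type is registered in DI.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Hosting;

public sealed class CachePressureMonitor : BackgroundService
{
    // Assumed values; in practice these would come from configuration.
    private const long WorkingSetThresholdBytes = 1_500L * 1024 * 1024;
    private const double CompactFraction = 0.25; // evict about 25% of cache size
    private static readonly TimeSpan PollInterval = TimeSpan.FromSeconds(30);

    private readonly MemoryCache _cache;

    // Takes the concrete MemoryCache because IMemoryCache has no Compact method.
    public CachePressureMonitor(MemoryCache cache) => _cache = cache;

    public static bool IsUnderPressure(long workingSetBytes, long thresholdBytes) =>
        workingSetBytes > thresholdBytes;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(PollInterval);
        try
        {
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                // A fresh Process instance gives a current working-set reading.
                using var process = Process.GetCurrentProcess();
                if (IsUnderPressure(process.WorkingSet64, WorkingSetThresholdBytes))
                {
                    _cache.Compact(CompactFraction);
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown: the host cancelled the stopping token.
        }
    }
}
```

Compact only drops the cache's references. The working set shrinks only once the GC has collected the evicted objects, so the threshold and poll interval would need tuning against observed behaviour.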
ECS-specific: No cache sharing between tasks; each ECS task warms its own cache independently. Consider ALB sticky sessions (target group stickiness) if cache hit rate matters under horizontal scaling. Note that ALB stickiness is cookie-based and pins a client to a target rather than routing by book ID, so it only helps when the same client keeps requesting the same book. Monitor actual container memory via CloudWatch Container Insights and size task memory allocation based on observed usage, not estimates.
Current stance
Leave as-is. The Wellcome Collection equivalent has run in production for years without memory issues. The absolute expiration + configurable entry count is sufficient. CacheMaxEntries is a one-line config change if real usage data shows it needs tuning.
/cc for future revisit when deploying to ECS or if memory pressure is observed.