Search API cache: memory sizing strategy for large Text objects #9

Description

@tomcrane

Background

The Search API caches Text and AutoComplete objects in IMemoryCache. Text objects can be very large — a 100k-word book is roughly 25–40 MB in memory (Word dictionary + NormalisedFullText/RawFullText strings). The working assumption is few concurrent users, so a handful of large objects can sit in memory comfortably.

Current implementation

TextCache uses IMemoryCache with:

  • Sliding expiration (default 30 min, configurable)
  • Absolute expiration cap (default 4 h) — prevents LOH objects living indefinitely; forces periodic release and re-load
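  • Entry-count limit (CacheMaxEntries, default 20) with Size = 1 per entry; when the limit is reached the cache compacts, removing expired entries first and then least-recently-used entries of equal priority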
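
As a rough illustration of the three settings above, here is a minimal sketch of how they map onto `MemoryCacheOptions` and `MemoryCacheEntryOptions`. It is not the actual TextCache code: the class name `TextCacheSketch` and its members are hypothetical, and only `CacheMaxEntries` and the default values come from this issue.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical helper; names other than CacheMaxEntries are assumptions for illustration.
public static class TextCacheSketch
{
    // Defaults as stated in this issue.
    public const int CacheMaxEntries = 20;
    public static readonly TimeSpan SlidingExpiration = TimeSpan.FromMinutes(30);
    public static readonly TimeSpan AbsoluteExpiration = TimeSpan.FromHours(4);

    // SizeLimit is unitless; with Size = 1 per entry it acts as an entry count.
    public static MemoryCache CreateCache() =>
        new MemoryCache(new MemoryCacheOptions { SizeLimit = CacheMaxEntries });

    public static MemoryCacheEntryOptions CreateEntryOptions() =>
        new MemoryCacheEntryOptions
        {
            // Every Text object costs one unit, however large it is.
            Size = 1,
            // Reset on each access.
            SlidingExpiration = SlidingExpiration,
            // Hard cap measured from insertion, regardless of access.
            AbsoluteExpirationRelativeToNow = AbsoluteExpiration,
        };
}
```

A caller would then do something like `cache.Set(key, text, TextCacheSketch.CreateEntryOptions())`. Once a `SizeLimit` is set, every entry must declare a `Size`, otherwise `Set` throws.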

Entry count was chosen over word-count/byte-count sizing because a size-proportional approach has a fatal flaw: a Text object larger than SizeLimit words can never be admitted to the cache at all, silently degrading to uncached storage reads on every request.
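
The admission problem can be reproduced in isolation. The sketch below is a standalone demo, not code from this repository: `OversizedEntryDemo` and `IsAdmitted` are hypothetical names, and the sizes stand in for word counts.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical demo class; not part of the Search API.
public static class OversizedEntryDemo
{
    // Returns true if an entry of the given size is retrievable after Set.
    public static bool IsAdmitted(long sizeLimit, long entrySize)
    {
        using var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = sizeLimit });

        // Set does not throw when the entry would exceed SizeLimit;
        // the entry is simply not stored.
        cache.Set("text", "placeholder", new MemoryCacheEntryOptions { Size = entrySize });

        return cache.TryGetValue("text", out _);
    }
}
```

With a limit of 100,000 units, `IsAdmitted(100_000, 80_000)` should return true and `IsAdmitted(100_000, 150_000)` should return false. Nothing is logged or thrown in the second case, which is why the degradation would be silent.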

See instructions/cache-usage.md for full notes on LOH behaviour and ECS considerations.

Open questions / future considerations

Is entry-count sizing too coarse? 20 small texts don't need eviction; 20 large ones might. IMemoryCache doesn't provide truly dynamic, memory-aware eviction without custom code — the options would be:

  • Keep entry count (current) — simple, every object cacheable, tune CacheMaxEntries per deployment
  • Byte/word count with a per-entry cap so no single object exceeds SizeLimit — more proportional but more complex and still somewhat arbitrary
  • A background IHostedService watching Process.WorkingSet64 and calling memoryCache.Compact(percentage) under pressure — genuinely dynamic but significant complexity
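
To give a sense of what the third option involves, here is a minimal sketch of such a background service. It is an assumption-laden outline, not a proposal for the final design: `CachePressureMonitor`, its constructor parameters, and `ShouldCompact` are all hypothetical. Note that `Compact` is defined on the concrete `MemoryCache` class, not on the `IMemoryCache` interface, so the concrete type would have to be registered or the interface cast.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Hosting;

// Hypothetical service; thresholds and names are illustrative assumptions.
public sealed class CachePressureMonitor : BackgroundService
{
    private readonly MemoryCache _cache;
    private readonly long _thresholdBytes;
    private readonly double _compactFraction; // 0.0 to 1.0, as MemoryCache.Compact expects
    private readonly TimeSpan _interval;

    public CachePressureMonitor(
        MemoryCache cache, long thresholdBytes, double compactFraction, TimeSpan interval)
    {
        _cache = cache;
        _thresholdBytes = thresholdBytes;
        _compactFraction = compactFraction;
        _interval = interval;
    }

    // Pure policy check, kept separate so it can be tested without a host.
    public static bool ShouldCompact(long workingSetBytes, long thresholdBytes) =>
        workingSetBytes > thresholdBytes;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(_interval);
        try
        {
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                // A fresh Process instance gives a current WorkingSet64 snapshot.
                using var process = Process.GetCurrentProcess();
                if (ShouldCompact(process.WorkingSet64, _thresholdBytes))
                {
                    // Removes roughly this fraction of entries from the cache.
                    _cache.Compact(_compactFraction);
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Host is shutting down; nothing to clean up.
        }
    }
}
```

Even in this reduced form the sketch leaves open questions: the threshold has to be chosen per deployment, and compacting only drops references, so the working set may not fall until the GC reclaims the large objects.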

ECS-specific: the cache is in-process, so nothing is shared between tasks; each ECS task warms its own cache independently. If cache hit rate matters under horizontal scaling, consider ALB sticky sessions. Note that target group stickiness is cookie-based and pins a client, not a book ID, to a target, so routing by book ID would need something beyond stickiness alone. Monitor actual container memory via CloudWatch Container Insights and size the task memory allocation from observed usage, not estimates.

Current stance

Leave as-is. The Wellcome Collection equivalent has run in production for years without memory issues. The absolute expiration + configurable entry count is sufficient. CacheMaxEntries is a one-line config change if real usage data shows it needs tuning.

/cc for future revisit when deploying to ECS or if memory pressure is observed.
