Search API cache: memory sizing strategy for large Text objects #9

## Background

The Search API caches `Text` and `AutoComplete` objects in `IMemoryCache`. Text objects can be very large — a 100k-word book is roughly 25–40 MB in memory (Word dictionary + NormalisedFullText/RawFullText strings). The working assumption is few concurrent users, so a handful of large objects can sit in memory comfortably.

## Current implementation

`TextCache` uses `IMemoryCache` with:
- **Sliding expiration** (default 30 min, configurable)
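- **Absolute expiration cap** (default 4 h), which stops Large Object Heap (LOH) objects living indefinitely and forces periodic release and reload
- **Entry-count limit** (`CacheMaxEntries`, default 20) with `Size = 1` per entry, so `SizeLimit` acts as an entry count. Eviction when full is approximately LRU: `MemoryCache` compaction removes expired entries first, then lower-priority entries, then the least recently used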
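
The three settings above can be sketched roughly as follows. This is a minimal illustration, not the actual `TextCache` code: the variable names, the cache key, and the string payload are placeholders, and only the default values come from this issue.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical locals mirroring the defaults described in this issue.
var slidingExpiration = TimeSpan.FromMinutes(30);
var absoluteExpiration = TimeSpan.FromHours(4);
const int cacheMaxEntries = 20;

// SizeLimit is unitless; with Size = 1 on every entry it behaves as an entry-count cap.
using var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = cacheMaxEntries });

var entryOptions = new MemoryCacheEntryOptions
{
    SlidingExpiration = slidingExpiration,
    AbsoluteExpirationRelativeToNow = absoluteExpiration,
    Size = 1 // every Text counts as one unit, whatever its real footprint
};

// "text:example-id" and the string value stand in for a real key and Text object.
cache.Set("text:example-id", "placeholder Text payload", entryOptions);

bool found = cache.TryGetValue("text:example-id", out var cached);
Console.WriteLine($"Cached: {found}, entries: {cache.Count}");
```

Once a size limit is set, `MemoryCache` requires every entry to declare a `Size`, which is why the constant `1` appears on the entry options.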

Entry count was chosen over word-count/byte-count sizing because a size-proportional approach has a fatal flaw: a `Text` object larger than `SizeLimit` words can never be admitted to the cache at all, silently degrading to uncached storage reads on every request.
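
The admission problem is easy to reproduce in isolation. The sketch below assumes a hypothetical word-count budget of 50,000 and a 100,000-word book; neither number comes from the real configuration.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical budget: SizeLimit expressed in words rather than entries.
using var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = 50_000 });

// A 100k-word book sized by word count is larger than the whole budget.
cache.Set("text:large-book", "placeholder Text payload",
    new MemoryCacheEntryOptions { Size = 100_000 });

// Set does not throw; the entry is simply never stored.
bool admitted = cache.TryGetValue("text:large-book", out _);
Console.WriteLine($"Admitted: {admitted}");
```

Because `Set` reports nothing, the only visible symptom would be a cache miss on every request for that text.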

See `instructions/cache-usage.md` for full notes on LOH behaviour and ECS considerations.

## Open questions / future considerations

**Is entry-count sizing too coarse?** 20 small texts don't need eviction; 20 large ones might. `IMemoryCache` doesn't provide truly dynamic, memory-aware eviction without custom code — the options would be:
- Keep entry count (current) — simple, every object cacheable, tune `CacheMaxEntries` per deployment
- Byte/word count with a per-entry cap so no single object exceeds `SizeLimit` — more proportional but more complex and still somewhat arbitrary
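- A background `IHostedService` watching `Process.GetCurrentProcess().WorkingSet64` and calling `MemoryCache.Compact(percentage)` under pressure (`Compact` is defined on the concrete `MemoryCache`, not on the `IMemoryCache` interface): genuinely dynamic but significant complexity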
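
For a sense of what the third option involves, here is a rough sketch of such a service. The class name, threshold, check interval, and compaction fraction are all hypothetical, and it assumes the concrete `MemoryCache` is resolvable from DI; none of this exists in the codebase.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Hosting;

// Hypothetical service: all constants below are placeholders to tune per deployment.
public sealed class CachePressureMonitor : BackgroundService
{
    private const long WorkingSetThresholdBytes = 1_500L * 1024 * 1024;
    private const double CompactFraction = 0.25; // ask the cache to drop about 25%
    private static readonly TimeSpan CheckInterval = TimeSpan.FromSeconds(30);

    private readonly MemoryCache _cache;

    // Takes the concrete type because Compact is not part of IMemoryCache.
    public CachePressureMonitor(MemoryCache cache) => _cache = cache;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(CheckInterval);
        using var process = Process.GetCurrentProcess();
        try
        {
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                process.Refresh(); // property values are a snapshot until refreshed
                if (process.WorkingSet64 > WorkingSetThresholdBytes)
                {
                    _cache.Compact(CompactFraction);
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path: the host cancelled stoppingToken.
        }
    }
}
```

Even this small version adds a threshold that must be chosen per container size, which is part of the complexity noted above.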

**ECS-specific:** Tasks do not share a cache; each ECS task warms its own independently. If cache hit rate matters under horizontal scaling, consider ALB sticky sessions (target group stickiness). Note that ALB stickiness is cookie-based per client, so it keeps a user on one task rather than routing by book ID. Monitor actual container memory via CloudWatch Container Insights, and size task memory allocation from observed usage, not estimates.

## Current stance

Leave as-is. The Wellcome Collection equivalent has run in production for years without memory issues. The absolute expiration + configurable entry count is sufficient. `CacheMaxEntries` is a one-line config change if real usage data shows it needs tuning.

/cc for future revisit when deploying to ECS or if memory pressure is observed.
