[docs-infra] Memoize formatInlineTypeAsHast #1308

Draft
JCQuintas wants to merge 1 commit into master from jcquintas/fix-highlightTypes-memoization

@JCQuintas
Member

Summary

formatInlineTypeAsHast is called once per prop type during type extraction. Each call invokes the Oniguruma WASM tokenizer, and the repeated calls fragment the WASM heap until large consumer projects run out of memory (mui-x hit this with ~300 pages).

Adds a promise-cache keyed on unionPrintWidth + typeText. Results are deep-cloned on read so downstream HAST mutation stays safe. Exposes clearInlineTypeHastCache() for tests.

Adds 4 unit tests covering cache hit, mutation isolation, and key disambiguation.

Large component prop graphs reference a small number of shared nested
types thousands of times. Without memoization, each reference re-runs
transformHtmlCodeInline → parseSource → Oniguruma's WASM TextMate
tokenizer on the same input. Oniguruma runs in a fixed-size
WebAssembly linear memory buffer, and the repeated scratch allocations
fragment that buffer until it overruns with

    RuntimeError: memory access out of bounds
        at wasm://wasm/001ce0fe:wasm-function[218]:0x26b63
        at wasm://wasm/001ce0fe:wasm-function[202]:0x23386

(`001ce0fe` is the vscode-oniguruma wasm blob consumed via
@wooorm/starry-night in pipeline/parseSource.)

Instrumenting a mui-x DataGrid extraction showed:

  - 1947 total calls to formatInlineTypeAsHast
  - 5 unique input strings
  - ~389 calls per unique input on average, all but the first redundant
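
Counts like these can be gathered by tallying inputs at the call site. The sketch below is one way to do that; `CallTally` is a hypothetical helper for illustration and is not part of this PR.

```typescript
// Hypothetical instrumentation helper (not from the PR): tallies how often
// each distinct input string is seen.
class CallTally {
  private readonly counts = new Map<string, number>();

  // Record one call with the given input.
  record(input: string): void {
    this.counts.set(input, (this.counts.get(input) ?? 0) + 1);
  }

  // Total calls, number of distinct inputs, and mean calls per distinct input.
  summary(): { total: number; unique: number; averagePerInput: number } {
    let total = 0;
    this.counts.forEach((n) => {
      total += n;
    });
    const unique = this.counts.size;
    return { total, unique, averagePerInput: unique === 0 ? 0 : total / unique };
  }
}
```

Calling `record(typeText)` at the top of the formatter and logging `summary()` at the end of a run would be enough. With the figures above, 1947 total calls over 5 unique inputs works out to 389.4 calls per input.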

The duplication comes from two places both walking the type graph
independently:

  - buildHighlightedExports in highlightTypes.ts, which builds a
    `{ExportName.Props, .DataAttributes, .CssVariables}` map, each
    entry calling formatInlineTypeAsHast on a synthetic object type
    string.
  - highlightComponentTypeMeta in highlightTypesMeta.ts, which walks
    every prop and calls formatInlineTypeAsHast on its `typeText`,
    `shortType`, default, and detailedType fields.

For a large component with many props that share structural types
(GridColDef, GridRowModesModel, GridCallbackDetails, ...), this
cross-product hits the tokenizer thousands of times with identical
inputs.

Fix: wrap formatInlineTypeAsHast with a process-lifetime Map keyed on
`(unionPrintWidth, typeText)`. Return a structuredClone of the cached
HAST so downstream mutations can't poison the cache. Expose a
`clearInlineTypeHastCache` helper for test isolation.
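
A minimal sketch of the caching pattern described above, not the code in this PR. The `HastNode` shape, `formatUncached`, the parameter order, and the key format are all assumptions made for illustration; the real formatter's signature is not shown here.

```typescript
// Simplified stand-in for a HAST node (assumption; real types live in `hast`).
interface HastNode {
  type: string;
  value?: string;
  children?: HastNode[];
}

// Hypothetical stand-in for the expensive tokenizer-backed formatter.
let tokenizerCalls = 0;
async function formatUncached(typeText: string, unionPrintWidth: number): Promise<HastNode> {
  tokenizerCalls += 1;
  return {
    type: 'root',
    children: [{ type: 'text', value: `${unionPrintWidth}:${typeText}` }],
  };
}

// Process-lifetime cache. Storing the promise rather than the resolved value
// lets concurrent callers with the same key share one in-flight computation.
const inlineTypeCache = new Map<string, Promise<HastNode>>();

// The width is numeric, so the first ':' cleanly separates it from the text.
function cacheKey(typeText: string, unionPrintWidth: number): string {
  return `${unionPrintWidth}:${typeText}`;
}

async function formatInlineTypeMemoized(
  typeText: string,
  unionPrintWidth: number,
): Promise<HastNode> {
  const key = cacheKey(typeText, unionPrintWidth);
  let pending = inlineTypeCache.get(key);
  if (!pending) {
    pending = formatUncached(typeText, unionPrintWidth);
    inlineTypeCache.set(key, pending);
  }
  // Deep-clone on every read so a caller mutating its copy cannot
  // alter what later callers receive.
  return structuredClone(await pending);
}

// Test-isolation helper, analogous in spirit to clearInlineTypeHastCache.
function clearInlineTypeCacheSketch(): void {
  inlineTypeCache.clear();
}
```

Cloning on read trades a small per-call copy for safety: the cached tree is never handed out directly, so the cache stays valid even when downstream code rewrites nodes in place.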

Verified with mui-x DataGrid POC (external):

  /x/api/data-grid/data-grid   63s (was: crash after ~100s)
  /x/api/charts/gauge          8s cold / 140ms warm (unchanged)

Also adds 4 regression tests under `typeHighlighting > memoization`:
cached equivalence, distinct-identity isolation, keying on
unionPrintWidth, and mutation idempotence. Tests now at 73 (was 69)
for this file.

code-infra-dashboard Bot commented Apr 14, 2026

Deploy preview

https://deploy-preview-1308--mui-internal.netlify.app/

Performance

Total duration: 18.41 ms +2.28 ms (+14.1%) | Renders: 4 (+0) | Paint: 72.92 ms 🔺 +3.94 ms (+5.7%)

| Test | Duration | Renders |
| --- | --- | --- |
| DataGrid mount with paint timing | 2.46 ms 🔺 +0.55 ms (+28.7%) | 1 (+0) |
| HeavyList mount | 11.73 ms 🔺 +1.79 ms (+18.0%) | 1 (+0) |
| Counter click | 4.22 ms ▼ -0.06 ms (-1.4%) | 2 (+0) |


Bundle size

| Bundle | Parsed size | Gzip size |
| --- | --- | --- |
| @base-ui/react | 0 B (0.00%) | 0 B (0.00%) |
| @mui/x-charts-pro | 0 B (0.00%) | 0 B (0.00%) |

Check out the code infra dashboard for more information about this PR.
