docs: add SGLang HiCache L3 observability note by xzh25 · Pull Request #2603 · kvcache-ai/Mooncake

xzh25 · 2026-06-24T13:27:53Z

Summary

Adds a conservative documentation note for the AI Studio A800 SGLang HiCache + Mooncake Store L3 experiment.

The note records:

- the tested platform/model/workload scope
- one verified L3 read-back case with exists_hit=3095 and get_success=3095
- prior Store write-back counters
- the negative performance result, where Store reload did not beat no-store in this constrained setup
- follow-up work needed before claiming a production optimization
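
As a minimal sketch of how the read-back counters above might be sanity-checked, the snippet below computes the fraction of Store hits that were fetched successfully. The counter names `exists_hit` and `get_success` and the value 3095 come from this description. The dictionary layout and the `readback_ratio` helper are hypothetical and are not part of the Mooncake or SGLang APIs. The interpretation of the counters (keys found versus keys fetched) is also an assumption.

```python
def readback_ratio(counters: dict[str, int]) -> float:
    """Return get_success / exists_hit, or 0.0 when no hits were recorded.

    Assumption: exists_hit counts keys found in the Store and get_success
    counts keys that were then fetched successfully.
    """
    hits = counters.get("exists_hit", 0)
    if hits == 0:
        return 0.0
    return counters.get("get_success", 0) / hits


# Counter values reported for the one verified L3 read-back case.
counters = {"exists_hit": 3095, "get_success": 3095}
print(f"read-back ratio: {readback_ratio(counters):.3f}")
```

A ratio of 1.0 would only indicate that every key reported as present was also read back. It says nothing about latency, which is where the negative result in this note comes from.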

Validation

- Applied cleanly to current kvcache-ai/Mooncake:main
- Documentation-only change
- Source submission repository: https://github.com/xzh25/mooncake-kvpool-optimizer

gemini-code-assist

Code Review

This pull request adds a new documentation page detailing SGLang HiCache + Mooncake L3 observability on an AI Studio A800 runtime, along with updating the index file to link to it. The reviewer pointed out a typo in the model name ('Qwen3-0.6B') and suggested correcting it to a valid model version.

gemini-code-assist · 2026-06-24T13:28:31Z

+## Scope
+
+- Platform: Baidu AI Studio A800 runtime.
+- Model: Qwen3-0.6B.


The model name Qwen3-0.6B appears to be a typo, as there is no Qwen3 model series currently released. Please correct this to the actual model used in the experiment, such as Qwen1.5-0.6B or Qwen2-0.5B.

Suggested change

-- Model: Qwen3-0.6B.
+- Model: Qwen1.5-0.6B.

ykwd · 2026-06-25T02:48:25Z

Thanks for putting this together and for documenting the experiment carefully. I think this is useful as a personal small-scale experiment and can provide some reference value. However, I’m not sure it is a good fit to merge into the official docs at this point. I’d suggest keeping this as an external note or discussion for now.

stmatengss · 2026-06-25T03:20:55Z

+
+## Scope
+
+- Platform: Baidu AI Studio A800 runtime.


Could you make the example more broadly applicable?

stmatengss · 2026-06-25T03:21:24Z

 |----------|----------|---------------|
 | [PD Disaggregation Performance](../sglang-benchmark-results-v1) | SGLang PD disaggregation with Mooncake Transfer Engine | 1P1D PD disaggregation achieves approximately **30% lower ITL** while maintaining comparable throughput against two regular instances. |
 | [HiCache with Mooncake Backend Benchmark](../sglang-hicache-benchmark-results-v1) | SGLang HiCache using Mooncake Store as L3 storage | Mooncake-backed HiCache improves prefill performance in multi-turn workloads by maintaining higher KV cache hit rates as conversation rounds grow. |
+| [AI Studio A800 L3 Observability](../sglang-hicache-l3-aistudio-observability) | SGLang HiCache with Mooncake Store on a single AI Studio A800 runtime | L3 write-back and one read-back case were observed, but Store reload did not beat no-store in this constrained setup. |


Should this be "L3 Observability"?

docs: add SGLang HiCache L3 observability note

5a12de1

xzh25 requested review from ShangmingCai, stmatengss and ykwd as code owners June 24, 2026 13:27

gemini-code-assist Bot reviewed Jun 24, 2026

github-actions Bot added documentation Improvements or additions to documentation run-ci labels Jun 24, 2026

stmatengss reviewed Jun 25, 2026

xzh25 wants to merge 1 commit into kvcache-ai:main from xzh25:codex/sglang-hicache-l3-observability


3 participants
