feat(vllm): add LMCache sidecar in MP mode with period-notation flags #246

Open
genevera wants to merge 1 commit into av:main from genevera:lmcache-quickstart

Conversation

@genevera

Contributor

Adds an LMCache multiprocess (MP) sidecar container that runs alongside vLLM, following the official Docker quickstart.

Changes

  • New compose.x.lmcache.vllm.yml: Defines the lmcache-server service using lmcache/standalone:nightly with the quickstart CLI flags (--l1-size-gb, --eviction-policy LRU, --max-workers, --port 6555), GPU device reservations for CUDA IPC, and shared memory (ipc: host, shm_size: 16g)
  • profiles/default.env: Adds HARBOR_LMCACHE_L1_SIZE_GB, HARBOR_LMCACHE_MAX_WORKERS, HARBOR_LMCACHE_PORT; sets HARBOR_VLLM_EXTRA_ARGS to use period-notation --kv-transfer-config.* flags (appended, not replacing existing args)
  • services/vllm/Dockerfile: Switches default base image to lmcache/vllm-openai (required for LMCacheMPConnector support in the vLLM container)
  • New docs/2.1.VLLM-LMCache-Integration.md: Architecture and configuration docs
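
As a rough illustration of how the new profile variables could feed the quickstart flags listed above, here is a minimal Python sketch. The helper name `build_lmcache_server_args` and the values for L1 size and worker count are hypothetical placeholders, not taken from this PR; only the flag names, the LRU policy, and port 6555 come from the description. The actual wiring lives in compose.x.lmcache.vllm.yml and may differ.

```python
def build_lmcache_server_args(env):
    """Map Harbor-style env values onto the quickstart CLI flags (illustrative only)."""
    return [
        "--l1-size-gb", env["HARBOR_LMCACHE_L1_SIZE_GB"],
        "--eviction-policy", "LRU",
        "--max-workers", env["HARBOR_LMCACHE_MAX_WORKERS"],
        "--port", env["HARBOR_LMCACHE_PORT"],
    ]


example_env = {
    "HARBOR_LMCACHE_L1_SIZE_GB": "8",   # placeholder value, not from the PR
    "HARBOR_LMCACHE_MAX_WORKERS": "2",  # placeholder value, not from the PR
    "HARBOR_LMCACHE_PORT": "6555",      # port named in the PR description
}

# Print the argument string the sidecar would receive under these assumptions.
print(" ".join(build_lmcache_server_args(example_env)))
```

Keeping the eviction policy fixed while the size, worker count, and port come from the environment mirrors the split described in the bullets: three values are exposed as HARBOR_LMCACHE_* variables, and the rest are fixed quickstart defaults.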

Key design decisions

  • Period notation over JSON: --kv-transfer-config.kv_connector LMCacheMPConnector instead of inline JSON, which is easier to read and extend
  • Compose network (not --network host): vLLM reaches the sidecar at tcp://lmcache-server, resolved through Docker Compose DNS, matching Harbor's harbor-network convention
  • Append semantics: The kv-transfer-config flags are in HARBOR_VLLM_EXTRA_ARGS alongside any other flags; the compose overlay does not override the env var
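
To make the period-notation choice concrete, the sketch below folds dotted flags into the nested structure that an inline JSON value would otherwise carry. It is a simplified model, not vLLM's actual argument parser: `dotted_flags_to_json` is a hypothetical helper, the `kv_connector_extra_config.example_option` key is invented purely to show nesting, and values are left as strings, whereas the real parser may coerce types.

```python
import json


def dotted_flags_to_json(argv, prefix="--kv-transfer-config."):
    """Fold `--kv-transfer-config.a.b value` pairs into one nested dict."""
    config = {}
    i = 0
    while i < len(argv):
        flag = argv[i]
        if flag.startswith(prefix) and i + 1 < len(argv):
            path = flag[len(prefix):].split(".")
            node = config
            # Walk or create intermediate dicts for every segment but the last.
            for key in path[:-1]:
                node = node.setdefault(key, {})
            node[path[-1]] = argv[i + 1]
            i += 2
        else:
            # Unrelated flags and their values are skipped untouched.
            i += 1
    return config


extra_args = [
    "--max-model-len", "4096",  # unrelated flag, shown to illustrate appending
    "--kv-transfer-config.kv_connector", "LMCacheMPConnector",
    # Hypothetical nested key, included only to demonstrate nesting:
    "--kv-transfer-config.kv_connector_extra_config.example_option", "value",
]

print(json.dumps(dotted_flags_to_json(extra_args), indent=2))
```

Because each dotted flag is a separate token, further options can be appended to HARBOR_VLLM_EXTRA_ARGS without re-quoting or editing a single JSON string, which is the readability and extensibility benefit the first bullet refers to.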
