[Store] Refactor accelerator device registry and staging copies#2583
[Store] Refactor accelerator device registry and staging copies#2583Aionw wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors platform-specific accelerator memory management and copy operations into an object-oriented AcceleratorDevice abstraction and registry pattern, replacing direct preprocessor macro checks. The review feedback identifies several critical issues: a restriction in PinnedBufferPool that limits pinned host memory allocation to single-accelerator environments and causes performance degradation in multi-accelerator setups; a compilation error in ascend_accelerator_device.cpp when USE_ASCEND_DIRECT is defined due to a pointer type mismatch; and performance overhead in the CUDA, HIP, and Ascend implementations from repeatedly querying device counts in the hot path, which should instead be cached statically.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
04d3c38 to
9eccb30
Compare
9eccb30 to
27b2346
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a unified device abstraction layer (AcceleratorDevice, AcceleratorRegistry, and RuntimeAccelerator) to replace platform-specific conditional compilation across the codebase, updating PinnedBufferPool, Client, FileStorage, and MemcpyWorkerPool to use this new interface. The review feedback highlights several critical improvements: resolving an inconsistency in AscendAcceleratorDevice where CurrentDeviceId() returns a physical ID instead of a logical ID, adding a nullptr check in FindDeviceForPointer to prevent driver errors, optimizing CopyMaybeAccelerator by passing explicit copy directions, refactoring move constructors in PinnedHostBuffer and Buffer to use member initializers, avoiding self-move-assignment in PinnedBufferPool, and handling duplicate device registrations in AcceleratorRegistry consistently.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
9512d7f to
cf737bb
Compare
cf737bb to
0def852
Compare
…evice-rfc # Conflicts: # mooncake-store/src/CMakeLists.txt
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Description
Closes #2582.
This PR introduces a Store-side
AcceleratorDeviceabstraction for localaccelerator memory operations. Vendor-specific pointer query, context switch,
copy, and pinned-host allocation are moved behind per-vendor device
implementations, while Store call sites use the registry's available
accelerator list and pointer-based dispatch.
The latest revision also simplifies the accelerator registry around static
device registration:
RegisterAcceleratorDeviceentrypoint and keepsregistration available only through static
AcceleratorDeviceRegistrarinstances
vendor-indexed table
RuntimeAccelerators(false)path avoids taking a mutex after initializationensure=trueas the explicit refresh pathIt also tightens runtime copy behavior:
FindDeviceForPointerIsDevicePointerwrapperFindDeviceForPointerresults at D2H staging sites to avoid queryingthe same pointer twice
relying on
kAutoPinnedBufferPoolModule
mooncake-transfer-engine)mooncake-store)mooncake-ep)mooncake-pg)mooncake-integration)mooncake-p2p-store)mooncake-wheel)mooncake-common)mooncake-rl)Type of Change
How Has This Been Tested?
Test commands:
Test by UT
Test results:
runtime_accelerator_testpasses with 9/9 tests.Note: targeted
pre-commitwas run, but the projectmooncake-code-formathook cannot complete in this environment because
clang-format-20is notinstalled. The other relevant hooks for the changed files passed.
Checklist
./scripts/code_format.shpre-commit run --all-filesand all hooks passAI Assistance Disclosure
This PR was implemented with AI assistance.
AI generated the implementation. Human review covered 100% of the diff.