[None][feat] Nixl support for GDS #5488
/bot run
7210604 to 96a28f6
Is the goal of this PR to use NIXL for loading/unloading KVBlocks from disk, with GDS as the transfer path?
cpp/tensorrt_llm/executor/cache_transmission/nixl_utils/transferAgent.cpp
d5ddede to 57f2533
Thank you for your contribution. This looks well modularized. Where are the related tests located, and can any existing tests be reused? I also didn't see changes to the Python runtime; how is that part being handled?
📝 Walkthrough
Wires a loopback agent into KV-cache transfer flows, adds file-backed descriptors and a BaseLoopbackAgent/NixlLoopbackAgent, threads an explicit transfer mode and a non-optional directory string through KV-cache APIs/constructors, and updates tests and build flags to exercise DRAM and GDS paths.
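To make the "explicit transfer mode and non-optional directory string" part of the walkthrough concrete, here is a minimal sketch of how such a pair of parameters might select an offload target. Every name in it (`TransferMode`, `OffloadTarget`, `resolveOffloadTarget`, the `block_<id>.bin` file naming) is a hypothetical stand-in chosen for illustration, not the API added by this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names throughout; this is not the TensorRT-LLM API.
enum class TransferMode
{
    kDram, // keep the offloaded block in host memory
    kGds   // write the offloaded block to a file
};

// Describes where an offloaded block ends up.
struct OffloadTarget
{
    bool usesFile;
    std::string path; // empty when the block stays in host memory
};

// The directory is a plain string rather than std::optional<std::string>:
// an empty string means "no directory", which is only acceptable in DRAM mode.
inline OffloadTarget resolveOffloadTarget(
    TransferMode mode, std::string const& directory, std::uint64_t blockId)
{
    if (mode == TransferMode::kDram)
    {
        return {false, ""};
    }
    assert(!directory.empty() && "file-backed modes need a directory");
    // Assumed naming scheme: one file per block inside the directory.
    return {true, directory + "/block_" + std::to_string(blockId) + ".bin"};
}
```

Calling `resolveOffloadTarget(TransferMode::kGds, "/tmp/kv", 7)` in this sketch yields a file-backed target at `/tmp/kv/block_7.bin`, while DRAM mode ignores the directory entirely.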
Sequence Diagram(s)
sequenceDiagram
autonumber
actor App as Application
participant Exec as Executor
participant KVCM as KVCacheManager
participant BM as BlockManager
participant WBM as WindowBlockManager
participant TM as KVCacheTransferManager
participant LBA as BaseLoopbackAgent
participant FS as Filesystem
participant VRAM as DeviceMemory
App->>Exec: generate(request)
Exec->>KVCM: addSequence(request (mode, directory))
KVCM->>BM: loadOrAllocateBlocks(..., mode, directory)
BM->>WBM: getFreeBlock(..., mode, directory)
alt offload needed
WBM->>TM: offloadBlock(block, mode, directory)
TM->>LBA: registerFiles(fileDescs) / registerMemory(memoryDescs)
TM->>LBA: submitLoopbackRequests(memoryDescs,filedescs,isOffload=true)
LBA-->>TM: TransferStatus
TM->>LBA: deregisterFiles/Memory
TM-->>WBM: offload complete
else onboard needed
WBM->>TM: onboardBlock(offloadBlock, mode, directory)
TM->>LBA: registerFiles(fileDescs) / registerMemory(memoryDescs)
TM->>LBA: submitLoopbackRequests(memoryDescs,filedescs,isOffload=false)
LBA-->>TM: TransferStatus
TM->>LBA: deregisterFiles/Memory
TM-->>WBM: onboard complete
end
WBM-->>BM: block ready
BM-->>KVCM: allocation result
KVCM-->>Exec: sequence ready
Exec-->>App: produce tokens
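The register / submit / deregister pattern in the diagram can be sketched with a toy agent. This is an illustration only: `ToyLoopbackAgent`, `MemoryDesc`, `FileDesc`, `TransferStatus`, and `transferBlock` are hypothetical names, and the copy uses ordinary `std::fstream` I/O as a stand-in, not NIXL or GDS.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the NIXL or TensorRT-LLM types.
struct MemoryDesc { char* addr; std::size_t len; };
struct FileDesc { std::string path; };
enum class TransferStatus { kSuccess, kFailure };

class ToyLoopbackAgent
{
public:
    void registerMemory(MemoryDesc const& m) { mMemory.insert(m.addr); }
    void registerFile(FileDesc const& f) { mFiles.insert(f.path); }
    void deregisterMemory(MemoryDesc const& m) { mMemory.erase(m.addr); }
    void deregisterFile(FileDesc const& f) { mFiles.erase(f.path); }

    // isOffload == true copies memory -> file; false copies file -> memory.
    // Both descriptors must have been registered first.
    TransferStatus submit(MemoryDesc const& m, FileDesc const& f, bool isOffload)
    {
        if (mMemory.count(m.addr) == 0 || mFiles.count(f.path) == 0)
        {
            return TransferStatus::kFailure;
        }
        auto const len = static_cast<std::streamsize>(m.len);
        if (isOffload)
        {
            std::ofstream out(f.path, std::ios::binary | std::ios::trunc);
            out.write(m.addr, len);
            out.flush();
            return out.good() ? TransferStatus::kSuccess : TransferStatus::kFailure;
        }
        std::ifstream in(f.path, std::ios::binary);
        in.read(m.addr, len);
        return in.gcount() == len ? TransferStatus::kSuccess : TransferStatus::kFailure;
    }

private:
    std::set<char*> mMemory;
    std::set<std::string> mFiles;
};

// Mirrors the diagram: register, submit, then deregister in every case.
inline TransferStatus transferBlock(
    ToyLoopbackAgent& agent, std::vector<char>& block, std::string const& path, bool isOffload)
{
    assert(!block.empty());
    MemoryDesc mem{block.data(), block.size()};
    FileDesc file{path};
    agent.registerMemory(mem);
    agent.registerFile(file);
    TransferStatus const status = agent.submit(mem, file, isOffload);
    agent.deregisterFile(file);
    agent.deregisterMemory(mem);
    return status;
}
```

In this sketch an offload followed by an onboard of the same path restores the block contents, and a `submit` on descriptors that have already been deregistered returns `kFailure`.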
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Tests were added in the final two commits.
PR_Github #17675 [ run ] triggered by Bot
Passing mode & directory parameters to relevant onboard & offload functions. Signed-off-by: Tomer Shmilovich <[email protected]>
507570f to becd2a4
PR_Github #17675 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #17723 [ run ] triggered by Bot
PR_Github #17723 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #17917 [ run ] triggered by Bot
PR_Github #17917 [ run ] completed with state
becd2a4 to 6b92b32
Implement class LoopbackAgent. Signed-off-by: Tomer Shmilovich <[email protected]>
Signed-off-by: Guy Lev <[email protected]>
6b92b32 to ed5fd42
/bot run --disable-fail-fast
PR_Github #18018 [ run ] triggered by Bot
PR_Github #18018 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #18073 [ run ] triggered by Bot
PR_Github #18073 [ run ] completed with state
Signed-off-by: nv-guomingz <[email protected]>
Signed-off-by: Tomer Shmilovich <[email protected]> Signed-off-by: Guy Lev <[email protected]> Co-authored-by: Guy Lev <[email protected]>
Nixl support for GDS
Commit 1 ("Pass mode & directory") is a dependency and is submitted as a separate PR: #5983