SNMG ANN #231
Conversation
tfeher left a comment
Thanks Victor for the PR, it looks really great! I have just a few smaller comments.
tfeher left a comment
A few additional comments
tfeher left a comment
We are almost there, but the matrix extent type has to be fixed: we should distinguish the neighbor index type (IdxT, usually int64) from the mdspan extent type, which is always int64. In a previous review I marked such changes in ann_mp.cuh, but the public API in ann_mg.hpp also has a few places where this needs to be fixed.
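For illustration, here is a minimal sketch of the distinction being requested (the signature is hypothetical, not the actual ann_mg.hpp API): IdxT types the neighbor indices written to the output, while the mdspan extent type stays int64_t regardless of IdxT.

```cpp
#include <raft/core/device_mdspan.hpp>

// Hypothetical signature, for illustration only: `IdxT` is the element type
// of the `neighbors` output, while the extent (shape) type of every mdspan
// is fixed to int64_t, independent of IdxT.
template <typename T, typename IdxT>
void search(raft::device_matrix_view<const T, int64_t, raft::row_major> queries,
            raft::device_matrix_view<IdxT, int64_t, raft::row_major> neighbors,
            raft::device_matrix_view<float, int64_t, raft::row_major> distances);
```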
CI error here seemed to have been a connection error.

/ok to test

/merge
```cpp
const cuvs::neighbors::index_params* index_params,
raft::mdspan<const T, matrix_extent<int64_t>, row_major, Accessor> index_dataset)
{
  interface.mutex_->lock();
```
rhdong left a comment
Hi @viclafargue @achirkin, I'm studying the code for the multi-GPU index feature—great work! I noticed a potential issue: the lock system does not prevent data races on device memory since the API does not explicitly synchronize the stream (at least, I found this to be the case with deserialize). If this behavior is intentional, please feel free to disregard my comments.
viclafargue left a comment
Hi @rhdong, thank you for noticing this! It is indeed probably safer to synchronize before unlocking to ensure that all cudaMemcpy calls made during deserialization have completed. Will update this.
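A minimal sketch of the pattern agreed on here (the helper name is hypothetical; it assumes raft's resource API): synchronize the handle's stream before releasing the mutex, so asynchronous copies cannot race with the next holder of the lock.

```cpp
#include <mutex>

#include <raft/core/resource/cuda_stream.hpp>
#include <raft/core/resources.hpp>

// Hypothetical helper illustrating the fix: hold the lock for the whole
// deserialization, then synchronize the stream so that all asynchronous
// copies have landed in device memory before the mutex is released.
void deserialize_guarded(raft::resources const& handle, std::mutex& mtx)
{
  std::lock_guard<std::mutex> guard(mtx);
  // ... issue cudaMemcpyAsync calls on the handle's stream here ...
  raft::resource::sync_stream(handle);  // complete device work before unlock
}
```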
viclafargue left a comment
Thanks again! Did an edit to fix this issue: 4e0f512
This PR implements a distributed (single-node, multiple-GPU) implementation of ANN indexes. It allows building, extending, and searching an index on multiple GPUs.
Before building the index, the user has to choose between two modes:
Sharding mode: the index dataset is split, and each GPU trains its own index on its respective share of the dataset. This is intended to increase both the search throughput and the maximal size of the index.
Index duplication mode: the index is built once on a GPU and then copied over to the others. Alternatively, the index dataset is sent to each GPU to be built there. This is intended to increase the search throughput.
SNMG indexes can be serialized and deserialized. Local models can also be deserialized and deployed in index duplication mode.
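To make the two modes concrete, here is a hedged usage sketch; dist_mode, snmg_index, build_snmg, and serialize are hypothetical stand-ins, not the actual ann_mg.hpp API.

```cpp
#include <string>

#include <raft/core/device_mdspan.hpp>
#include <raft/core/device_resources.hpp>

// Everything below is a hypothetical stand-in; see ann_mg.hpp for the real API.
enum class dist_mode { sharded, replicated };
struct snmg_index { /* opaque multi-GPU index state */ };

snmg_index build_snmg(raft::device_resources const& handle,
                      raft::device_matrix_view<const float, int64_t, raft::row_major> dataset,
                      dist_mode mode);
void serialize(raft::device_resources const& handle, snmg_index const& idx,
               std::string const& path);

void example(raft::device_resources const& handle,
             raft::device_matrix_view<const float, int64_t, raft::row_major> dataset)
{
  // Sharding mode: each GPU trains on its own slice of the dataset,
  // increasing both search throughput and the maximum index size.
  snmg_index sharded = build_snmg(handle, dataset, dist_mode::sharded);

  // Index duplication mode: one index is built and then copied to (or
  // rebuilt on) every GPU, increasing search throughput only.
  snmg_index replicated = build_snmg(handle, dataset, dist_mode::replicated);

  // SNMG indexes round-trip through (de)serialization; a local,
  // single-GPU index can also be deserialized in duplication mode.
  serialize(handle, sharded, "snmg_index.bin");
}
```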
Migrated from rapidsai/raft#1993