@benfred
Member

@benfred benfred commented Jul 4, 2025

Add the ability to get the graph and dataset from a CAGRA index to the c-api and python apis, as well as being able to reconstruct the cagra index from a graph and dataset.

The eventual goal here is to be more flexible in terms of allowing other serialization formats. Rather than supporting every format inside of cuvs, by exposing the raw data needed to recreate a cagra index - we can let consumers of cuvs decide how they want to serialize an index.
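As a rough illustration of what "letting consumers decide" could look like, here is a minimal sketch of a consumer-defined format in plain Python. Everything in it is an assumption for illustration: the `CGRA` tag, the header layout, and the `serialize` / `deserialize` helpers are hypothetical and not part of cuVS. In practice the flat graph and dataset would come from the index accessors this PR adds, and the restored arrays would be handed back to the index reconstruction function.

```python
import struct

MAGIC = b"CGRA"  # hypothetical tag for this toy format, not a cuVS format
HEADER = struct.Struct("<4sIII")  # magic, n_rows, graph_degree, dim


def serialize(graph, dataset, n_rows, degree, dim):
    """Pack a header, the flat uint32 graph and the flat float32 dataset."""
    assert len(graph) == n_rows * degree and len(dataset) == n_rows * dim
    return (HEADER.pack(MAGIC, n_rows, degree, dim)
            + struct.pack(f"<{len(graph)}I", *graph)
            + struct.pack(f"<{len(dataset)}f", *dataset))


def deserialize(blob):
    """Inverse of serialize: returns (graph, dataset, n_rows, degree, dim)."""
    magic, n_rows, degree, dim = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a toy CAGRA blob")
    offset = HEADER.size
    graph = list(struct.unpack_from(f"<{n_rows * degree}I", blob, offset))
    offset += 4 * n_rows * degree  # each uint32 neighbor id is 4 bytes
    dataset = list(struct.unpack_from(f"<{n_rows * dim}f", blob, offset))
    return graph, dataset, n_rows, degree, dim


graph = [1, 2, 0, 2, 0, 1]                 # 3 nodes, 2 neighbors each
dataset = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]   # 3 vectors of dimension 2
blob = serialize(graph, dataset, n_rows=3, degree=2, dim=2)
restored = deserialize(blob)
```

The round trip only needs the two arrays and their shapes, which is why exposing the graph and dataset is enough for a consumer to pick any container format.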

@benfred benfred added the improvement Improves an existing functionality label Jul 4, 2025
@benfred benfred requested review from a team as code owners July 4, 2025 22:51
@benfred benfred added the non-breaking Introduces a non-breaking change label Jul 4, 2025
@benfred benfred requested a review from AyodeAwe July 4, 2025 22:52
files: |
(?x)
[.](cmake|cpp|cu|cuh|h|hpp|sh|pxd|py|pyx|rs)$|
[.](cmake|cpp|cu|cuh|h|hpp|sh|pxd|py|pyx|rs|java)$|
Contributor

👍

*/

enum cuvsKMeansInitMethod {
typedef enum {
Contributor

👍

@ldematte
Contributor

ldematte commented Jul 10, 2025

@benfred I tried to check this out and build locally, but it does not compile, claiming template errors around matrix_vector_op.
However, I have the same problem on the current "main" branch-25.08 -- so I suppose it's unrelated to your changes, and something else broke it (I'm 100% sure it was compiling a week ago)

Scratch that, I needed to update my conda environment as the raft include files changed.

Contributor

@ldematte ldematte left a comment

Some minor comments while I was writing a Java wrapper for this

@ldematte
Contributor

TL;DR: I liked the API as it was in 2f22d53 (cuvsCagraIndexGetGraphView + cuvsCagraIndexCopyGraph), but I like the last implementation ("view" + cuvsCopyMatrix) even more.

If anything, in a follow-up PR, it would be great to have partial copy (e.g. cuvsCopyMatrix with fromRow, toRow or something similar) so in the Java side I don't have to replicate what is done there and/or in raft::copy_matrix<T>.

@benfred
Member Author

benfred commented Jul 17, 2025

If anything, in a follow-up PR, it would be great to have partial copy (e.g. cuvsCopyMatrix with fromRow, toRow or something similar) so in the Java side I don't have to replicate what is done there and/or in raft::copy_matrix.

How about having something like a cuvsSliceMatrix or cuvsSliceRows function to do this? (The cuvsSliceRows function wouldn't copy data; it would just adjust the shape / strides / data pointer to slice the matrix.) We could then pass the slice to the copy matrix function, like:

// get the cagra graph
DLManagedTensor graph;
cuvsCagraIndexGetGraph(index, &graph);

// get the first 1K rows from the graph
DLManagedTensor subgraph;
cuvsSliceMatrix(&graph, 0, 1024, &subgraph);

// copy the subgraph to host memory
cuvsCopyMatrix(&subgraph, &subgraphHost);

(This actually makes me think that I should rename cuvsCopyMatrix to cuvsMatrixCopy, so that we can have cuvsMatrixSlice etc., and to be consistent with other cuvs APIs.)
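The slice-then-copy idea can be sketched in plain Python with no cuVS involved. `MatrixView`, `slice_rows`, and `copy_matrix` below are hypothetical stand-ins for a DLPack tensor and the proposed slice/copy functions, not the actual cuVS API. The sketch only shows that slicing adjusts the offset and shape while sharing the buffer, and that copying is a separate step.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixView:
    """Hypothetical row-major view: shared buffer plus offset, shape, row stride."""
    data: array
    offset: int       # index of the first element, in elements
    rows: int
    cols: int
    row_stride: int   # distance between consecutive rows, in elements


def slice_rows(view: MatrixView, start: int, end: int) -> MatrixView:
    """Return a view of rows [start, end) without copying any elements."""
    if not 0 <= start <= end <= view.rows:
        raise IndexError("row range out of bounds")
    return MatrixView(
        data=view.data,  # same underlying buffer, no copy
        offset=view.offset + start * view.row_stride,
        rows=end - start,
        cols=view.cols,
        row_stride=view.row_stride,
    )


def copy_matrix(view: MatrixView) -> list:
    """Materialize the view as a new list of rows (the only step that copies)."""
    out = []
    for r in range(view.rows):
        begin = view.offset + r * view.row_stride
        out.append(list(view.data[begin:begin + view.cols]))
    return out


# A 4x2 "graph" of neighbor ids stored contiguously.
graph = MatrixView(array("I", range(8)), offset=0, rows=4, cols=2, row_stride=2)
subgraph = slice_rows(graph, 1, 3)       # rows 1 and 2, still a view
subgraph_host = copy_matrix(subgraph)    # independent copy
```

Paging through a large graph would then amount to calling `slice_rows` with successive row ranges and copying each slice in turn.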

@mythrocks
Contributor

if you think there is value for C users too, and want to add a specific C API, I can use that. My 2c: it's not worth the effort: C users can/will just use cudaMemcpy/cudaMemcpy2D (but I'll let you decide on this point)...

@benfred / @cjnolet will keep me honest here.

I think the C-lang user should not use cudaMemcpy directly. The dataset is likely to be padded to an 8-byte boundary when stored in __device__ memory. A naive cudaMemcpy will copy/interpret padding bytes that aren't actual data.

Using a cuVS-specific copy() API would be preferable to insulate the user from padding. If a future CAGRA implementation does away with the padding, the cuvsMatrixCopy() user would be insulated from the change. The cudaMemcpy() user might not be.
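To make the padding concern concrete, here is a small host-only sketch. The layout is an assumption chosen for illustration (3 float32 columns, each row padded from 12 to 16 bytes), not CAGRA's actual layout, and `naive_copy` / `strided_copy` are hypothetical helpers. A flat copy of rows × cols elements picks up a padding slot and drops real data, while a stride-aware copy (the same idea as passing a source pitch to cudaMemcpy2D) returns only the real values.

```python
from array import array

ROWS, COLS = 2, 3
ROW_STRIDE = 4  # assumed: 3 floats (12 bytes) padded to 16 bytes per row
PAD = -1.0      # marker value standing in for uninitialized padding

# Padded storage: each row holds COLS real values followed by padding.
padded = array("f", [1.0, 2.0, 3.0, PAD,
                     4.0, 5.0, 6.0, PAD])


def naive_copy(buf, rows, cols):
    """Copy rows*cols elements as if the rows were tightly packed."""
    return list(buf[:rows * cols])


def strided_copy(buf, rows, cols, row_stride):
    """Copy only the first `cols` elements of each padded row."""
    out = []
    for r in range(rows):
        start = r * row_stride
        out.extend(buf[start:start + cols])
    return out


naive = naive_copy(padded, ROWS, COLS)                   # includes a PAD slot
packed = strided_copy(padded, ROWS, COLS, ROW_STRIDE)    # real data only
```

A library-level copy function can hide `ROW_STRIDE` from the caller entirely, which is the insulation argued for above.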

benfred added 4 commits July 17, 2025 12:53
add ability to page through returned rows via a new 'cuvsMatrixSliceRows' function
@ldematte
Contributor

How about having something like a cuvsSliceMatrix or cuvsSliceRows functions to do this

That's even better!

this actually makes me think that I should rename cuvsCopyMatrix to cuvsMatrixCopy

++

Contributor

@mythrocks mythrocks left a comment

A couple of nitpicks, but this looks good to go.

@mythrocks
Contributor

/merge

from cuvs.common.resources import auto_sync_resources


cdef class DeviceTensorView:
Contributor

Can't we return device_matrix_view without intermediate copies? I am not 100% sure I see the benefits of adding this class compared to device views?

Member Author

The Python API is built on the C API (not the C++ API), so we can't use device_matrix_view directly inside Python. Instead, our C API uses dlpack's DLManagedTensor.

Before this PR, our C API only accepted dlpack arrays as inputs (either using the contents in functions like cagra::build, or filling pre-allocated arrays with the outputs in functions like cagra::search). We hadn't yet exposed memory that was allocated in our C++ codebase to Python.

This PR changes that: we now return dlpack DLManagedTensor objects that refer to memory owned by and allocated inside our C++ codebase. This is done without copying the data, with the flow going: device_matrix_view (C++) -> DLManagedTensor (C) -> DeviceTensorView (Python). No step copies the data; there is no intermediate copy, only intermediate objects that hold a pointer to the original data plus extra information such as the shape, dtype, and strides.

The DeviceTensorView code is necessary because we didn't have anything that would take a DLManagedTensor and return something easily consumed in Python. The closest object we had was pylibraft.device_ndarray, but that object wouldn't have worked here.
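The "intermediate objects, not intermediate copies" flow can be mimicked on the host with ctypes. This is only a sketch under stated assumptions: `TensorDescriptor` and `TensorView` are hypothetical stand-ins for DLManagedTensor and DeviceTensorView, and ordinary host memory stands in for device memory owned by the C++ side.

```python
import ctypes
from dataclasses import dataclass


@dataclass
class TensorDescriptor:
    """Hypothetical stand-in for a DLManagedTensor: a pointer plus metadata."""
    address: int
    shape: tuple
    owner: object  # keeps the allocation alive while descriptors exist


class TensorView:
    """Hypothetical stand-in for DeviceTensorView: wraps a descriptor, never copies."""

    def __init__(self, desc: TensorDescriptor):
        self._desc = desc
        rows, cols = desc.shape
        # from_address maps existing memory; it does not allocate or copy.
        self._flat = (ctypes.c_float * (rows * cols)).from_address(desc.address)

    def __getitem__(self, idx):
        row, col = idx
        return self._flat[row * self._desc.shape[1] + col]


# Memory "owned by the library" (host memory here, purely for illustration).
storage = (ctypes.c_float * 6)(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
desc = TensorDescriptor(ctypes.addressof(storage), (2, 3), owner=storage)
view = TensorView(desc)

storage[4] = 42.0  # a write through the owner is visible through the view
```

Because the view reads the owner's memory directly, the descriptor holding a reference to the owner is what keeps that memory valid for as long as the view is in use.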

*
* @endcode
*/
cuvsError_t cuvsCagraIndexFromGraph(cuvsResources_t res,
Member

Nitpick: can we please rename this to cuvsCagraIndexFromParams or cuvsCagraIndexFromArgs? I'd like to keep the API design consistent, and having to list specific args in the name will get unwieldy quickly.

Member Author

I renamed it to cuvsCagraIndexFromArgs in the last commit. (I went with FromArgs instead of FromParams, since I think FromParams could be confused with the index params we use to build the index.)

@cjnolet cjnolet removed the request for review from AyodeAwe July 22, 2025 18:37
Contributor

@bdice bdice left a comment

Approving packaging changes.

@rapids-bot rapids-bot bot merged commit cf9b256 into rapidsai:branch-25.08 Jul 22, 2025
53 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Elasticsearch + cuVS Team Jul 22, 2025
@benfred benfred deleted the cagra_c_graph_dataset branch July 23, 2025 21:55
rapids-bot bot pushed a commit that referenced this pull request Aug 12, 2025
…1216)

This PR leverages the functions introduced by #1086 and the data structures introduced by #1111 to access, copy, and re-create an index to/from a CAGRA graph.

Supersedes #1105

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - MithunR (https://github.com/mythrocks)

URL: #1216
enp1s0 pushed a commit to enp1s0/cuvs that referenced this pull request Aug 22, 2025
…apidsai#1216)
Labels

CMake cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change Python

Projects

Status: Done

6 participants