Skip to content

Conversation

@benfred
Copy link
Member

@benfred benfred commented Jul 23, 2025

No description provided.

@benfred benfred self-assigned this Jul 23, 2025
@benfred benfred requested a review from a team as a code owner July 23, 2025 05:06
@benfred benfred added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jul 23, 2025

auto merge(raft::resources const& res,
const std::vector<tiered_index::index<cagra::index<float, uint32_t>>*>& indices)
-> tiered_index::index<cagra::index<float, uint32_t>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ideally we would have merge() return index and, as you have done, use the PIMPL idom to allow the return of either the same index type that was input, or a composite_index.

* @note: When device memory is sufficient, the dataset attached to the returned index is allocated
* in device memory by default; otherwise, host memory is used automatically.
*
* @note: This API only supports physical merge (`merge_strategy = MERGE_STRATEGY_PHYSICAL`), and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our discussion yesterday, we should just remove the "merge strategy" concept all together and support only physical in the "merge" apis. We can still provide a composite that will allow for multiple indexes to be searched concurrently, but don't need to make it part of the merge() apis.

* @brief merge multiple indices together
*/
template <typename UpstreamT>
auto merge(raft::resources const& res,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder- if we are going to rebuild the cagra index anyways to consolidate tiered indices, should we just go ahead and add all the vectors in tier0 to that cagra index too (e.g. should we also do a compaction?). Disregard if this is already what we're doing.

* @{
*/
/**
* @brief Merge multiple indices together into a single index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we flesh out this description a little more to provide more details for the user on exaclty how this does the merge? We should do the same in the c++ description also. Maybe add details about what happens to the vectors in tier 0, what happens to the indexes in tier 1, what are some of the memory considerations (I know there's copies- what are some of the assumptions a user will need to make?)

@cjnolet cjnolet changed the base branch from branch-25.08 to branch-25.10 August 13, 2025 15:35
@benfred
Copy link
Member Author

benfred commented Aug 14, 2025

/merge

@rapids-bot rapids-bot bot merged commit 699d32c into rapidsai:branch-25.10 Aug 14, 2025
100 of 102 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Elasticsearch + cuVS Team Board Aug 14, 2025
@benfred benfred deleted the tiered_index_merge branch August 14, 2025 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change Waiting for author

Development

Successfully merging this pull request may close these issues.

2 participants