
Conversation


@rmaschal rmaschal commented Aug 6, 2025

As title. For larger numbers of clusters (or a faster PCIe link), the work required to find cluster assignments can take longer than the HtoD copy for each batch. One of the batch_load_iterator device buffers could then be overwritten before the kmeans predict completed, lowering recall.

This adds a raft::resource::sync_stream at the end of the kmeans predict loop to avoid this. Although not strictly necessary there (the prior DtoH copy into pageable memory is already synchronous), a sync_stream is also added at the end of the quantization loop for consistency/correctness.

Also removed one unused comment.

@rmaschal rmaschal requested a review from a team as a code owner August 6, 2025 18:15
@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Aug 6, 2025
@cjnolet cjnolet moved this from Todo to In Progress in Vector Search, ML, & Data Mining Release Board Aug 6, 2025
dataset_vec_batches.prefetch_next_batch();

// Make sure work on device is finished before swapping buffers
raft::resource::sync_stream(res);
Contributor

prefetch_next_batch seems to be synchronizing at the very end. Would it make more sense to put the sync_stream before prefetching the batch (above line 175)?

Contributor Author

prefetch_next_batch() only syncs on the stream used for copying, so if the copy happens on a separate stream from the compute work (e.g. to overlap prefetch with work), that work might not be done when the buffers are swapped, which is what I am trying to fix. Moving the sync_stream before the prefetch would also fix the problem, but at the cost of losing the overlap of work and copy, reducing performance.
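The trade-off in this exchange can be sketched as two loop orderings (raft-style pseudocode; everything other than sync_stream and prefetch_next_batch is illustrative):

```
// Ordering used in this PR: the copy of batch b+1 (on the copy stream)
// overlaps the kmeans predict on batch b (on the main stream).
for each batch b:
  predict(res, batch[b])             // async on the main stream
  batches.prefetch_next_batch()      // async HtoD on the copy stream
  raft::resource::sync_stream(res)   // predict finishes before buffers swap

// Alternative raised in review (sync before prefetch): also safe, but the
// copy of batch b+1 cannot start until predict on batch b has finished,
// so copy and compute no longer overlap.
for each batch b:
  predict(res, batch[b])
  raft::resource::sync_stream(res)
  batches.prefetch_next_batch()
```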

Contributor

Good point. I had not realized that prefetching is done on a different stream.

@cjnolet
Member

cjnolet commented Aug 7, 2025

/merge

@rapids-bot rapids-bot bot merged commit 3cd48dc into rapidsai:branch-25.10 Aug 7, 2025
55 checks passed
lowener pushed a commit to lowener/cuvs that referenced this pull request Aug 11, 2025

Authors:
  - https://github.com/rmaschal

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Tarang Jain (https://github.com/tarang-jain)

URL: rapidsai#1224
