Skip to content

Conversation

@mfoerste4
Copy link
Contributor

@mfoerste4 mfoerste4 commented Nov 25, 2024

First draft for scalar quantization.

WIP status:

  • only int8_t target type
  • quantile computation inefficient (via sampling & sorting)

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for this PR, here are my first set of comments.


// select subsample
int seed = 137;
constexpr size_t num_samples = 10000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a dataset that has dim=1k , this would correspond to 10 vectors. I suggest to increase this significantly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I increased to 1M, will this suffice until we switch to a streaming quantile computation?

rng);

// quantile / sort and pick for now
thrust::sort(thrust::omp::par, subset.data(), subset.data() + subset_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to confirm whether sorting on CPU side would be still preferred or not when we increase num_samples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the alternative? We could also switch to a approx. streaming approach here eventually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative would be to gather the points on the CPU side, and then call the GPU to do the quantization. Assuming we have a streaming quantile estimation, that would not be faster to copy data to the device. But with the current sorting approach, it is not clear to me what is the better solution.

@mfoerste4 mfoerste4 requested a review from tfeher November 26, 2024 18:03
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for the update, here are my next batch of comments.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Malte, this is already in a good shape, please take it out of draft state. A few more comments below.

@mfoerste4 mfoerste4 marked this pull request as ready for review December 2, 2024 13:01
@mfoerste4 mfoerste4 requested review from a team as code owners December 2, 2024 13:01
@mfoerste4 mfoerste4 changed the title [WIP] add C++ API scalar quantization Add C++ API scalar quantization Dec 2, 2024
@mfoerste4 mfoerste4 requested a review from tfeher December 2, 2024 13:04
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for the updates. The PR looks good to me.

@tfeher tfeher added feature request New feature or request non-breaking Introduces a non-breaking change labels Dec 2, 2024
Copy link
Contributor

@lowener lowener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the documentation to doxygen

@cjnolet
Copy link
Member

cjnolet commented Dec 3, 2024

Thanks for this feature, @mfoerste4. Do you know yet whether and by how much this improves over the corresponding CPU versions?

@mfoerste4
Copy link
Contributor Author

Thanks for this feature, @mfoerste4. Do you know yet whether and by how much this improves over the corresponding CPU versions?

@cjnolet , no, I have not performed any benchmarks with this feature. Ideally we would want to have it as an option within the ann benchmarks, but it did not find the time yet.

@mfoerste4 mfoerste4 requested review from cjnolet and lowener December 3, 2024 22:34
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are almost there! Remaining changes from my side are mostly topical- hiding templates, naming and namespacing.

@mfoerste4 mfoerste4 requested a review from a team as a code owner December 4, 2024 14:28
@mfoerste4 mfoerste4 requested a review from cjnolet December 4, 2024 16:14
@cjnolet
Copy link
Member

cjnolet commented Dec 5, 2024

/merge

@rapids-bot rapids-bot bot merged commit b051f80 into rapidsai:branch-24.12 Dec 5, 2024
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CMake cpp feature request New feature or request non-breaking Introduces a non-breaking change

Development

Successfully merging this pull request may close these issues.

4 participants