Add C++ API scalar quantization #494
tfeher
left a comment
Thanks Malte for this PR, here are my first set of comments.
// select subsample
int seed = 137;
constexpr size_t num_samples = 10000;
For a dataset with dim=1k, this would correspond to only 10 vectors. I suggest increasing this significantly.
I increased it to 1M. Will this suffice until we switch to a streaming quantile computation?
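The arithmetic behind this exchange can be checked with a minimal sketch. It assumes, as the review comment implies, that `num_samples` is a budget counted in scalar values rather than in vectors. The helper name `vectors_covered` is hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PR): number of whole vectors covered by
// a budget of `num_scalars` sampled values when each vector has `dim`
// components. Assumes the budget is counted in scalars.
inline std::size_t vectors_covered(std::size_t num_scalars, std::size_t dim)
{
  assert(dim > 0);  // a zero dimension would make the ratio meaningless
  return num_scalars / dim;
}
```

Under that assumption, `vectors_covered(10000, 1000)` gives 10 vectors for the original value, and `vectors_covered(1000000, 1000)` gives 1000 vectors for the increased one.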
rng);

// quantile / sort and pick for now
thrust::sort(thrust::omp::par, subset.data(), subset.data() + subset_size);
We need to confirm whether sorting on the CPU side is still preferable once we increase num_samples.
What would be the alternative? We could also eventually switch to an approximate streaming approach here.
The alternative would be to gather the points on the CPU side and then call the GPU to do the quantization. Assuming we have a streaming quantile estimation, it would not be faster to copy the data to the device. But with the current sorting approach, it is not clear to me which solution is better.
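For readers following the thread, the "sort and pick" step discussed above can be sketched on the host as follows. This is an illustration under stated assumptions, not the code in the PR: `std::sort` stands in for the `thrust::sort` call with the OpenMP policy, and the names `quantile_bounds` and `quantize`, the `quantile` parameter, and the rounding rule are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative only: names, parameters, and rounding are assumptions,
// not the API added by this PR.

// Sort a copy of the subsample and pick the values at the lower and upper
// cut positions. `quantile` in (0, 1] is the fraction of the sample that
// should fall inside the returned [min, max] range.
inline std::pair<float, float> quantile_bounds(std::vector<float> subset, double quantile)
{
  assert(!subset.empty());
  assert(quantile > 0.0 && quantile <= 1.0);
  std::sort(subset.begin(), subset.end());
  const std::size_t n = subset.size();
  // Number of elements trimmed from each tail, clamped so min <= max.
  std::size_t cut = static_cast<std::size_t>((1.0 - quantile) * 0.5 * static_cast<double>(n));
  cut = std::min(cut, (n - 1) / 2);
  return {subset[cut], subset[n - 1 - cut]};
}

// Map a float linearly from [lo, hi] onto the int8 range [-128, 127],
// clamping values that fall outside the range.
inline std::int8_t quantize(float x, float lo, float hi)
{
  if (hi <= lo) { return 0; }  // degenerate range: map everything to 0
  const float clamped = std::min(std::max(x, lo), hi);
  const float scaled  = (clamped - lo) / (hi - lo) * 255.0f - 128.0f;
  return static_cast<std::int8_t>(std::lround(scaled));
}
```

The sort costs O(n log n) in the subsample size, which is why the subsample size and the CPU-versus-GPU question are linked in this discussion. A streaming quantile estimate would replace `quantile_bounds` and leave `quantize` unchanged.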
tfeher
left a comment
Thanks Malte for the update, here are my next batch of comments.
tfeher
left a comment
Hi Malte, this is already in good shape. Please take it out of draft state. A few more comments below.
tfeher
left a comment
Thanks Malte for the updates. The PR looks good to me.
lowener
left a comment
Please add the documentation to Doxygen.
Thanks for this feature, @mfoerste4. Do you know yet whether and by how much this improves over the corresponding CPU versions?
@cjnolet, no, I have not performed any benchmarks with this feature. Ideally we would want to have it as an option within the ANN benchmarks, but I have not found the time yet.
cjnolet
left a comment
I think we are almost there! The remaining changes from my side are mostly topical: hiding templates, naming, and namespacing.
/merge
First draft for scalar quantization.
WIP status: