Add C++ API scalar quantization #494

Merged

rapids-bot merged 19 commits into rapidsai:branch-24.12 from mfoerste4:scalar_quantization

Dec 5, 2024

Contributor

mfoerste4 commented Nov 25, 2024 (edited)

First draft for scalar quantization.

WIP status:

- only int8_t target type
- quantile computation inefficient (via sampling & sorting)
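
For readers unfamiliar with the approach named in the WIP list, the following is a minimal host-only sketch of "sampling & sorting" followed by a mapping to int8_t. It is not the code from this PR: the names `sq_range`, `train_by_sorting`, and `quantize` are hypothetical, and the rule of trimming (1 - quantile)/2 of the sorted sample at each end is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameter struct: the value range that is mapped onto int8_t.
struct sq_range {
  float min;
  float max;
};

// Estimate the range by sorting a sample and picking two order statistics.
// Assumption: (1 - quantile) / 2 of the sorted sample is dropped at each end,
// so quantile = 1.0 keeps the full observed range.
inline sq_range train_by_sorting(std::vector<float> sample, double quantile)
{
  assert(!sample.empty() && quantile > 0.0 && quantile <= 1.0);
  std::sort(sample.begin(), sample.end());
  const std::size_t n = sample.size();
  const std::size_t cut =
    static_cast<std::size_t>((1.0 - quantile) / 2.0 * static_cast<double>(n - 1));
  return {sample[cut], sample[n - 1 - cut]};
}

// Clamp x into [min, max] and map it linearly onto [-128, 127].
inline std::int8_t quantize(float x, sq_range r)
{
  if (r.max <= r.min) { return 0; }  // degenerate range: nothing to scale
  const float clamped = std::min(std::max(x, r.min), r.max);
  const float scaled  = (clamped - r.min) / (r.max - r.min) * 255.0f - 128.0f;
  return static_cast<std::int8_t>(std::lround(scaled));
}
```

The full sort costs O(n log n) in the sample size, which is the inefficiency the WIP note refers to; a streaming quantile estimate would avoid materializing and sorting the sample.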


          initial commit device

77be478

github-actions bot added cpp CMake labels


          added host support

9e35c37

tfeher requested changes

Contributor

tfeher left a comment

Thanks Malte for this PR, here is my first set of comments.

cpp/src/neighbors/detail/quantization.cuh Outdated

+                  // select subsample
+                  int seed                     = 137;
+                  constexpr size_t num_samples = 10000;

Contributor

tfeher Nov 26, 2024

For a dataset that has dim=1k, this would correspond to 10 vectors. I suggest increasing this significantly.

Contributor Author

mfoerste4 Nov 26, 2024

I increased it to 1M. Will this suffice until we switch to a streaming quantile computation?
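
The arithmetic behind this exchange can be checked with a short sketch. It assumes, as the reviewer's comment implies, that `num_samples` counts individual scalar elements rather than whole vectors; the helper name `sampled_vectors` is hypothetical.

```cpp
#include <cstddef>

// Number of whole vectors covered by a budget of `num_samples` scalar elements.
constexpr std::size_t sampled_vectors(std::size_t num_samples, std::size_t dim)
{
  return dim == 0 ? 0 : num_samples / dim;
}

// 10,000 elements at dim = 1,000 cover only 10 vectors.
static_assert(sampled_vectors(10000, 1000) == 10, "original sample budget");
// 1,000,000 elements at the same dimension cover 1,000 vectors.
static_assert(sampled_vectors(1000000, 1000) == 1000, "increased sample budget");
```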

cpp/src/neighbors/detail/quantization.cuh Outdated (resolved)

cpp/src/neighbors/detail/quantization.cuh Outdated

+                              rng);
+                  // quantile / sort and pick for now
+                  thrust::sort(thrust::omp::par, subset.data(), subset.data() + subset_size);

Contributor

tfeher Nov 26, 2024

We need to confirm whether sorting on the CPU side is still preferable once we increase num_samples.

Contributor Author

mfoerste4 Nov 26, 2024

What would be the alternative? We could also eventually switch to an approximate streaming approach here.

Contributor

tfeher Dec 2, 2024

The alternative would be to gather the points on the CPU side and then call the GPU to do the quantization. Assuming we have a streaming quantile estimation, copying the data to the device would not be faster. But with the current sorting approach, it is not clear to me which solution is better.

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)


          changed API, couple of fixes

b2db922

mfoerste4 requested a review from tfeher

November 26, 2024 18:03

tfeher requested changes

Contributor

tfeher left a comment

Thanks Malte for the update, here is my next batch of comments.

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/src/neighbors/quantization.cu Outdated (resolved)

cpp/src/neighbors/quantization.cu Outdated (resolved)


          add inverse transform, review suggestions

52fca9f
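
The commit above adds an inverse transform. As a rough illustration of what inverting a linear int8_t quantization involves, here is a minimal sketch; it is not taken from the PR, and `sq_range` and `dequantize` are hypothetical names. The struct is repeated so that the snippet compiles on its own.

```cpp
#include <cstdint>

// Hypothetical parameter struct: the value range that was mapped onto int8_t.
struct sq_range {
  float min;
  float max;
};

// Map a code in [-128, 127] back to a float in [min, max].
// The result is the centre of the bucket, so the round trip is lossy:
// values that were clamped or rounded during quantization are not recovered.
inline float dequantize(std::int8_t q, sq_range r)
{
  const float step = (r.max - r.min) / 255.0f;
  return r.min + (static_cast<float>(q) + 128.0f) * step;
}
```

With a range of [0, 255] the step is exactly 1, so the code -128 maps back to 0 and the code 127 maps back to 255.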

tfeher requested changes

Contributor

tfeher left a comment

Hi Malte, this is already in good shape, please take it out of draft state. A few more comments below.

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/src/neighbors/detail/quantization.cuh Outdated (resolved)

cpp/src/neighbors/detail/quantization.cuh Outdated (resolved)

mfoerste4 added 2 commits

December 2, 2024 12:53


          more review suggestions

8eb9911


          some cleanup

35d6492

mfoerste4 marked this pull request as ready for review

December 2, 2024 13:01

mfoerste4 requested review from a team as code owners

December 2, 2024 13:01

mfoerste4 changed the title ~~[WIP] add C++ API scalar quantization~~ Add C++ API scalar quantization

mfoerste4 requested a review from tfeher

December 2, 2024 13:04


          try to fix half/host issue

7dc93f0

tfeher approved these changes

Contributor

tfeher left a comment

Thanks Malte for the updates. The PR looks good to me.

tfeher added feature request non-breaking labels

mfoerste4 added 2 commits

December 2, 2024 16:47


          add fp16 header

b41bff6


          manual override for half equal

1ea41bf

lowener reviewed

Contributor

lowener left a comment

Add the documentation to doxygen

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cpp/src/neighbors/detail/quantization.cuh Outdated (resolved)


          overload < operator for half

6f2a4a3

cjnolet reviewed

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

cjnolet reviewed

cpp/include/cuvs/neighbors/quantization.hpp Outdated (resolved)

Member

cjnolet commented Dec 3, 2024

Thanks for this feature, @mfoerste4. Do you know yet whether and by how much this improves over the corresponding CPU versions?

mfoerste4 added 2 commits

December 3, 2024 19:17


          moving files

e05e5b1


          finalize move

8ee9e98


          refactor class to simple struct with free functions

c5eb9ef

Contributor Author

mfoerste4 commented Dec 3, 2024

> Thanks for this feature, @mfoerste4. Do you know yet whether and by how much this improves over the corresponding CPU versions?

@cjnolet, no, I have not performed any benchmarks with this feature. Ideally we would want to have it as an option within the ANN benchmarks, but I have not found the time yet.


          change API to use view in/out instead of array return

7fda6d7
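
The two refactoring commits above (a plain struct with free functions, and views passed in and out instead of a returned array) describe an API shape that can be sketched as follows. This is a host-only illustration, not the merged cuVS interface: `matrix_view`, `sq_state`, `train`, and `transform` are hypothetical names, and training here simply takes the minimum and maximum instead of a quantile.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning view over a row-major matrix.
template <typename T>
struct matrix_view {
  T* data;
  std::size_t rows;
  std::size_t cols;
  std::size_t size() const { return rows * cols; }
};

// Plain struct holding the trained parameters; it has no member functions.
struct sq_state {
  float min;
  float max;
};

// Free function: derive the parameters from a dataset view.
// Simplification: uses the full min/max range rather than a quantile.
inline sq_state train(matrix_view<const float> in)
{
  assert(in.size() > 0);
  const auto mm = std::minmax_element(in.data, in.data + in.size());
  return {*mm.first, *mm.second};
}

// Free function: write into a caller-provided output view instead of
// allocating and returning a new array. A degenerate range maps to -128.
inline void transform(const sq_state& s,
                      matrix_view<const float> in,
                      matrix_view<std::int8_t> out)
{
  assert(in.rows == out.rows && in.cols == out.cols);
  const float scale = s.max > s.min ? 255.0f / (s.max - s.min) : 0.0f;
  for (std::size_t i = 0; i < in.size(); ++i) {
    const float clamped = std::min(std::max(in.data[i], s.min), s.max);
    out.data[i] = static_cast<std::int8_t>(std::lround((clamped - s.min) * scale - 128.0f));
  }
}
```

One motivation for the view-in/view-out shape is that the caller decides where the output lives and how long it stays allocated, so the same function can serve buffers owned by different containers.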

mfoerste4 requested review from cjnolet and lowener

December 3, 2024 22:34

mfoerste4 and others added 2 commits

December 3, 2024 22:47


          fixed docstrings

2a58595


          Merge branch 'branch-24.12' into scalar_quantization

513137f

lowener reviewed

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)

cjnolet requested changes

Member

cjnolet left a comment

I think we are almost there! The remaining changes from my side are mostly topical: hiding templates, naming, and namespacing.

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)

cpp/include/cuvs/preprocessing/quantization.hpp Outdated (resolved)


          refactor namespace

ee7ac7a

mfoerste4 requested a review from a team as a code owner

December 4, 2024 14:28


          add missing files

45cabf9

mfoerste4 requested a review from cjnolet

December 4, 2024 16:14


          Merge branch 'branch-24.12' into scalar_quantization

2d2d1a0

cjnolet approved these changes

Member

cjnolet commented Dec 5, 2024

/merge

rapids-bot bot merged commit b051f80 into rapidsai:branch-24.12

55 checks passed

Labels

CMake cpp feature request non-breaking