feat: add quantization #217
Conversation
LGTM, two questions
```python
    return embeddings.astype(np.float64)
elif quantize_to == DType.Int8:
    # Normalize to [-127, 127] range for int8
    scale = np.max(np.abs(embeddings)) / 127.0
```
Can this ever be 0 (zero-division issues)?
Only if all embeddings are 0
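For context, a minimal sketch of the int8 path with an explicit guard for that all-zero case. This is a hypothetical `quantize_int8` helper written for illustration, not the PR's actual function (which dispatches on `quantize_to`):

```python
import numpy as np

def quantize_int8(embeddings: np.ndarray) -> np.ndarray:
    """Symmetrically quantize float embeddings to int8."""
    # Normalize to the symmetric [-127, 127] range for int8.
    scale = np.max(np.abs(embeddings)) / 127.0
    # Guard for the degenerate case discussed above: if every value is 0,
    # the scale is 0 and the division would emit warnings / produce NaNs.
    if scale == 0.0:
        return np.zeros_like(embeddings, dtype=np.int8)
    return np.round(embeddings / scale).astype(np.int8)
```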
model2vec/quantization.py (outdated)
```python
elif quantize_to == DType.Float64:
    return embeddings.astype(np.float64)
elif quantize_to == DType.Int8:
    # Normalize to [-127, 127] range for int8
```
Should this not be [-128, 127] (the range of an 8-bit signed integer)? Not sure if it's relevant for the code, though, since it doesn't change the division.
I think the symmetry is more important than making sure the one extra value is used. I updated the comment.
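To illustrate the trade-off (my example, not from the PR): with a shared scale of max|x| / 127, a value and its negation map to opposite int8 codes, while the -128 code is simply never produced.

```python
import numpy as np

x = np.array([-1.0, -0.5, 0.5, 1.0], dtype=np.float32)
scale = np.max(np.abs(x)) / 127.0
q = np.round(x / scale).astype(np.int8)
print(q)  # [-127  -64   64  127] -- symmetric: q(-v) == -q(v); -128 unused
```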
This PR adds quantization. Quantization can be applied during distillation, or during loading. Both are equivalent, except that distill-time quantization leads to smaller embedding sizes.
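A usage sketch of the two paths described above. The entry points, the string form of `quantize_to`, and the model names are my assumptions for illustration, not confirmed API from this PR:

```python
# Sketch only: assumes this PR wires a `quantize_to` argument into both
# distill() and StaticModel.from_pretrained(); model names are placeholders.
from model2vec import StaticModel
from model2vec.distill import distill

# Distill-time quantization: embeddings are stored as int8,
# so the saved model is smaller on disk.
model = distill(model_name="BAAI/bge-base-en-v1.5", quantize_to="int8")
model.save_pretrained("my-model-int8")

# Load-time quantization: quantize a float model's embeddings while loading.
model = StaticModel.from_pretrained("minishlab/potion-base-8M", quantize_to="int8")
```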