Skip to content

Conversation

@yiliu30
Copy link
Contributor

@yiliu30 yiliu30 commented Feb 2, 2024

Type of Change

Feature

Description

Half-Quadratic Quantization (HQQ) Intro:

HQQ requiring no calibration data, significantly speeds up the quantization of large models, while offering compression quality competitive with that of calibration-based methods. For instance, HQQ takes less than 5 minutes to process the colossal Llama-2-70B, that’s over 50x faster compared to the widely adopted GPTQ.

Usage:

from neural_compressor.torch.quantization import get_default_hqq_config, quantize
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
q_model = quantize(model, quant_config=get_default_hqq_config())

How has this PR been tested?

Local test and UTs

Dependency Change?

  • numpy
  • psutil
  • torch

Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
@yiliu30 yiliu30 added the PyTorch Related to PyTorch F/W label Feb 2, 2024
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
@wenhuach21 wenhuach21 self-requested a review February 7, 2024 01:12
@yiliu30 yiliu30 merged commit 07f940c into master Feb 7, 2024
@yiliu30 yiliu30 deleted the ly/hqq3 branch February 7, 2024 01:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Algorithm: HQQ PyTorch Related to PyTorch F/W

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants