Describe the bug
While running one of the recommended recipes to compress an LLM (SmoothQuant followed by GPTQ W8A8 quantization of a Llama 3.1 8B Instruct model), oneshot fails during GPTQ compression with the Cholesky factorization error shown below.
Expected behavior
Successful compression of the LLM, with the quantized checkpoint written to output_dir.
To Reproduce
Exact steps to reproduce the behavior:
from transformers import AutoModelForCausalLM

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot

# Base model referenced in the original report
model = AutoModelForCausalLM.from_pretrained(
    "ushuradmin/usrs_extractions_16bit_llama_3_1_v1", torch_dtype="auto"
)

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(scheme="W8A8", targets="Linear", ignore=["lm_head"]),
]

oneshot(
    model=model,
    dataset=load_custom_dataset(DATA_PATH),  # custom calibration dataset loader
    recipe=recipe,
    output_dir="usrs_extractions_Meta-Llama-3.1-8B-Instruct-INT8",
    max_seq_length=2048,
    num_calibration_samples=10,
    # device_map="auto"
    # hf_token="<redacted>"
)
Errors
…:126, in GPTQWrapper.compress(self, blocksize, percdamp)
    124 diag = torch.arange(self.columns, device=self.dev)
    125 self.H[diag, diag] += damp
--> 126 self.H = torch.linalg.cholesky(self.H)
    127 self.H = torch.cholesky_inverse(self.H)
    128 self.H = torch.linalg.cholesky(self.H, upper=True)

_LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 3596 is not positive-definite).
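This error typically means the Hessian H accumulated during GPTQ calibration is near-singular: with only 10 calibration samples, many input channels are barely excited, so the small diagonal damping added at line 125 of the traceback is not enough to make H positive-definite. A minimal mitigation sketch follows, assuming the installed llmcompressor version's GPTQModifier exposes a dampening_frac parameter (which feeds the percdamp seen in the traceback); the exact parameter name and default may differ by version:

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(
        scheme="W8A8",
        targets="Linear",
        ignore=["lm_head"],
        dampening_frac=0.1,  # assumption: larger diagonal damping of H before Cholesky
    ),
]

oneshot(
    model=model,  # as loaded in the reproduction above
    dataset=load_custom_dataset(DATA_PATH),  # custom loader from the report
    recipe=recipe,
    output_dir="usrs_extractions_Meta-Llama-3.1-8B-Instruct-INT8",
    max_seq_length=2048,
    num_calibration_samples=512,  # 10 samples is likely too few to excite all channels
)

If the error persists even with heavier dampening, a calibration set that better covers the target data distribution may help, since channels never activated during calibration contribute near-zero rows to H.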