
GPT-J GPTQ quantization bug #961

@yemyhdtrc6088

Description

Describe the bug
```
Exception has occurred: TypeError
GPTJBlock.forward() got multiple values for argument 'hidden_states'
  File "/home/zhoushengguang/workspace/llm-compressor/examples/big_models_with_accelerate/mult_gpus_int8_device_map.py", line 76, in
    oneshot(
TypeError: GPTJBlock.forward() got multiple values for argument 'hidden_states'
```

Expected behavior
The oneshot call should run GPTQ (W8A8) quantization on the GPT-J checkpoint to completion and save the compressed model to the output directory, instead of raising this TypeError.

Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 20.04]:
  2. Python version [e.g. 3.7]:
  3. LLM Compressor version or commit hash [e.g. 0.1.0, f7245c8]:
  4. ML framework version(s) [e.g. torch 2.3.1]:
  5. Other Python package versions [e.g. vLLM, compressed-tensors, numpy, ONNX]:
  6. Other relevant environment information [e.g. hardware, CUDA version]:

This is my code:
```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot
from llmcompressor.transformers.compression.helpers import calculate_offload_device_map

model_id = "gptj/checkpoint-final"

# Split the model across 2 GPUs, reserving memory for GPTQ Hessians.
device_map = calculate_offload_device_map(
    model_id, num_gpus=2, reserve_for_hessians=True, torch_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map=device_map, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

NUM_CALIBRATION_SAMPLES = 512
MAX_SEQUENCE_LENGTH = 2048

# Recipe: GPTQ W8A8 on all Linear layers, skipping the LM head.
recipe = [
    GPTQModifier(
        targets="Linear",
        scheme="W8A8",
        ignore=["lm_head"],
    ),
]

SAVE_DIR = "GptJ-6B-GPTQ/"

oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
    save_compressed=True,
    output_dir=SAVE_DIR,
)
```
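For context on the failure mode (not specific to llm-compressor internals): Python raises this kind of TypeError whenever a function receives the same parameter both positionally and as a keyword, which usually points at a wrapper or forward hook passing hidden_states twice to GPTJBlock.forward(). A minimal, self-contained sketch of the mechanism, with a purely illustrative ToyBlock standing in for GPTJBlock:

```python
import torch
import torch.nn as nn


class ToyBlock(nn.Module):
    """Stand-in for GPTJBlock: hidden_states is the first positional parameter."""

    def forward(self, hidden_states, attention_mask=None):
        return hidden_states


block = ToyBlock()
x = torch.zeros(1, 4)

# Passing hidden_states once works fine.
block(x)

# If a wrapper forwards the captured positional args *and* also injects
# hidden_states into kwargs, Python sees two values for the same parameter
# and raises the same error as in the traceback above.
args = (x,)
kwargs = {"hidden_states": x, "attention_mask": None}
try:
    block(*args, **kwargs)
except TypeError as err:
    print(err)  # ... got multiple values for argument 'hidden_states'
```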

Labels: bug