Commit 019c6bb

cleanup
Signed-off-by: Brian Dellabetta <[email protected]>
1 parent 04cfb78 commit 019c6bb

9 files changed: +25 -31 lines changed

tests/e2e/vLLM/recipes/INT8/recipe_int8_channel_weight_dynamic_per_token.yaml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ quant_stage:
     SmoothQuantModifier:
       smoothing_strength: 0.8
     GPTQModifier:
-      ignore: ["lm_head", "re:vision_tower.*", "re:multi_modal_projector.*", "re:visual.*", "re:vision_model.*"]
+      ignore: ["lm_head"]
       config_groups:
         group_0:
           weights: {num_bits: 8, type: int, symmetric: true, strategy: channel}

tests/e2e/vLLM/recipes/INT8/recipe_int8_tensor_weight_static_per_tensor_act.yaml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ quant_stage:
     SmoothQuantModifier:
       smoothing_strength: 0.8
     QuantizationModifier:
-      ignore: [lm_head]
+      ignore: ["lm_head", "re:vision_tower.*", "re:multi_modal_projector.*", "re:visual.*", "re:vision_model.*"]
       config_groups:
         group_0:
           weights: {num_bits: 8, type: int, symmetric: true, strategy: tensor}
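
The two recipe edits above move the multimodal ignore patterns from the GPTQ recipe to the static per-tensor recipe. For context, here is a minimal sketch (not part of this commit) of how such a recipe file is typically applied with llm-compressor's oneshot entrypoint; the model id, calibration dataset, and sample count are illustrative assumptions:

from llmcompressor import oneshot  # in older releases: llmcompressor.transformers

# Apply the INT8 recipe above in one-shot (post-training) mode.
# All argument values below are placeholders for illustration.
oneshot(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # assumed example model
    recipe="tests/e2e/vLLM/recipes/INT8/recipe_int8_tensor_weight_static_per_tensor_act.yaml",
    dataset="ultrachat_200k",             # calibration data for SmoothQuant
    num_calibration_samples=512,          # illustrative sample count
    output_dir="model-INT8",
)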

tests/lmeval/configs/int8_w8a8_dynamic_per_token.yaml

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 cadence: "weekly"
 model: meta-llama/Meta-Llama-3-8B-Instruct
-scheme: INT8_dyn_per_token
 recipe: tests/e2e/vLLM/recipes/INT8/recipe_int8_channel_weight_dynamic_per_token.yaml
 dataset_id: HuggingFaceH4/ultrachat_200k
 dataset_split: train_sft

tests/lmeval/configs/vl_int8_w8a8_dynamic_per_token.yaml

Lines changed: 0 additions & 22 deletions
This file was deleted.

(new file)

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+cadence: "weekly"
+model: Qwen/Qwen2.5-VL-7B-Instruct
+model_class: TraceableQwen2_5_VLForConditionalGeneration
+recipe: tests/e2e/vLLM/recipes/INT8/recipe_int8_tensor_weight_static_per_tensor_act.yaml
+dataset_id: lmms-lab/flickr30k
+dataset_split: "test[:512]"
+lmeval:
+  model: "hf-multimodal"
+  model_args:
+    dtype: bfloat16
+    add_bos_token: True
+    convert_img_format: True
+  task: mmmu_val_literature
+  num_fewshot: 0
+  use_stderr_atol: True
+  batch_size: 8
+  # dense model achieves 0.9 accuracy
+  metrics:
+    acc,none: 0.667
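
The new config above wires a quantized Qwen2.5-VL checkpoint into lm-evaluation-harness. A rough sketch, under the assumption that lm_eval's simple_evaluate API is driven with the same arguments the config lists (the base model id stands in for the quantized checkpoint path):

import lm_eval

# Mirror the lmeval block of the config above; values copied from it,
# with the base model id standing in for the quantized checkpoint.
results = lm_eval.simple_evaluate(
    model="hf-multimodal",
    model_args={
        "pretrained": "Qwen/Qwen2.5-VL-7B-Instruct",
        "dtype": "bfloat16",
        "add_bos_token": True,
        "convert_img_format": True,
    },
    tasks=["mmmu_val_literature"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["mmmu_val_literature"].get("acc,none"))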

tests/lmeval/configs/vl_w4a16_actorder_weight.yaml

Lines changed: 1 addition & 2 deletions
@@ -3,8 +3,7 @@ model: Qwen/Qwen2.5-VL-7B-Instruct
 model_class: TraceableQwen2_5_VLForConditionalGeneration
 recipe: tests/e2e/vLLM/recipes/actorder/recipe_w4a16_actorder_weight_dampfrac1e-1.yaml
 dataset_id: lmms-lab/flickr30k
-dataset_split: "test[:256]"
-scheme: W4A16_actorder_group
+dataset_split: "test[:512]"
 lmeval:
   model: "hf-multimodal"
   model_args:
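
The quoted dataset_split values such as "test[:512]" use the Hugging Face datasets slicing syntax; as a quick illustration (assuming the datasets library):

from datasets import load_dataset

# "test[:512]" selects only the first 512 examples of the test split,
# matching the calibration budget set in the configs above.
ds = load_dataset("lmms-lab/flickr30k", split="test[:512]")
print(len(ds))  # 512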

tests/lmeval/configs/w4a16_actorder_group.yaml

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 cadence: "weekly"
 model: meta-llama/Meta-Llama-3-8B-Instruct
-scheme: W4A16_actorder_group
 recipe: tests/e2e/vLLM/recipes/actorder/recipe_w4a16_actorder_group.yaml
 dataset_id: HuggingFaceH4/ultrachat_200k
 dataset_split: train_sft

tests/lmeval/configs/w4a16_actorder_weight.yaml

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 cadence: "weekly"
 model: meta-llama/Meta-Llama-3-8B-Instruct
-scheme: W4A16_actorder_weight
 recipe: tests/e2e/vLLM/recipes/actorder/recipe_w4a16_actorder_weight.yaml
 dataset_id: HuggingFaceH4/ultrachat_200k
 dataset_split: train_sft

tests/lmeval/test_lmeval.py

Lines changed: 3 additions & 2 deletions
@@ -59,8 +59,9 @@ class TestLMEval:
     W4N16 with channel quantization). To add a new test case, a new config has to be
     added to the lm_eval_configs folder. The tests run on a cadence defined by the
     `cadence` field. Each config defines the model to quantize. Optionally, a dataset
-    id and split can be provided for calibration. Finally, all config files must list
-    a scheme. The scheme can be a preset scheme from
+    id and split can be provided for calibration.
+    Either a recipe or a scheme should be provided. If a recipe is not provided, the
+    config file must list a scheme. The scheme can be a preset scheme from
     https://github.com/neuralmagic/compressed-tensors/blob/main/src/compressed_tensors/quantization/quant_scheme.py
     or another identifier which can be used for the particular test case. If a recipe
     is not provided, it is assumed that the scheme provided is a preset scheme and will
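
The revised docstring makes the recipe/scheme contract explicit: a recipe takes precedence, and the scheme is only required as a fallback. A minimal sketch of that resolution rule, assuming a hypothetical resolve helper and the preset lookup that compressed-tensors exposes:

from compressed_tensors.quantization import preset_name_to_scheme  # assumed import path

def resolve_quantization(config: dict):
    """Hypothetical helper mirroring the docstring: recipe wins, else scheme."""
    if config.get("recipe"):
        # A recipe YAML fully specifies the modifiers; scheme becomes optional.
        return config["recipe"]
    # No recipe: the scheme must name a preset such as W4A16 or W8A8.
    return preset_name_to_scheme(config["scheme"], targets=["Linear"])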
