Commit 3ce30df

[Test] Add accuracy test for qwen3-8b-w8a8
Signed-off-by: hfadzxy <[email protected]>
1 parent 4312a92 commit 3ce30df

3 files changed, +15 -1 lines changed
.github/workflows/accuracy_test.yaml

Lines changed: 2 additions & 0 deletions
@@ -59,6 +59,8 @@ jobs:
             model_name: DeepSeek-V2-Lite
           - runner: a2-4
             model_name: Qwen3-Next-80B-A3B-Instruct
+          - runner: a2-1
+            model_name: Qwen3-8B-W8A8
       fail-fast: false
     # test will be triggered when tag 'accuracy-test' & 'ready-for-test'
     if: >-
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+model_name: "vllm-ascend/Qwen3-8B-W8A8"
+hardware: "Atlas A2 Series"
+tasks:
+- name: "gsm8k"
+  metrics:
+  - name: "exact_match,strict-match"
+    value: 0.80
+  - name: "exact_match,flexible-extract"
+    value: 0.82
+num_fewshot: 5
+enable_thinking: False
+quantization: ascend
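
As a rough illustration of how the thresholds in a config like this could be checked, here is a minimal sketch. The dictionary mirrors the fields added in this commit; the `check_metrics` helper, the shape of the `measured` results, and the relative tolerance of 0.03 are all assumptions made for the example and are not taken from the repository.

```python
import math

# Mirrors the fields of the YAML config added in this commit.
EVAL_CONFIG = {
    "model_name": "vllm-ascend/Qwen3-8B-W8A8",
    "hardware": "Atlas A2 Series",
    "tasks": [
        {
            "name": "gsm8k",
            "metrics": [
                {"name": "exact_match,strict-match", "value": 0.80},
                {"name": "exact_match,flexible-extract", "value": 0.82},
            ],
        }
    ],
    "num_fewshot": 5,
    "enable_thinking": False,
    "quantization": "ascend",
}


def check_metrics(eval_config, measured, rtol=0.03):
    """Return (task, metric, expected, got) tuples that fall outside tolerance.

    `measured` maps task name -> metric name -> score (hypothetical shape).
    `rtol` is an assumed relative tolerance, not a value from the commit.
    """
    failures = []
    for task in eval_config["tasks"]:
        for metric in task["metrics"]:
            expected = metric["value"]
            got = measured[task["name"]][metric["name"]]
            if not math.isclose(got, expected, rel_tol=rtol):
                failures.append((task["name"], metric["name"], expected, got))
    return failures
```

An empty list means every measured score is within the assumed tolerance of its expected value; the real test may compare scores differently.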

tests/e2e/models/test_lm_eval_correctness.py

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ def build_model_args(eval_config, tp_size):
     }
     for s in [
             "max_images", "gpu_memory_utilization", "enable_expert_parallel",
-            "tensor_parallel_size", "enforce_eager", "enable_thinking"
+            "enforce_eager", "enable_thinking", "quantization"
     ]:
         val = eval_config.get(s, None)
         if val is not None:
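
The hunk above is cut off before the body of the `if`, so the following is only a sketch of the forwarding pattern it suggests: each optional key that is present in the eval config gets copied into the model arguments. The helper name `forward_optional_args`, the assignment inside the `if`, and the `pretrained` key in the usage line are assumptions for illustration, not code from the commit.

```python
# Key list as it reads after this commit's change.
OPTIONAL_KEYS = [
    "max_images", "gpu_memory_utilization", "enable_expert_parallel",
    "enforce_eager", "enable_thinking", "quantization",
]


def forward_optional_args(eval_config, model_args):
    """Copy each optional key that is set in eval_config into model_args."""
    for key in OPTIONAL_KEYS:
        val = eval_config.get(key, None)
        # Only missing or None values are skipped, so an explicit False
        # (e.g. enable_thinking: False) is still forwarded.
        if val is not None:
            model_args[key] = val  # assumed body; not visible in the diff
    return model_args


# Keys outside OPTIONAL_KEYS (here num_fewshot) are left out of the result.
args = forward_optional_args(
    {"quantization": "ascend", "enable_thinking": False, "num_fewshot": 5},
    {"pretrained": "vllm-ascend/Qwen3-8B-W8A8"},
)
```

With this pattern, adding `quantization` to the list is what lets the `quantization: ascend` entry in the new config reach the model arguments.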
