Commit 8e43aaa

Add vLLM e2e tests (#117)
* add first test
* update tests
* update to use config files
* update test
* update to add int8 tests
* update
* fix condition
* fix typo
* add w8a16
* update
* update to clear session and delete dirs
* conditional import for vllm
* update
* update num samples
* add more test cases; add custom recipe support
* update model
* update recipe modifier
* Update fp8_weight_only.yaml
* add more test cases
* try a larger model
* revert
* add description; save model to hub post testing
1 parent ac673b5 commit 8e43aaa

20 files changed: +305 -29 lines changed

tests/e2e/vLLM/__init__.py

Whitespace-only changes.
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: FP8_DYNAMIC
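
The config files in this commit are flat `key: value` documents. As a rough sketch of how a test harness might read one, the snippet below parses that flat shape into a dataclass. The names `E2EConfig`, `parse_flat_config`, and `needs_calibration_data` are hypothetical and do not come from the commit; the tiny parser is a stand-in for a real YAML loader, and the rule "a config that names a dataset needs calibration data" is an assumption based only on which keys appear in the files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class E2EConfig:
    # Field names mirror the keys that appear in the config files in this commit.
    cadence: str
    test_type: str
    model: str
    scheme: str
    recipe: Optional[str] = None
    dataset_id: Optional[str] = None
    dataset_split: Optional[str] = None

    @property
    def needs_calibration_data(self) -> bool:
        # Assumption: a config that names a dataset expects calibration samples.
        return self.dataset_id is not None


def parse_flat_config(text: str) -> E2EConfig:
    """Parse flat `key: value` lines (a stand-in for a real YAML loader)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    return E2EConfig(**fields)
```

For the four-line config above, `parse_flat_config` would yield a config whose `needs_calibration_data` is false, since no `dataset_id` is given.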
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: FP8
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+recipe: tests/e2e/vLLM/recipes/FP8/recipe_fp8_weight_only_channel.yaml
+scheme: FP8A16_channel
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+recipe: tests/e2e/vLLM/recipes/FP8/recipe_fp8_weight_only_per_tensor.yaml
+scheme: FP8A16_tensor
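
Some configs carry only a `scheme`, while others (such as the two FP8A16 configs above) also point at a `recipe` file. One plausible reading, sketched below, is that an explicit recipe takes precedence and the scheme name then acts as a label for the test case. This precedence rule and the helper name `quantization_source` are assumptions for illustration; the commit's test code is not shown here.

```python
from typing import Mapping


def quantization_source(config: Mapping[str, str]) -> str:
    """Describe where the quantization settings would come from (hypothetical helper)."""
    recipe = config.get("recipe")
    if recipe:
        # Assumption: an explicit recipe file takes precedence over the scheme
        # name, which then serves only as a label for the test case.
        return f"recipe file: {recipe}"
    # No recipe given: fall back to the named preset scheme.
    return f"preset scheme: {config['scheme']}"
```

Under this reading, a config with `scheme: FP8_DYNAMIC` and no `recipe` resolves to the preset, whereas the FP8A16 configs resolve to their recipe paths.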
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+recipe: tests/e2e/vLLM/recipes/INT8/recipe_int8_channel_weight_static_per_tensor_act.yaml
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
+scheme: W8A8_channel_weight_static_per_tensor
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: W8A8
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+recipe: tests/e2e/vLLM/recipes/INT8/recipe_int8_tensor_weight_static_per_tensor_act.yaml
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
+scheme: W8A8_tensor_weight_static_per_tensor_act
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: W4A16_channel
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
+recipe: tests/e2e/vLLM/recipes/WNA16/recipe_w4a16_channel_quant.yaml
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: W4A16
+dataset_id: HuggingFaceH4/ultrachat_200k
+dataset_split: train_sft
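
Every config in this commit is tagged `cadence: "nightly"`. A minimal sketch of how a runner could use that tag to pick which configs to execute follows. The function name `select_configs` is hypothetical, and treating `cadence` as a selection filter is an assumption; the commit itself only shows the key and its value.

```python
from typing import Iterable, List, Mapping


def select_configs(
    configs: Iterable[Mapping[str, str]], cadence: str
) -> List[Mapping[str, str]]:
    """Keep only the configs whose `cadence` matches the requested run type."""
    # Configs without a `cadence` key are skipped rather than treated as a match.
    return [cfg for cfg in configs if cfg.get("cadence") == cadence]
```

With the nine configs above, `select_configs(all_configs, "nightly")` would keep all of them, while any other cadence value would select none.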
