
Conversation

@farazkh80 (Collaborator) commented May 9, 2025

[TRTLLM-4618][feat] Add E2E tests for Llama3.1-70B, Mixtral 8x7B on RTX6000 Pro (SM120) with FP16/FP8/NVFP4

Description

  • This PR adds end-to-end tests for Llama3.1-70B at FP16, FP8, and NVFP4, and for Mixtral 8x7B at NVFP4, to the TensorRT-LLM test suite, to be run on SM120.
  • The tests will be used by QA as part of the B40 Bring-up (RTX6000 Pro SM120) effort.

Test Coverage

Single node tests

  • test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-8B-BF16-llama-3.1-model/Meta-Llama-3.1-8B]
  • test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-8B-NVFP4-nvfp4-quantized/Meta-Llama-3.1-8B]
  • test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-8B-FP8-llama-3.1-model/Llama-3.1-8B-Instruct-FP8]
  • test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-70B-NVFP4-nvfp4-quantized/Meta-Llama-3.1-70B]
  • test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-70B-FP8-llama-3.1-model/Llama-3.1-70B-Instruct-FP8]
  • test_e2e.py::test_ptp_quickstart_advanced[Mixtral-8x7B-NVFP4-nvfp4-quantized/Mixtral-8x7B-Instruct-v0.1]

These tests will be included in the SM120 verification plan for QA sign-off; a sketch of the parametrized test shape these IDs imply follows the lists below.

Multi-GPU tests

  • test_e2e.py::test_ptp_quickstart_advanced_2gpus_sm120[Llama3.1-70B-BF16-llama-3.1-model/Meta-Llama-3.1-70B]
  • test_e2e.py::test_ptp_quickstart_advanced_2gpus_sm120[Mixtral-8x7B-BF16-Mixtral-8x7B-Instruct-v0.1]
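
For reference, the IDs above follow pytest's default parametrize naming (the parameter values joined with "-"). A minimal sketch of the test shape they imply, assuming a hypothetical quickstart script path, --model_dir flag, and LLM_MODELS_ROOT layout (the actual TensorRT-LLM helpers may differ):

    # Sketch only: the script path, flag name, and models-root layout are
    # assumptions for illustration, not verbatim from this PR.
    import os
    import subprocess

    import pytest

    MODELS_ROOT = os.environ.get("LLM_MODELS_ROOT", "/models")

    @pytest.mark.parametrize(
        "model_name,model_path",
        [
            ("Llama3.1-70B-FP8", "llama-3.1-model/Llama-3.1-70B-Instruct-FP8"),
            ("Llama3.1-70B-NVFP4", "nvfp4-quantized/Meta-Llama-3.1-70B"),
            ("Mixtral-8x7B-NVFP4", "nvfp4-quantized/Mixtral-8x7B-Instruct-v0.1"),
        ],
    )
    def test_ptp_quickstart_advanced(model_name, model_path):
        # Drive the quickstart example end-to-end on the local checkpoint;
        # a non-zero exit code fails the test.
        print(f"[{model_name}] running quickstart_advanced")
        subprocess.run(
            [
                "python", "examples/pytorch/quickstart_advanced.py",
                "--model_dir", os.path.join(MODELS_ROOT, model_path),
            ],
            check=True,
        )

The 2-GPU variant would additionally pass a tensor-parallelism setting (e.g. --tp_size 2, if the script exposes one), and its sm120 suffix presumably lets CI target it at the SM120 stage.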

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
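
Typical run invocations, posted as PR comments (the stage and GPU names below reuse the examples from the flag descriptions above):

    /bot run
    /bot run --disable-fail-fast
    /bot run --stage-list "A10-1"
    /bot run --gpu-type "A30, H100_PCIe"
    /bot run --post-merge
    /bot run --extra-stage "H100_PCIe-[Post-Merge]-1"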

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, since skipping validation without care can break top of tree.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, since reusing a stale pipeline without care can break top of tree.
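
Illustrative invocations of the remaining subcommands (the skip comment string is a placeholder):

    /bot kill
    /bot skip --comment "Reason for skipping build/test"
    /bot reuse-pipeline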

@farazkh80 force-pushed the b40_test_coverage branch from 7a6d196 to 1284049 on May 12, 2025 16:40
@farazkh80 marked this pull request as ready for review on May 12, 2025 18:01
@farazkh80 (Collaborator, Author)

/bot run

1 similar comment
@pamelap-nvidia (Collaborator)

/bot run

@pamelap-nvidia (Collaborator) left a comment

LG! Once you update the test case, I can help trigger a bot run with RTX 6000.

@farazkh80 force-pushed the b40_test_coverage branch from 549e544 to cc55603 on May 13, 2025 18:38
@farazkh80 requested a review from pamelap-nvidia on May 13, 2025 18:43
@pamelap-nvidia (Collaborator)

bot run --stage-list "RTXPro6000-PyTorch-[Post-Merge]-1"

@pamelap-nvidia (Collaborator)

/bot run --stage-list "RTXPro6000-PyTorch-[Post-Merge]-1"

@tensorrt-cicd (Collaborator)

PR_Github #5057 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #5057 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3682 (Partly Tested) completed with status: 'SUCCESS'

@pamelap-nvidia (Collaborator)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #5071 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #5071 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3691 completed with status: 'SUCCESS'

@farazkh80 force-pushed the b40_test_coverage branch from 4cae7f7 to 048cdb6 on May 14, 2025 15:24
@pamelap-nvidia (Collaborator)

/bot reuse-pipeline

@tensorrt-cicd (Collaborator)

PR_Github #5197 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #5197 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #5071 for commit 048cdb6

@pamelap-nvidia merged commit 42de79d into NVIDIA:main on May 14, 2025
3 checks passed