
Conversation

@ixlmar ixlmar commented May 14, 2025

chore: restore symmetry of worker start/shutdown
chore: fix return type of cal_max_tokens
chore: type some more return values
fix: free resources before re-claiming
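The `cal_max_tokens` return-type fix is of the kind sketched below. This is a hedged illustration only: the parameter names and signature are assumptions made up for the example, not the real signature from the TensorRT-LLM codebase.

```python
def cal_max_tokens(free_mem_bytes: int, bytes_per_token: int) -> int:
    # Hypothetical sketch: integer division keeps the declared int
    # return type honest; a plain "/" would silently return a float
    # despite the annotation.
    return free_mem_bytes // bytes_per_token
```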

PR title

Please write the PR title following this template:

[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>

For example, for a PR that supports a new cache manager feature under Jira ticket TRTLLM-1000, the title would be:

[TRTLLM-1000][feat] Support a new feature about cache manager

Description

This PR summarizes a few minor improvements to create_py_executor.
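The "restore symmetry of worker start/shutdown" and "free resources before re-claiming" items follow a common lifecycle pattern. A minimal sketch of that pattern is below; all names (`Worker`, `reclaim`, the resource strings) are hypothetical and are not taken from the TensorRT-LLM codebase.

```python
class Worker:
    """Hypothetical worker whose shutdown mirrors its start."""

    def __init__(self) -> None:
        self._resources: list[str] = []
        self._started = False

    def start(self) -> None:
        # Acquire resources in a well-defined order ...
        self._resources = ["threads", "kv_cache"]
        self._started = True

    def shutdown(self) -> None:
        # ... and release them in reverse acquisition order, so that
        # start() and shutdown() stay symmetric.
        while self._resources:
            self._resources.pop()
        self._started = False

    def reclaim(self) -> None:
        # Free currently held resources *before* claiming new ones,
        # rather than allocating first and leaking the old allocation.
        if self._started:
            self.shutdown()
        self.start()
```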

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with the Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
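Putting the flags together, a typical invocation is posted as a plain PR comment; the stage and GPU names below are illustrative values only:

```
/bot run --disable-fail-fast --gpu-type "H100_PCIe"
/bot run --stage-list "A10-1" --skip-test
```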

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, since lack of user care and validation can cause the top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, since lack of user care and validation can cause the top of tree to break.

@ixlmar ixlmar requested review from DomBrown and dcampora May 14, 2025 10:47
@ixlmar ixlmar marked this pull request as ready for review May 14, 2025 10:51

ixlmar commented May 14, 2025

/bot run


@DomBrown DomBrown left a comment


I would like to be absolutely sure that these None checks are not required before approving this (see comments).

I agree that the `kv_cache_manager is not None` check seems unnecessary in the places where you have removed it. I'm just wondering if there is some 'side-effect' reason for that check being in place. If not, then fine :)

Otherwise looks good

@tensorrt-cicd

PR_Github #5165 [ run ] triggered by Bot

@ixlmar ixlmar requested a review from DomBrown May 14, 2025 13:02

@DomBrown DomBrown left a comment


Comments addressed. LGTM

@tensorrt-cicd

PR_Github #5165 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3766 completed with status: 'FAILURE'


ixlmar commented May 15, 2025

/bot run

@tensorrt-cicd

PR_Github #5286 [ run ] triggered by Bot


ixlmar commented May 15, 2025

/bot kill

@ixlmar ixlmar force-pushed the chore/kv-cache-allocation branch from 85590c1 to e0760b4 Compare May 15, 2025 07:09

ixlmar commented May 15, 2025

/bot run

@tensorrt-cicd

PR_Github #5293 [ kill ] triggered by Bot

@tensorrt-cicd

PR_Github #5294 [ ] completed with state ABORTED

@tensorrt-cicd

PR_Github #5286 [ run ] completed with state ABORTED

@tensorrt-cicd

PR_Github #5293 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit e0760b4


ixlmar commented May 15, 2025

/bot run

@tensorrt-cicd

PR_Github #5295 [ run ] triggered by Bot


ixlmar commented May 15, 2025

/bot run

@tensorrt-cicd

PR_Github #5325 [ run ] triggered by Bot

@tensorrt-cicd

PR_Github #5295 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #3867 completed with status: 'FAILURE'

@tensorrt-cicd

PR_Github #5325 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3883 completed with status: 'FAILURE'

@ixlmar ixlmar force-pushed the chore/kv-cache-allocation branch from e0760b4 to c75ab28 Compare May 15, 2025 15:40

ixlmar commented May 15, 2025

/bot run

@tensorrt-cicd

PR_Github #5367 [ run ] triggered by Bot

@tensorrt-cicd

PR_Github #5367 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3918 completed with status: 'FAILURE'


ixlmar commented May 16, 2025

/bot run


ixlmar commented May 16, 2025

/bot kill

chore: restore symmetry of worker start/shutdown
chore: fix return type of cal_max_tokens
chore: type some more return values
fix: free resources before re-claiming

Signed-off-by: ixlmar <[email protected]>
@ixlmar ixlmar force-pushed the chore/kv-cache-allocation branch from c75ab28 to 6666996 Compare May 16, 2025 08:15
@tensorrt-cicd

PR_Github #5484 [ run ] triggered by Bot

@tensorrt-cicd

PR_Github #5485 [ kill ] triggered by Bot

@tensorrt-cicd

PR_Github #5484 [ run ] completed with state ABORTED

@tensorrt-cicd

PR_Github #5485 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit 6666996


ixlmar commented May 16, 2025

/bot run

@tensorrt-cicd

PR_Github #5487 [ run ] triggered by Bot

@tensorrt-cicd

PR_Github #5487 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4003 completed with status: 'SUCCESS'

@DomBrown DomBrown merged commit 13b6140 into NVIDIA:main May 16, 2025
3 checks passed
@ixlmar ixlmar deleted the chore/kv-cache-allocation branch May 16, 2025 15:36