feat: avoid some copies in torch formatter #7787
Merged
perf: reduce copies in TorchFormatter
This PR changes the torch formatter to avoid unnecessary copies and casts when converting decoded batches to tensors. Because many arrays are already in a torch-friendly memory layout and dtype, we can do zero-copy conversions (`torch.from_numpy`) and only fall back to `as_tensor` when a dtype/device change is required. We also consolidate lists of same-shape tensors with a cheap `stack`, but only when it is safe to do so (see the sketch below).
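A minimal sketch of the conversion logic described above (helper names like `_maybe_to_tensor` and `_consolidate` are illustrative, not the formatter's actual internals):

```python
import numpy as np
import torch

def _maybe_to_tensor(value, dtype=None, device=None):
    # Zero-copy path: a numpy array with a torch-compatible layout and
    # no requested dtype/device change can simply wrap its existing buffer.
    if isinstance(value, np.ndarray) and dtype is None and device is None:
        return torch.from_numpy(value)
    # Fallback: as_tensor copies/casts only when it actually has to.
    return torch.as_tensor(value, dtype=dtype, device=device)

def _consolidate(tensors):
    # Stack a list of tensors into a single tensor only when it is safe:
    # every element must share the same shape and dtype.
    if (
        tensors
        and all(isinstance(t, torch.Tensor) for t in tensors)
        and len({t.shape for t in tensors}) == 1
        and len({t.dtype for t in tensors}) == 1
    ):
        return torch.stack(tensors)
    return tensors
```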
Why it helps

Small benchmark script (based on #6104)
Without changes
With changes
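The collapsed benchmark output isn't reproduced here, but a rough timing script in the spirit of #6104 (the column name, shapes, and batch size below are hypothetical, not taken from the actual benchmark) might look like this:

```python
import time

import numpy as np
from datasets import Dataset

# Hypothetical data: a column of fixed-length float32 vectors.
data = {"vec": np.random.rand(10_000, 1024).astype(np.float32)}
ds = Dataset.from_dict(data).with_format("torch")

start = time.perf_counter()
for batch in ds.iter(batch_size=256):
    _ = batch["vec"]  # formatting (and any copies) happens here
print(f"iteration took {time.perf_counter() - start:.3f}s")
```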
Updated reproduction scripts
Below are some simple test cases using `main` and this `refactor-torch-formatter` branch. I've included the two scripts and their output when running on a local machine, once on `main` and once on this branch specifically.
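Independently of those scripts, the zero-copy behaviour itself is easy to see in isolation; this small snippet (not one of the scripts above) only illustrates the `from_numpy` vs cast distinction:

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy wraps the existing numpy buffer: no copy is made.
t_view = torch.from_numpy(arr)
print(np.shares_memory(arr, t_view.numpy()))  # True

# Requesting a different dtype forces as_tensor to copy/cast.
t_cast = torch.as_tensor(arr, dtype=torch.float64)
print(np.shares_memory(arr, t_cast.numpy()))  # False
```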