[Core] Improve Tensor serialisation #18774

lgeiger · 2025-05-27T18:27:44Z

This improves v1 tensor serialisation by relying directly on torch.frombuffer, which removes the need for a temporary numpy array and makes the code a bit easier to read.

Here's a small micro benchmark to verify that this is also faster:

import numpy as np
import torch

from vllm.v1.serial_utils import MsgpackDecoder, MsgpackEncoder

encoder = MsgpackEncoder()
tensor_decoder = MsgpackDecoder(torch.Tensor)
numpy_decoder = MsgpackDecoder(np.ndarray)

array = np.random.rand(1, 3, 896, 896).astype(np.float32)
tensor = torch.tensor(array)
encoded_tensor = encoder.encode(tensor)
encoded_array = encoder.encode(array)

main:

Tensor encode/decode
5.44 μs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.22 μs ± 9.13 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Array encode/decode
5.46 μs ± 4.99 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.43 μs ± 4.35 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

This PR:

Tensor encode/decode
5.03 μs ± 3.81 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.41 μs ± 7.52 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Array encode/decode
5.02 μs ± 23.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.45 μs ± 3.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
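For context, the difference between the two decode paths can be illustrated with a minimal sketch. This is not the code in vllm.v1.serial_utils; the payload, shape, and variable names are assumptions made up for illustration. It only shows that torch.frombuffer can build a tensor straight from a byte buffer, without an intermediate numpy array.

```python
import numpy as np
import torch

# Hypothetical payload: the raw bytes of a float32 array, roughly what a
# decoder might receive. A bytearray is used because it is writable.
shape = (2, 3)
source = np.arange(6, dtype=np.float32).reshape(shape)
payload = bytearray(source.tobytes())

# Path A: wrap the buffer in a temporary numpy array, then convert to a tensor.
via_numpy = torch.from_numpy(np.frombuffer(payload, dtype=np.float32).reshape(shape))

# Path B: build the tensor directly from the buffer with torch.frombuffer.
direct = torch.frombuffer(payload, dtype=torch.float32).view(shape)

# Both paths should yield the same values, shape, and dtype.
same = torch.equal(via_numpy, direct)
```

Both tensors share memory with the underlying buffer, so neither path copies the data in this sketch.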

Signed-off-by: Lukas Geiger <[email protected]>

github-actions · 2025-05-27T18:27:54Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

njhill

Thanks @lgeiger, LGTM

njhill · 2025-05-28T21:06:18Z

I'm going to revert this since it broke some tests: #18857. We can look into the reason and open another PR as needed.

lgeiger · 2025-05-28T22:19:35Z

Oh, so sorry about that. I can have a look at it in a bit.

lgeiger · 2025-05-28T22:35:55Z

#18860 should fix it.

[Core] Improve Tensor serialisation

2eb164a

Signed-off-by: Lukas Geiger <[email protected]>

lgeiger requested review from WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners May 27, 2025 18:27

mergify bot added the v1 label May 27, 2025

njhill approved these changes May 27, 2025

View reviewed changes

njhill added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 27, 2025

DarkLight1337 merged commit d73a945 into vllm-project:main May 28, 2025
72 checks passed

lgeiger deleted the improve-tensor-serialisation branch May 28, 2025 02:48

njhill added a commit that referenced this pull request May 28, 2025

Revert "[Core] Improve Tensor serialisation (#18774)"

6818cd0

This reverts commit d73a945.

lgeiger mentioned this pull request May 28, 2025

[Bugfix] Ensure tensors are contiguous during serialisation #18860

Merged
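The follow-up title refers to tensor contiguity. As a rough illustration only, and not the actual change in #18860, the sketch below shows why contiguity can matter when taking the raw bytes of a tensor: a strided view does not occupy a dense block of memory, so it is copied with .contiguous() before being reinterpreted as bytes. The helper name tensor_to_bytes and the example tensors are hypothetical.

```python
import torch


def tensor_to_bytes(t: torch.Tensor) -> bytes:
    """Hypothetical helper: return the raw bytes of a CPU tensor in logical order."""
    # .contiguous() copies only if needed; .flatten() guarantees at least 1 dimension.
    flat = t.contiguous().flatten()
    # Reinterpret the elements as bytes without changing the underlying data.
    return flat.view(torch.uint8).numpy().tobytes()


base = torch.arange(8, dtype=torch.float32)
strided = base[::2]  # a view with stride 2, so it is not contiguous

payload = bytearray(tensor_to_bytes(strided))

# Decode the bytes again and compare with the original strided view.
restored = torch.frombuffer(payload, dtype=torch.float32).view(strided.shape)
round_trip_ok = torch.equal(restored, strided)
```

Under these assumptions the payload holds only the four selected elements, not the eight elements of the base tensor.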

amitm02 pushed a commit to amitm02/vllm that referenced this pull request Jun 1, 2025

[Core] Improve Tensor serialisation (vllm-project#18774)

5397cda

Signed-off-by: Lukas Geiger <[email protected]> Signed-off-by: amit <[email protected]>

minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025

[Core] Improve Tensor serialisation (vllm-project#18774)

8c916d6

Signed-off-by: Lukas Geiger <[email protected]> Signed-off-by: minpeter <[email protected]>
