[Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM #7962

sroy745 · 2024-08-28T18:40:08Z

Add documentation about the lossless guarantees of speculative decoding in vLLM. The documentation records the findings from issue #7627.

@cadedaniel

Pull from head

github-actions · 2024-08-28T18:40:20Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

Comment /ready on the PR
Add ready label to the PR
Enable auto-merge.

🚀

sroy745 · 2024-08-28T20:03:50Z

/ready

njhill · 2024-08-29T19:52:42Z

docs/source/models/spec_decode.rst

+3. **vLLM Logprob Stability**
+   - vLLM currently does not guarantee stable log probabilities (logprobs) across different batch sizes, which might 
+   cause small variations in output probabilities. 
+   This issue may stem from non-deterministic behavior in batched operations or numerical instability in Torch operations,
+   as explained in the `Numerical Accuracy section <https://pytorch.org/docs/stable/notes/numerical_accuracy.html#batched-computations-or-slice-computations>`_.


@sroy745 this isn't specific to spec decoding; it applies generally whenever concurrent requests are batched differently. I guess it would be good to have a dedicated section explaining that too...

I added a section for this in serving/faq.rst. I could not find a more generic place for it, and since, as you mentioned, it is not specific to spec decode, the serving FAQ seemed the best fit. I also added a link to it in this subsection. I am not sure if this is what you meant. PTAL and let me know.
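To illustrate the point in this thread, here is a minimal sketch of why the grouping of a floating-point reduction can change its result. It is only an analogy for batch-size-dependent kernels and is not vLLM or Torch code; the helper names `naive_sum` and `chunked_sum` and the input values are hypothetical and chosen purely for illustration.

```python
def naive_sum(values):
    """Add floats strictly left to right, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk_size):
    """Reduce fixed-size chunks first, then add the partial sums.

    Different chunk sizes stand in for different batch sizes: the math is
    the same, but the order of floating-point additions is not.
    """
    partials = [
        naive_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


# Hypothetical inputs picked so that rounding is visible in float64.
values = [1e16, 1.0, -1e16, 1.0]

one_chunk = chunked_sum(values, 4)   # ((1e16 + 1) - 1e16) + 1
two_chunks = chunked_sum(values, 2)  # (1e16 + 1) + (-1e16 + 1)

print(one_chunk, two_chunks, one_chunk == two_chunks)
```

Neither grouping recovers the exact answer of 2, and the two groupings disagree with each other. Differences of this kind in logits are usually far smaller, but they can still shift logprobs slightly or flip a near-tie between tokens.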

cadedaniel · 2024-08-29T22:21:46Z

docs/source/models/spec_decode.rst

+
+2. **Algorithmic Losslessness**
+   - vLLM’s implementation of speculative decoding is algorithmically validated to be lossless when the
+   temperature parameter (`temp`) is set to 0. Key tests include:


The rejection sampler convergence tests also handle the case where temperature is nonzero and/or other sampling parameters are applied.

Done. Removed mention of temperature = 0 in the comment.
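The reason losslessness does not depend on temperature can be sketched with the standard speculative-sampling acceptance rule: accept a drafted token x with probability min(1, p(x)/q(x)), otherwise resample from the normalized residual max(p - q, 0). The sketch below is not vLLM's rejection sampler or its convergence tests; the function names and the toy distributions are hypothetical. It assumes `target` is whatever distribution the target model samples from after temperature and other sampling parameters have been applied.

```python
import random


def speculative_output_distribution(target, draft):
    """Exact distribution of the token emitted by one accept/reject step."""
    # Token x is drafted with prob q(x) and accepted with prob
    # min(1, p(x)/q(x)), so its accepted mass is min(p(x), q(x)).
    accepted = [min(p, q) for p, q in zip(target, draft)]
    reject_prob = 1.0 - sum(accepted)
    # On rejection, resample from the normalized residual max(p - q, 0).
    residual = [max(p - q, 0.0) for p, q in zip(target, draft)]
    residual_mass = sum(residual)
    if residual_mass == 0.0:  # draft equals target, nothing is rejected
        return accepted
    return [
        a + reject_prob * r / residual_mass
        for a, r in zip(accepted, residual)
    ]


def speculative_sample(target, draft, rng):
    """Draw one token using the same accept/reject rule."""
    x = rng.choices(range(len(draft)), weights=draft)[0]
    if rng.random() < min(1.0, target[x] / draft[x]):
        return x
    residual = [max(p - q, 0.0) for p, q in zip(target, draft)]
    return rng.choices(range(len(residual)), weights=residual)[0]


# Hypothetical toy distributions over a three-token vocabulary.
target = [0.5, 0.3, 0.2]
draft = [0.2, 0.2, 0.6]

exact = speculative_output_distribution(target, draft)
print(exact)  # matches `target` up to floating-point rounding

rng = random.Random(0)
n = 20000
counts = [0, 0, 0]
for _ in range(n):
    counts[speculative_sample(target, draft, rng)] += 1
print([c / n for c in counts])  # empirical frequencies, close to `target`
```

Because the accepted mass min(p, q) and the residual mass max(p - q, 0) add up to p for every token, the emitted token follows the target distribution for any draft distribution, which is why the guarantee is not limited to temperature 0.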

sroy745

Thanks for the review. Addressed your comments.

njhill

Thanks @sroy745, looks great, sorry for the late review!

sroy745 and others added 27 commits May 28, 2024 20:39

Merge pull request #1 from vllm-project/main

5650b95

Pull from head

Merge branch 'vllm-project:main' into main

8f36146

Merge branch 'vllm-project:main' into main

9e75057

Merge branch 'vllm-project:main' into main

db2c679

Merge branch 'vllm-project:main' into main

8d7512c

Merge branch 'vllm-project:main' into main

1473f74

Merge branch 'vllm-project:main' into main

4013e1a

Merge branch 'vllm-project:main' into main

2dbdd78

Merge branch 'vllm-project:main' into main

b3575e9

Merge branch 'vllm-project:main' into main

94b0d43

Merge branch 'vllm-project:main' into main

fa8fedf

Merge branch 'vllm-project:main' into main

6ed96b4

Merge branch 'vllm-project:main' into main

b71c533

Merge branch 'vllm-project:main' into main

57babef

Merge branch 'vllm-project:main' into main

4b19bac

Merge branch 'vllm-project:main' into main

eb7a1c4

Merge branch 'vllm-project:main' into main

7e2c87e

Merge branch 'vllm-project:main' into main

6212d5f

Merge branch 'vllm-project:main' into main

5491438

Merge branch 'vllm-project:main' into main

68e080a

Merge branch 'vllm-project:main' into main

55e4332

Merge branch 'vllm-project:main' into main

532eb48

Merge branch 'vllm-project:main' into main

7cea056

Merge branch 'vllm-project:main' into main

185e056

Merge branch 'vllm-project:main' into main

e2be95f

Merge branch 'vllm-project:main' into main

2ed5473

Add lossless guarantees of SD in vLLM

085dea8

sroy745 marked this pull request as draft August 28, 2024 18:40

Fix formatting

322463d

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Aug 28, 2024

sroy745 and others added 3 commits August 28, 2024 20:07

small comment fix

37e2cc5

small comment fix

f6606d5

Merge branch 'vllm-project:main' into main

efa4714

njhill reviewed Aug 29, 2024

Merge branch 'vllm-project:main' into main

fb87d34

cadedaniel reviewed Aug 29, 2024

sroy745 added 7 commits August 30, 2024 21:05

Address comments

a4ce5b8

Fix format

b6e58c9

Fix format

7004b00

Fix format

fbec1be

Fix format

25190a0

Fix format

db12986

Fixes

0f745e3

sroy745 commented Aug 30, 2024

sroy745 and others added 3 commits August 31, 2024 10:16

Merge branch 'vllm-project:main' into main

5419e49

Merge remote-tracking branch 'origin/main' into vllm-spec-decode-doc

c76a4e2

Fix comment

77d42c5

mgoin approved these changes Sep 5, 2024

mgoin merged commit 2febcf2 into vllm-project:main Sep 5, 2024
23 checks passed

sroy745 deleted the vllm-spec-decode-doc branch September 5, 2024 20:46

njhill reviewed Sep 5, 2024

dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request Sep 12, 2024

[Documentation][Spec Decode] Add documentation about lossless guarant…

5336fa5

…ees in Speculative Decoding in vLLM (vllm-project#7962)

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Documentation][Spec Decode] Add documentation about lossless guarant…

b8aaf26

…ees in Speculative Decoding in vLLM (vllm-project#7962) Signed-off-by: Alvant <[email protected]>

LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025

[Documentation][Spec Decode] Add documentation about lossless guarant…

a6e5bce

…ees in Speculative Decoding in vLLM (vllm-project#7962) Signed-off-by: LeiWang1999 <[email protected]>
