
This is totally dependent on the API provider. Compare the "smoothness" of the stream generation with OpenAI's. It mostly comes down to the number of tokens per streamed response: I believe some providers "stuff" each response with extra tokens to reduce the number of responses being transmitted. OpenAI's rate is 1 token per response, which provides a good streaming experience.
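
A quick way to see the difference for yourself is to log the size of each streamed delta. A minimal sketch, assuming the official `openai` Node SDK and an arbitrary model/prompt of your choosing:

```ts
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const stream = await client.chat.completions.create({
  model: "gpt-4o-mini", // assumption: any chat-capable model works here
  messages: [{ role: "user", content: "Describe streaming in one paragraph." }],
  stream: true,
});

// With OpenAI you'll typically see many tiny deltas (~1 token each);
// a "stuffing" provider sends fewer, larger chunks instead.
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  if (delta) console.log(`delta (${delta.length} chars):`, JSON.stringify(delta));
}
```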

The only way around this is to manipulate the stream rate and control the length of each "chunk" (total tokens per response). Despite being fast, Google is the worst offender for choppy text, so I explicitly coded in stream manipulation so that it renders more smoothly when running on LibreChat.
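
For illustration only (this is not LibreChat's actual code), the core idea is to re-chunk whatever the provider sends into small pieces and pace them out with a short delay:

```ts
// Smoothing sketch: split each large provider chunk into small,
// evenly paced pieces. chunkSize and delayMs are made-up defaults;
// tune them for the desired typing effect.
async function* smoothStream(
  source: AsyncIterable<string>,
  chunkSize = 4,
  delayMs = 15,
): AsyncGenerator<string> {
  for await (const providerChunk of source) {
    for (let i = 0; i < providerChunk.length; i += chunkSize) {
      yield providerChunk.slice(i, i + chunkSize);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

This trades a little end-to-end latency for a steady render rate: a single 200-character Google chunk becomes ~50 small pieces spread over ~750 ms instead of appearing all at once.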

LiteLLM is an ad…

Answer selected by rcdailey