Description
What happened?
Describe the bug
Currently, litellm does not handle the `reasoning` field included in the `delta` object of streaming responses from Vercel AI Gateway. As a result, the reasoning-process data provided by the model is lost.
Current Behavior
When Vercel AI Gateway sends a streaming chunk that includes the `reasoning` field, the information is absent from the final response stream object returned by litellm.
Example Vercel AI Gateway Stream Chunk:
```json
{
  "data": {
    "id": "gen_01K8ZE8RV61CRN3QG42PCAXA39",
    "object": "chat.completion.chunk",
    "model": "minimax/minimax-m2",
    "choices": [
      {
        "index": 0,
        "delta": {
          "reasoning": " down why the sky appears blue, which is linked to Rayleigh scattering...",
          "reasoning_details": [...]
        },
        "finish_reason": null
      }
    ]
  }
}
```
Expected Behavior
Similar to how OpenRouter's streaming responses are handled, the value of the `reasoning` field should be parsed and exposed on litellm's standard response model, for instance as a `reasoning_content` attribute.
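For illustration, here is a hypothetical consumer of the stream under the expected behavior; the `vercel_ai_gateway/` model prefix and the `reasoning_content` attribute name are assumptions mirroring litellm's OpenRouter handling:

```python
import litellm

# Hypothetical usage sketch: the model name is illustrative.
response = litellm.completion(
    model="vercel_ai_gateway/minimax/minimax-m2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in response:
    delta = chunk.choices[0].delta
    # Expected: the provider's "reasoning" text surfaces here
    # instead of being dropped.
    print(getattr(delta, "reasoning_content", None))
```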
Cause of the issue
The file `litellm/llms/custom_llm/vercel_ai_gateway.py` lacks a dedicated ChatCompletionStreamingHandler for Vercel AI Gateway's streaming responses. Consequently, the default handler does not process provider-specific fields such as `reasoning`.
Proposed Solution
- Create a `VercelAIGatewayChatCompletionStreamingHandler` class, taking inspiration from `OpenRouterChatCompletionStreamingHandler` (a sketch follows this list).
- In the `chunk_parser` method of the new handler, add logic to map the value of `choice["delta"]["reasoning"]` to `choice["delta"]["reasoning_content"]`.
- Implement the `get_model_response_iterator` method in the `VercelAIGatewayConfig` class to return an instance of the newly created custom handler.
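Below is a minimal sketch of what this could look like, modeled on the OpenRouter handler. The import paths, the `BaseModelResponseIterator` base class, the `ModelResponseStream` constructor, and the `get_model_response_iterator` signature are assumptions about litellm internals and may need adjusting to the actual codebase:

```python
from typing import Optional

# Assumed import paths, modeled on litellm's OpenRouter streaming handler.
from litellm.llms.base_llm.base_model_iterator import BaseModelResponseIterator
from litellm.types.utils import ModelResponseStream


class VercelAIGatewayChatCompletionStreamingHandler(BaseModelResponseIterator):
    """Sketch: parse Vercel AI Gateway chunks without dropping reasoning text."""

    def chunk_parser(self, chunk: dict) -> ModelResponseStream:
        # `chunk` is assumed to be the parsed SSE payload shown in the
        # example above (i.e. the object under the "data" key).
        new_choices = []
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            # Map the provider-specific "reasoning" field onto the field
            # litellm already exposes for OpenRouter: "reasoning_content".
            if "reasoning" in delta:
                delta["reasoning_content"] = delta.pop("reasoning")
            new_choices.append(choice)
        return ModelResponseStream(
            id=chunk.get("id"),
            object="chat.completion.chunk",
            created=chunk.get("created"),
            model=chunk.get("model"),
            choices=new_choices,
        )


class VercelAIGatewayConfig:  # sketch only: the real class already exists
    def get_model_response_iterator(
        self,
        streaming_response,
        sync_stream: bool,
        json_mode: Optional[bool] = False,
    ):
        # Return the custom handler so streamed chunks go through
        # chunk_parser above instead of the default handler.
        return VercelAIGatewayChatCompletionStreamingHandler(
            streaming_response=streaming_response,
            sync_stream=sync_stream,
            json_mode=json_mode,
        )
```

Doing the mapping inside `chunk_parser` keeps the change local to the provider integration, which matches how the OpenRouter handler handles the same field.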
Relevant log output
Are you an ML Ops Team?
No
What LiteLLM version are you on?
v1.78.5
Twitter / LinkedIn details
No response