[Bug] gpt-oss streaming function calling error #4014

@QwertyJack

Description

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

The GPT-OSS function call returns incorrect results when streaming is enabled.

Reproduction

Start the server:

TM_ANOMALY_HANDLER="level=2,inf=65504,nan=0" \
        lmdeploy serve api_server \
        /models/gpt-oss-120b \
        --model-name gpt-oss-120b \
        --tp 4

Send a request:

#!/usr/bin/env python

from openai import OpenAI


client = OpenAI(
    base_url='http://localhost:10028/v1',
    api_key='1',
)

is_stream = True
stream = client.chat.completions.create(
    model='gpt-oss-120b',
    messages=[
        {
            "role": "user",
            "content": "What is the weather like in Paris today?"
        }
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current temperature for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and country e.g. Bogotá, Colombia"
                        }
                    },
                    "required": [
                        "location"
                    ],
                    "additionalProperties": False
                },
                "strict": True
            }
        }
    ],
    stream=is_stream,
)

if not is_stream:
    # Non-streaming: the full ChatCompletion arrives in a single response.
    print(stream)
else:
    # Streaming: accumulate tool-call argument deltas keyed by tool-call index.
    tool_calls = {}
    for response_chunk in stream:
        delta_tool_calls = response_chunk.choices[0].delta.tool_calls
        if delta_tool_calls:
            for tool_call_chunk in delta_tool_calls:
                call_index = tool_call_chunk.index
                if call_index not in tool_calls:
                    tool_calls[call_index] = tool_call_chunk
                else:
                    tool_calls[call_index].function.arguments += tool_call_chunk.function.arguments
    print(tool_calls[0].model_dump_json())

When is_stream=False is specified, the request returns the correct result:

ChatCompletion(id='5', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-tBD4PDRhQZmnTyUVnxrGDk', function=Function(arguments='{\n  "location": "Paris, France"\n}', name='get_weather'), type='function')], gen_tokens=None, reasoning_content='User asks for weather in Paris today. We need to fetch current weather via get_weather function. Use location string "Paris, France".'))], created=1759056400, model='gpt-oss-120b', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=56, prompt_tokens=140, total_tokens=196, completion_tokens_details=None, prompt_tokens_details=None))

However, with is_stream=True, the script above prints:

{"index": 0, "id": "chatcmpl-tool-c5dsGyb3sNAFV4pXE3w92k", "function": {"arguments": "{\n{\n   \" \"locationlocation\":\": \" \"ParisParis,, France France\"\n\"\n}}", "name": "get_weather"}, "type": "function"}

which leads to incorrect function calls.

The bug appears to be in GptOssChatParser.parse_streaming(), at this DeltaToolCall construction:

delta_tool_call = DeltaToolCall(index=base_index,

Environment

lmdeploy==0.10.1

Error traceback
