Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the submitted issue lacks environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve it, reducing the likelihood of receiving feedback.
Describe the bug
The GPT-OSS function call returns incorrect results when streaming is enabled.
Reproduction
Start the server:
TM_ANOMALY_HANDLER="level=2,inf=65504,nan=0" \
lmdeploy serve api_server \
    /models/gpt-oss-120b \
    --model-name gpt-oss-120b \
    --tp 4
Send a request:
#!/usr/bin/env python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:10028/v1',
    api_key='1',
)

is_stream = True
stream = client.chat.completions.create(
    model='gpt-oss-120b',
    messages=[
        {
            "role": "user",
            "content": "What is the weather like in Paris today?"
        }
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current temperature for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and country e.g. Bogotá, Colombia"
                        }
                    },
                    "required": [
                        "location"
                    ],
                    "additionalProperties": False
                },
                "strict": True
            }
        }
    ],
    stream=is_stream,
)

if not is_stream:
    print(stream)
else:
    # Accumulate tool-call argument deltas by index.
    tool_calls = {}
    for response_chunk in stream:
        delta_tool_calls = response_chunk.choices[0].delta.tool_calls
        if delta_tool_calls:
            for tool_call_chunk in delta_tool_calls:
                call_index = tool_call_chunk.index
                if call_index not in tool_calls:
                    tool_calls[call_index] = tool_call_chunk
                else:
                    tool_calls[call_index].function.arguments += tool_call_chunk.function.arguments
    print(tool_calls[0].model_dump_json())
When is_stream=False is specified, it returns the correct result, like this:
ChatCompletion(id='5', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-tBD4PDRhQZmnTyUVnxrGDk', function=Function(arguments='{\n "location": "Paris, France"\n}', name='get_weather'), type='function')], gen_tokens=None, reasoning_content='User asks for weather in Paris today. We need to fetch current weather via get_weather function. Use location string "Paris, France".'))], created=1759056400, model='gpt-oss-120b', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=56, prompt_tokens=140, total_tokens=196, completion_tokens_details=None, prompt_tokens_details=None))
However, with is_stream=True it outputs:
{"index": 0, "id": "chatcmpl-tool-c5dsGyb3sNAFV4pXE3w92k", "function": {"arguments": "{\n{\n \" \"locationlocation\":\": \" \"ParisParis,, France France\"\n\"\n}}", "name": "get_weather"}, "type": "function"}
which leads to incorrect function calls.
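The garbled string is consistent with every streamed delta being appended twice: doubling each token of the correct arguments string reproduces the observed output exactly. A minimal check (the token split below is a hypothetical illustration, not necessarily the server's actual tokenization):

```python
# Hypothetical token-by-token deltas of the correct arguments string;
# the exact split is an assumption, but any split shows the same effect
# if each delta is emitted twice.
deltas = ['{\n', ' "', 'location', '":', ' "', 'Paris', ',', ' France', '"\n', '}']

correct = ''.join(deltas)          # what the non-streaming path returns
garbled = ''.join(d + d for d in deltas)  # each delta appended twice

print(repr(correct))  # '{\n "location": "Paris, France"\n}'
print(repr(garbled))  # matches the broken streaming output above
```

This suggests parse_streaming emits each tool-call argument chunk twice, and the client-side accumulation then concatenates the duplicates.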
The bug appears to be related to GptOssChatParser.parse_streaming(), around this line:
delta_tool_call = DeltaToolCall(index=base_index,
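For reference, a correct streaming parser should emit only the not-yet-sent suffix of the accumulated arguments in each delta, so that client-side concatenation recovers the full string. A minimal sketch of that invariant (hypothetical helper, not lmdeploy's actual implementation):

```python
class DeltaEmitter:
    """Emit only the suffix of the text that has not been sent yet."""

    def __init__(self):
        self._sent = 0

    def next_delta(self, full_text: str) -> str:
        # full_text is the accumulated arguments string so far;
        # return only the portion generated since the last call.
        delta = full_text[self._sent:]
        self._sent = len(full_text)
        return delta


emitter = DeltaEmitter()
chunks = []
# Simulated snapshots of the accumulated arguments during generation.
for snapshot in ['{\n "', '{\n "location": "', '{\n "location": "Paris, France"\n}']:
    chunks.append(emitter.next_delta(snapshot))

# Concatenating the deltas client-side recovers the full arguments string.
assert ''.join(chunks) == '{\n "location": "Paris, France"\n}'
```

If parse_streaming instead re-emits text it has already sent, the client-side accumulation in the repro script produces exactly the duplicated output shown above.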
Environment
lmdeploy==0.10.1
Error traceback