Commit 77a0cbb

HuanzhiMao authored and VishnuSuresh27 committed
[BFCL] Fix Irrelevance Category Performance for DeepSeek Coder Handler (ShishirPatil#796)
This PR updates the decoding logic of the DeepSeek-Coder handler to fix its performance in the irrelevance category. A response passes the irrelevance check if either `decode_ast` fails (raises an error) or the decoded output is empty (e.g., an empty list or an empty string). When the DeepSeek-Coder model outputs a valid function call, the model response is a list of dictionaries, `[{func1:{param1:val1,...}},{func2:{param2:val2,...}}]`, so `decode_ast` can return it without any processing. However, when the output is a plain message (not a valid function call), the `_parse_query_response_prompting` logic makes the model response that message string. The current `decode_ast` implementation treats that string as the decoded output, so neither condition of the irrelevance metric is met (no error is raised and the output is non-empty), and the response is wrongly counted as a failure.
1 parent c3520ad commit 77a0cbb

2 files changed: +22 -4 lines changed

berkeley-function-call-leaderboard/bfcl/__main__.py

Lines changed: 16 additions & 4 deletions
@@ -37,10 +37,22 @@ def list_commands(self, ctx):
 )
 
 
-# Input is like 'a,b,c,d', we need to transform it to ['a', 'b', 'c', 'd'] because that's the expected format in the actual main funciton
-handle_multiple_input = lambda x: [
-    item.strip() for item in ",".join(x).split(",") if item.strip()
-]
+def handle_multiple_input(input_str):
+    """
+    Input is like 'a,b,c,d', we need to transform it to ['a', 'b', 'c', 'd'] because that's the expected format in the actual main funciton
+    """
+    if input_str is None:
+        """
+        Cannot return None here, as typer will check the length of the return value and len(None) will raise an error
+        But when default is None, an empty list will be internally converted to None, and so the pipeline still works as expected
+        ```
+        if default_value is None and len(value) == 0:
+            return None
+        ```
+        """
+        return []
+
+    return [item.strip() for item in ",".join(input_str).split(",") if item.strip()]
 
 
 @cli.command()
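
As a rough illustration of the refactored helper's behavior, the sketch below copies `handle_multiple_input` out of the diff and calls it directly, outside of typer. It assumes the option values arrive as a list of strings, which is why the function joins before splitting; the sample inputs are made up for the example.

```python
def handle_multiple_input(input_str):
    """Flatten comma-separated option values into a list of stripped, non-empty items."""
    if input_str is None:
        # Return an empty list rather than None so that len() on the result is safe.
        return []
    return [item.strip() for item in ",".join(input_str).split(",") if item.strip()]


# Hypothetical sample inputs, for illustration only.
print(handle_multiple_input(["a,b", " c ,", "d"]))  # ['a', 'b', 'c', 'd']
print(handle_multiple_input(None))  # []
```

Blank fragments and surrounding whitespace are dropped, and a missing value now yields an empty list instead of raising inside `",".join(...)`.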

berkeley-function-call-leaderboard/bfcl/model_handler/oss_model/deepseek_coder.py

Lines changed: 6 additions & 0 deletions
@@ -21,10 +21,16 @@ def __init__(self, model_name, temperature) -> None:
 
     @overrides
     def decode_ast(self, result, language="Python"):
+        # The input is already a list of dictionaries, so no need to decode
+        # `[{func1:{param1:val1,...}},{func2:{param2:val2,...}}]`
+        if type(result) != list:
+            return []
         return result
 
     @overrides
     def decode_execute(self, result):
+        if type(result) != list:
+            return []
         return convert_to_function_call(result)
 
     @overrides
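
The effect of the `decode_ast` change on the irrelevance metric can be sketched in isolation. The two `decode_ast_*` functions below are standalone stand-ins for the handler method before and after this commit, and `passes_irrelevance_check` is a hypothetical helper that encodes the metric as described in the commit message (decoding fails, or the decoded output is empty); it is not the leaderboard's actual checker. The sample responses are invented.

```python
def decode_ast_before(result):
    # Stand-in for the old behavior: return the model response unchanged.
    return result


def decode_ast_after(result):
    # Stand-in for the new behavior: anything that is not a list decodes to "no call".
    if type(result) != list:
        return []
    return result


def passes_irrelevance_check(decode_fn, model_response):
    """Hypothetical helper: pass if decoding raises or the decoded output is empty."""
    try:
        decoded = decode_fn(model_response)
    except Exception:
        return True
    return len(decoded) == 0


# Made-up model responses, for illustration only.
message = "None of the available functions can answer this question."
function_call = [{"get_weather": {"city": "Paris"}}]

print(passes_irrelevance_check(decode_ast_before, message))       # False
print(passes_irrelevance_check(decode_ast_after, message))        # True
print(passes_irrelevance_check(decode_ast_after, function_call))  # False
```

With the old decoding, a plain-text refusal is a non-empty string and is counted as a failure; with the new decoding it becomes an empty list and passes, while a real function call still fails the irrelevance check as intended.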
