Commit 9df5c34

[BFCL Chore] Fix Functionary Medium 3.1 model name & add readme parallel inference (#577)
Changes:
- Fix Functionary Medium 3.1 model version name in `eval_runner_helper.py`
- Add parallel inference documentation to the README

Co-authored-by: Huanzhi (Hans) Mao <[email protected]>

Parent: de8307b

File tree

2 files changed: +8 −4 lines changed

berkeley-function-call-leaderboard/README.md

Lines changed: 7 additions & 3 deletions
@@ -66,11 +66,12 @@ If decided to run OSS model, the generation script uses vllm and therefore requi
 
  ### Generating LLM Responses
 
- Use the following command for LLM inference of the evaluation dataset with specific models
+ Use the following command for LLM inference of the evaluation dataset with specific models.
 
  ```bash
- python openfunctions_evaluation.py --model MODEL_NAME --test-category TEST_CATEGORY
+ python openfunctions_evaluation.py --model MODEL_NAME --test-category TEST_CATEGORY --num-threads 1
  ```
+ You can optionally specify the number of threads to use for *parallel inference* by setting the `--num-threads` flag to speed up inference for **hosted models**, not applicable for OSS models.
 
  For available options for `MODEL_NAME` and `TEST_CATEGORY`, please refer to the [Models Available](#models-available) and [Available Test Category](#available-test-category) section below.
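The `--num-threads` flag added here enables thread-based parallel inference for hosted models, with a default of 1 (sequential). A minimal sketch of the idea, using Python's `ThreadPoolExecutor` with a stand-in `query_model` function (hypothetical; the evaluation script's actual internals may differ):

```python
from concurrent.futures import ThreadPoolExecutor

def query_model(prompt):
    # Stand-in for a hosted-model API call (hypothetical helper).
    return f"response to {prompt!r}"

def run_inference(prompts, num_threads=1):
    # num_threads=1 means no parallel inference, matching the flag's default.
    if num_threads == 1:
        return [query_model(p) for p in prompts]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves input order even though calls overlap in time.
        return list(pool.map(query_model, prompts))

results = run_inference(["q1", "q2", "q3"], num_threads=2)
```

Threads (rather than processes) fit here because hosted-model calls are I/O-bound; this is also why the flag does not apply to OSS models, whose vllm-based generation manages its own batching.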

@@ -222,7 +223,7 @@ Some companies have proposed some optimization strategies in their models' handl
 
  * [August 8, 2024] [#574](https://github.com/ShishirPatil/gorilla/pull/574): Set temperature to 0.001 for all models for consistency and reproducibility.
  * [August 7, 2024] [#571](https://github.com/ShishirPatil/gorilla/pull/571): Support parallel inference for hosted models. User can specify the number of threads to use for parallel inference by setting the `--num-threads` flag. The default is 1, which means no parallel inference.
- * [August 6, 2024] [#569](https://github.com/ShishirPatil/gorilla/pull/569), [#570](https://github.com/ShishirPatil/gorilla/pull/570): Add the following new models to the leaderboard:
+ * [August 6, 2024] [#569](https://github.com/ShishirPatil/gorilla/pull/569), [#570](https://github.com/ShishirPatil/gorilla/pull/570), [#573](https://github.com/ShishirPatil/gorilla/pull/573): Add the following new models to the leaderboard:
    * `open-mistral-nemo-2407`
    * `open-mistral-nemo-2407-FC-Any`
    * `open-mistral-nemo-2407-FC-Auto`
@@ -234,6 +235,9 @@ Some companies have proposed some optimization strategies in their models' handl
    * `gpt-4o-mini-2024-07-18-FC`
    * `gpt-4o-2024-08-06`
    * `gpt-4o-2024-08-06-FC`
+   * `meetkai/functionary-medium-v3.1-FC`
+   * `meetkai/functionary-small-v3.1-FC`
+   * `meetkai/functionary-small-v3.2-FC`
  * [August 5, 2024] [#568](https://github.com/ShishirPatil/gorilla/pull/568): Rephrase the question prompt for the `executable_parallel_function` category to remove potentially misleading information implying multi-turn function calls.
  * [August 4, 2024] [#557](https://github.com/ShishirPatil/gorilla/pull/557): Bug fix in the possible answers.
    * simple: 7 affected

berkeley-function-call-leaderboard/eval_checker/eval_runner_helper.py

Lines changed: 1 addition & 1 deletion
@@ -253,7 +253,7 @@
      "MIT",
  ],
  "meetkai/functionary-medium-v3.1-FC": [
-     "Functionary-Medium-v3.0 (FC)",
+     "Functionary-Medium-v3.1 (FC)",
      "https://huggingface.co/meetkai/functionary-medium-v3.1",
      "MeetKai",
      "MIT",
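The corrected entry lives in a display-name metadata table keyed by model identifier. A minimal sketch of how such a mapping might be structured and queried, with field order inferred from the visible values (the actual table in `eval_runner_helper.py` may differ):

```python
# Maps API model identifier -> [display name, model URL, organization, license].
MODEL_METADATA = {
    "meetkai/functionary-medium-v3.1-FC": [
        "Functionary-Medium-v3.1 (FC)",  # corrected display name (was v3.0)
        "https://huggingface.co/meetkai/functionary-medium-v3.1",
        "MeetKai",
        "MIT",
    ],
}

def display_name(model_id):
    # Look up the leaderboard display name for a model identifier.
    return MODEL_METADATA[model_id][0]
```

A mismatch here only affects how the model is labeled on the leaderboard, which is why the fix is a one-line string change.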
