Releases: ShishirPatil/gorilla
Berkeley Function Calling Leaderboard Updates (v1.3)
Highlights
🏆 Stable release of Berkeley Function Calling Leaderboard V3 with Multi-step and Multi-turn function call evaluation
What's Changed
- Gorilla README and repo structure revamp by @CharlieJCJ in #799
- [BFCL] Fix live_parallel_multiple_9-8-0 copy-paste issue by @pkesseli in #865
- [BFCL] Fix Typo in multi_turn_base_34 Ground Truth by @HuanzhiMao in #876
- Adding New Model Haha-7B by @ZydHaha in #858
- [BFCL Chore] Implement retry_with_backoff for Amazon Nova Handler by @HuanzhiMao in #880
- [BFCL] Fix live_simple_183-108-0 by @pkesseli in #872
- [BFCL] Fix live_simple_165-98-0 by @pkesseli in #871
- [BFCL] Fix live_simple_44-18-0 and live_simple_45-18-1 by @pkesseli in #870
- [BFCL] Fix Nova Handler for Consecutive User Prompt Issue by @HuanzhiMao in #881
- Add support for QwQ and Sky-T1-32B-Preview by @SumanthRH in #888
- add handler for Bielik by @dominikabasaj in #887
- [BFCL Chore] Align Score File id with Result File Test Case IDs by @HuanzhiMao in #893
- Fix minor typo in default system prompt without func by @canyon289 in #895
- Falcon3 support by @kirill-fedyanin in #894
- [BFCL] Update tool construction for Palmyra models by @samjulien in #897
- Added compute_exchange_rate to multi_turn_base entry 180 ground truth by @Raymond112514 in #892
- [BFCL] Add New Model o3-mini-2025-01-31 and o3-mini-2025-01-31-FC by @HuanzhiMao in #898
- Add CALM models by @jgreer013 in #900
- [BFCL] Add New Model gemini-2.0-flash-001, gemini-2.0-flash-lite-preview-02-05, gemini-2.0-pro-exp-02-05. by @HuanzhiMao in #902
- chore: added snippet for hf datasets compatibility by @alt-glitch in #906
- Update model_metadata.py by @jgreer013 in #907
- Rename CALM to CoALM by @jgreer013 in #913
- Bitagent 8b submission by @VectorForger in #917
- Bitagent 8b Metadata Change by @VectorForger in #919
- [BFCL] Add New Model gpt-4.5-preview-2025-02-27, gpt-4.5-preview-2025-02-27-FC by @HuanzhiMao in #922
- [BFCL] fix bug in how score_dir is handled for bfcl evaluate by @liamcli in #924
- [BFCL] Add New Model DeepSeek-R1 by @HuanzhiMao in #901
- Make all import paths absolute. by @fvisin in #935
- Move logic to eval a task in a separate function. by @fvisin in #933
- Fix Gorilla Paper requirements.txt Location to Remove Global Dependency Confusion by @HuanzhiMao in #937
- [BFCL] Add _unused Suffix to Unused Dataset Files in the BFCL Benchmark by @HuanzhiMao in #938
- [BFCL] Support Local Inference for deepseek-ai/DeepSeek-R1 by @HuanzhiMao in #926
- [BFCL] Add Support for Qwen2.5 Models in Function Calling Mode by @HuanzhiMao in #925
- [BFCL] Add New Model claude-3-7-sonnet-20250219, claude-3-7-sonnet-20250219-FC by @HuanzhiMao in #923
- [BFCL] Add handler and meta info for ToolACE-2-8B by @XuHwang in #941
- [BFCL] Reorganized All constant.py Files to a constants Folder by @catherineruoxiwu in #944
- [BFCL] Add New Models gemini-2.0-flash-lite-001, gemini-2.0-flash-thinking-exp-01-21 by @HuanzhiMao in #942
- [BFCL] Add Google Gemma-3 Series Models by @HuanzhiMao in #939
- [BFCL] Move model_metadata.py to constants folder by @catherineruoxiwu in #949
- Add Cohere Command A by @harry-cohere in #951
- Reformatted Supported Model Table by @JasonHuang1103 in #961
- [BFCL] Use HTTPS instead of HTTP for OMDB by @hrshtv in #960
- [BFCL] Fix ambiguity in exec_parallel_10 question by @amitojsingh2022 in #962
- [BFCL] Fix API Keys Handling by @catherineruoxiwu in #959
- [BFCL] Fix wrong date in live_simple_205-116-13 by @amitojsingh2022 in #963
- [BFCL] Moved Ground Truths for Executable Tests to ./data/possible_answer Folder by @catherineruoxiwu in #953
- [BFCL] Reorganizing Codes in ./bfcl/eval_checker/executable_eval/data/ by @catherineruoxiwu in #954
- [BFCL] Add gemini-2.5-pro to the Leaderboard by @catherineruoxiwu in #974
- [BFCL] Update Retry Logic for Gemini Models by @HuanzhiMao in #976
- [BFCL] Fix Typo in multi_turn_base_166 Ground Truth. by @HuanzhiMao in #979
- Add Salesforce xLAM-2 series of model handlers and update vLLM version from 0.6.3 to 0.6.5 by @zuxin666 in #972
- [BFCL] Retire Executable Categories from Leaderboard by @HuanzhiMao in #943
- feat. Add Novita LLM Models API by @novita-viktor in #980
- [BFCL] Add New Models Llama-4-Scout, Llama-4-Maverick by @HuanzhiMao in #981
- [BFCL] Add Support for Fully Offline Model Inference via --local-model-path by @catherineruoxiwu in #985
- Fix Typo in Model Name for xLAM-2-8b-fc-r by @HuanzhiMao in #992
- Add ThinkAgents/ThinkAgent-1B by @0xayman in #928
- [BFCL] Add Grok 3 Models to the Leaderboard by @catherineruoxiwu in #987
- [BFCL] Add mistral-large-2411 and mistral-small-2503 by @pracheeti12 in #988
- Add xiaoming-14B by @kevin2016 in #977
- [BFCL] Retire Outdated Models from the Leaderboard by @catherineruoxiwu in #997
- [BFCL] add support for microsoft/Phi-4-mini-instruct by @RobotSail in #967
- [BFCL] Add microsoft/phi-4 to the Leaderboard by @catherineruoxiwu in #1000
- [BFCL] Add GPT 4.1 Series Models to the Leaderboard. by @catherineruoxiwu in #1002
- Bump writer-sdk Dependency Version by @HuanzhiMao in #1006
- [BFCL] add model config by @itea1001 in #999
- [BFCL] Add Validation for Model Names by @catherineruoxiwu in #1008
- [BFCL] Update Error Message for New Handler Mappings by @catherineruoxiwu in #1013
- [BFCL] fix entry id typo in live_multiple_1052-279-0 by @itea1001 in #1022
- Update QwQ-32b api by @CostaliyA in #1014
- Migrate to correct testing API by @emmanuel-ferdman in #1029
- Add gemini-2.5-pro-preview-05-06 Models by @Guangyu-Joshua-Feng in #1031
- [BFCL] Add Qwen 3 Series Models to the Leaderboard by @catherineruoxiwu in #1015
- [BFCL] Remove latency data for open source models by @errorfourten in #1033
- fix treesitter setup by @CharlieJCJ in #1045
- [BFCL] Added support for Mistral Medium 3 by @errorfourten in #1040
- New colab links for gorilla hosted and openfunctions hosted by @ShishirPatil in #1036
- [BFCL] Add version to bfcl CLI by @ShishirPatil in #1038
- Add DM-Cito-8B by @kevin2016 in #1017
- fix: loosen openai requirements to be >= 1.76.0 by @TheFloatingString in #1050
- [BFCL] Packagerize for PyPI Distribution by @HuanzhiMao in #1054
- [BFCL] CI: Add “Publish to PyPI” workflow with CalVer-serial auto-versioning by @HuanzhiMao in https://github.com/ShishirPatil/gorill...
Berkeley Function Calling Leaderboard Updates (v1.2)
Highlights
🏆 Berkeley Function Calling Leaderboard V3 with Multi-step and Multi-turn function call evaluation
What's Changed
- [BFCL] Package the Codebase by @devanshamin in #565
- Added python script named as raft_local.py to raft directory to run script completely locally using HF models by @himanshushukla12 in #605
- RAFT Enhancements: Improved robustness, logging, checkpointing, threading, Llama support, Azure auth and eval by @cedricvidal in #604
- Fix/merge commit #605 and #604 by @ShishirPatil in #609
- Fix issue #614: [BFCL] ModuleNotFoundError after commit 70d6722 by @kobe0938 in #615
- Fix some bugs in test case prompts/ground truths by @aw632 in #608
- [BFCL] Dataset and Possible Answer Fix by @HuanzhiMao in #600
- Add Salesforce xLAM model series by @zuxin666 in #616
- Update gemini_handler.py to better handle NL+FC model output by @vandyxiaowei in #617
- [BFCL] Fix Decoding Issue in Nvidia Handler by @HuanzhiMao in #623
- [BFCL] Fix Llama Handler by @HuanzhiMao in #626
- [BFCL] add MadeAgents/Hammer-7b handler by @linqq9 in #627
- [BFCL] Refactor Model Handler into OSS and Proprietary Components by @devanshamin in #612
- [BFCL] Hot Fix to Remove Extra Parameters for NoAPIKeyError by @HuanzhiMao in #636
- fix: bug for glm prompt format by @zhangch-ss in #638
- [BFCL] Add New Model o1-preview-2024-09-12 and o1-mini-2024-09-12 by @HuanzhiMao in #635
- [BFCL] BFCL v3 by @HuanzhiMao in #644
- removed unnecessary comments in raft/raft_local.py by @himanshushukla12 in #654
- [BFCL] Chore: Separate Change Log. by @HuanzhiMao in #648
- [BFCL] Bug Fix inference_single_turn_FC function for base_handler by @HuanzhiMao in #656
- [BFCL] Bug Fix parse_nested_value function for model_handler utils by @VishnuSuresh27 in #660
- added Phi-3 handlers by @AndyChenYH in #640
- Update agent arena frontend and evals by @NithikYekollu in #666
- [BFCL] Speed Up Locally-hosted Model Inference Process by @HuanzhiMao in #671
- [BFCL] Fix Hanging Inference for OSS Models on GPU Platforms by @HuanzhiMao in #663
- [BFCL] Add gemini-1.5-pro-002, gemini-1.5-pro-002-FC, gemini-1.5-pro-001, gemini-1.5-pro-001-FC, gemini-1.5-flash-002, gemini-1.5-flash-002-FC, gemini-1.0-pro-002, gemini-1.0-pro-002-FC by @HuanzhiMao in #658
- [BFCL] Add Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct, Llama-3.1-70B-Instruct by @HuanzhiMao in #657
- [BFCL] Add ToolACE handler for BFCL-v3 by @XuHwang in #653
- Add Qwen handler and fix mean_latency calculation error for OSS models by @zhangch-ss in #642
- update README.md by @leosun12 in #669
- [BFCL] Chore: Various Improvements and Adjustments by @HuanzhiMao in #673
- [BFCL] Chore: Refactor File Path Handling and Automate apply_function_credential_config.py by @HuanzhiMao in #675
- docs: update README.md by @eltociear in #676
- [BFCL-v3] Multi-Turn Possible Answer Order Change by @Fanjia-Yan in #679
- update hammer handler and add Hammer2.0 model by @linqq9 in #667
- [BFCL] Chore: Improve Multi Turn Error Logs by @HuanzhiMao in #689
- Update google-cloud-aiplatform dependency by @jieru-hu in #677
- add minicpm3 4b by @Cppowboy in #633
- [BFCL-v2] Dataset and Possible Answer Fix by @HuanzhiMao in #661
- [BFCL] Add Gemma-2 models by @jacovkim in #696
- add a basic bfcl command-line interface by @mattf in #621
- Fixing BFCL-v3 multi-turn apps by @virginie-do in #701
- [BFCL v1] Update Executable Ground Truth for REST Category by @CharlieJCJ in #708
- [BFCL v1] Rephrase Question for Better Clarity for Java & JavaScript Categories by @HuanzhiMao in #709
- [BFCL] Add SGLang Backend Support for OSS Local Inference by @hnyls2002 in #587
- (typo):I've made some corrections to your repository to improve clarity by @PrathameshSPawar in #713
- docs: Centered the Image by @bhargavshirin in #680
- [BFCL] Multi Turn Dataset and Possible Answer Fix by @HuanzhiMao in #683
- [BFCL] Chore: Separate out Func Doc for Multi-Turn Categories by @HuanzhiMao in #717
- [BFCL] Multi Turn Dataset and Possible Answer Fix (Base Category) by @HuanzhiMao in #719
- [BFCL] Multi Turn Dataset Fix (Function Doc) by @HuanzhiMao in #722
- [BFCL] Multi Turn Dataset Fix (Base Category) by @HuanzhiMao in #723
- [BFCL] Multi Turn Pipeline Robustness Patch by @HuanzhiMao in #724
- [BFCL] Small typo in variable name in travel_booking.py by @daanaea in #731
- [BFCL] Patch #724 by @HuanzhiMao in #730
- [BFCL] Multi Turn Dataset Fix (Miss Func & Long Context) by @HuanzhiMao in #728
- [BFCL] Multi Turn Dataset Fix (Miss Param) by @HuanzhiMao in #732
- [BFCL] Update Eval Metric for Multi Turn Irrelevance Scenarios by @HuanzhiMao in #725
- [BFCL] Remove duplicate in eval_runner.py by @ThomasRochefortB in #735
- [BFCL] Support Dynamic max_tokens for Locally-Hosted Models by @HuanzhiMao in #712
- [BFCL] Refine Evaluation Metric for Multi Turn Categories by @HuanzhiMao in #733
- [BFCL] Adding New Model GoGoAgent by @RogueTensor in #720
- [BFCL] Chore: Improve Inference Log Readability by @HuanzhiMao in #746
- [BFCL Dataset Revamp 1/n] Multi-Turn (Part 1) by @Fanjia-Yan in #740
- [BFCL] Robustness Patch for _multi_threaded_inference by @HuanzhiMao in #754
- [BFCL] Prompt Caching for Claude Models by @VishnuSuresh27 in #751
- [BFCL Dataset Revamp 2/n] Live Dataset Fix (Simple, Parallel, Parallel Multiple) by @Fanjia-Yan in #737
- [BFCL Dataset Revamp 3/n] Live Dataset Fix (Multiple) by @Fanjia-Yan in #739
- Update google-cloud-aiplatform version to 1.72.0 by @gabrielibagon in #760
- [BFCL] Minor Grammatical Corrections to DEFAULT_SYSTEM_PROMPT by @HuanzhiMao in #747
- [BFCL] Remove Llama-3.2-3B-Instruct-FC and Llama-3.2-1B-Instruct-FC from Leaderboard by @HuanzhiMao in #749
- [BFCL Chore] Supply data_multi_turn.csv for Multi-Turn Evaluation Results by @HuanzhiMao in #762
- [BFCL] Remove Workaround Patch for Vertex AI Package by @HuanzhiMao in #761
- Add exponential retry logic for gemini models by @gabrielibagon in #764
- [BFCL] Remove Duplicate Line in record_cost_latency by @HuanzhiMao in #767
- Fix handling of examples with no tools in Gemini by @gabrielibagon in #770
- Remove stop condition in gemini retry logic by @gabrielibagon in #769
- Skip adding empty content from gemini by @gabrielibagon in #768
- [BFCL] Add the option to log to WandB during bfcl evaluate by @ThomasRochefortB in #736
- [BFCL] Add claude-3-5-haiku-20241022, claude-3-5-haiku-20241022-FC, claude-3-5-sonnet-20241022, `claude-3-5-sonnet-202...
Berkeley Function Calling Leaderboard Updates (v1.1)
Highlights
🏆 Berkeley Function Calling Leaderboard V2 along with Live data
What's Changed
- Added Agent Arena Frontend Client to Gorilla Repository by @NithikYekollu in #586
- [BFCL] Add BFCL_V2_Live Dataset by @HuanzhiMao in #580
- Create an issue template for BFCL by @ShishirPatil in #599
- [BFCL] Relocate Formatting Instructions and Function Documentation to System Prompt by @HuanzhiMao in #593
Full Changelog: v1.0...v1.1
Berkeley Function Calling Leaderboard Updates (v1.0)
Highlights
🏆 We are thrilled to announce the stable v1.0 release of the Berkeley Function Calling Leaderboard data-set and eval-pipeline! A heartfelt thank you to all our contributors and users for your enthusiastic engagement and support throughout v1. We are just getting started! Buckle-up for v2 🚀 🚀 🚀
What's Changed
- better handle float value comparison by @vandyxiaowei in #407
- Bump pymysql from 1.1.0 to 1.1.1 in /goex by @dependabot in #453
- Fixes For NexusHandler by @VenkatKS in #437
- [BFCL] PR#407 Evaluation Pipeline Robustness Patch by @HuanzhiMao in #462
- Add firefunction-v2 to the leaderboard by @pgarbacki in #470
- [BFCL] Add Claude 3.5 Sonnet Function Calling Inference by @Fanjia-Yan in #480
- [BFCL] Standardize Model Name Among handler_map and eval_runner_helper by @HuanzhiMao in #439
- Remove redundant tokens from GPT-handler by @hellovai in #490
- [GoEx] Undo Minor Bug Fix + README Minor Improvement by @royh02 in #468
- [BFCL] Add ability to evaluate Nemotron-4-340B-Instruct by @Fanjia-Yan in #489
- fix some data issues in parallel/parallel multiple answers by @vandyxiaowei in #423
- [BFCL] Add Support for GLM-4-9B function calling inference by @Fanjia-Yan in #474
- [BFCL] Sanity check is now optional by @ShishirPatil in #496
- [BFCL] Improved tree-sitter java, javascript installation by @CharlieJCJ in #505
- [BFCL] Fix Possible Answer for AST Parallel and Parallel_Multiple Category by @HuanzhiMao in #503
- [BFCL] Add Test Dataset to Repository by @HuanzhiMao in #504
- [BFCL] Support Category-Specific Generation for OSS Model, Remove eval_data_compilation Step by @HuanzhiMao in #512
- [BFCL] Fix Double-Casting Issue in model_handler for Java and JS category. by @HuanzhiMao in #516
- [BFCL] Fix Dataset Issue for executable_parallel_multiple Category by @HuanzhiMao in #522
- [BFCL] add ibm-granite-20b-functioncallling model by @MayankAgarwal in #525
- [BFCL] Overhaul apply_function_credential_config.py for Enhanced Usability by @HuanzhiMao in #508
- Fixed the warning message "Setting pad_token_id to eos_token_id:1… by @dineshkumarsarangapani in #110
- [BFCL] Specify package version in requirements.txt by @HuanzhiMao in #515
- [BFCL] Standardize TEST_CATEGORY Among eval_runner.py and openfunctions_evaluation.py by @HuanzhiMao in #506
- fix line return by @fantasist in #531
- [BFCL] Apply Fix to Newly Introduced Model Handler Missed in Previous PR Merge by @HuanzhiMao in #536
- [RAFT] Fix Datapoint Field in Formatter for Data Generation by @HuanzhiMao in #535
- [BFCL] Fix language_specific_pre_processing for Java and JavaScript Test Category by @HuanzhiMao in #538
- [BFCL] Patch Generation Script for Locally Hosted OSS model by @HuanzhiMao in #537
- [BFCL] Support Multi-Model Multi-Category Generation; Add Index to Dataset; Handle vLLM Benign Error by @HuanzhiMao in #540
- Add NousResearch/{Hermes-2-Pro-Llama-3-8B,Hermes-2-Theta-Llama-3-8B} models by @alonsosilvaallende in #542
- [BFCL] Fix Dataset Pre-Processing for Java and JavaScript Test Category, Part 2 by @HuanzhiMao in #545
- Add Salesforce xLAM handler and fix minor issues by @zuxin666 in #532
- Add NousResearch/Hermes-2-{Pro-Llama-3-80B,Theta-Llama-3-80B} by @alonsosilvaallende in #556
- Add Yi Handler by @fantasist in #543
- Add more descriptive error message in eval_runner.py by @alonsosilvaallende in #552
- [BFCL] Fix JS type converter to handle dictionaries with array values by @CharlieJCJ in #549
- [BFCL] Handling rate limits by @ShishirPatil in #559
- [BFCL] Fix Dataset and Possible Answer Issue by @HuanzhiMao in #557
- [BFCL] Dataset Question Fix for Executable Parallel Category by @HuanzhiMao in #568
- [BFCL] Add New Model gpt-4o-2024-08-06, gpt-4o-mini-2024-07-18 by @HuanzhiMao in #569
- [BFCL] Add New Model open-mistral-nemo-2407, open-mixtral-8x22b, open-mixtral-8x7b by @HuanzhiMao in #570
- [BFCL] Improve Warning Message when Aggregating Results by @HuanzhiMao in #517
- [BFCL] Add New Model functionary-small-v3.1, functionary-small-v3.2, functionary-medium-v3.1; Update Token Price by @HuanzhiMao in #573
- [BFCL] Set Model Temperature to 0.001 for All Models by @HuanzhiMao in #574
- [BFCL] Support Parallel Inference for Hosted Models by @HuanzhiMao in #571
- [BFCL Chore] Fix Functionary Medium 3.1 model name & add readme parallel inference by @CharlieJCJ in #577
New Contributors
- @dependabot made their first contribution in #453
- @VenkatKS made their first contribution in #437
- @pgarbacki made their first contribution in #470
- @hellovai made their first contribution in #490
- @MayankAgarwal made their first contribution in #525
- @dineshkumarsarangapani made their first contribution in #110
- @fantasist made their first contribution in #531
- @alonsosilvaallende made their first contribution in #542
Full Changelog: v0.3...v1.0
GoEx and Berkeley Function Calling Leaderboard Updates
😍 v0.3 release 🚀
Highlights
⚡️ Released GoEx: A runtime that presents abstractions for safe execution of LLM generated code, APIs, actions, etc
🏆 Updates to Berkeley Function Calling Leaderboard (aka Berkeley Tool Calling Leaderboard): Newer models including GPT-4o, gemini-flash and 1.5-pro, Hermes-2-Pro, etc. All are measured on P95 and P99 latency and on cost, in addition to accuracy.
What's Changed
- Fix Typos in Evaluation Script and System Prompt. Identify Errors in a Dataset by @zuxin666 in #335
- BFCL April 8th Release by @HuanzhiMao in #330
- Initial goex commit by @ShishirPatil in #336
- BFCL April 9th Release (Dataset Bug Fix) by @HuanzhiMao in #338
- BFCL April 10th Release (API Sanity Check) by @HuanzhiMao in #339
- Add Support for NousResearch/Hermes-2-Pro-Mistral-7B Function Calling by @Fanjia-Yan in #327
- Update raft.py with default p to match paper by @ShishirPatil in #353
- GoEx Import Issues by @royh02 in #354
- BFCL April 11th Patch. Add Latency Statistics. by @HuanzhiMao in #347
- GoEx Gitignore User Credentials by @royh02 in #344
- Fix Circular Import Issue for BFCL evaluation pipeline by @HuanzhiMao in #356
- Added Docker to README by @Noppapon in #355
- [Bug fix] Add Hermes-2-Pro-Mistral-7B model to UNDERSCORE_TO_DOT to parse API properly by @JasonZhu1313 in #364
- Update requirements.txt by @viniciuslazzari in #343
- Fix script argument by @ricklamers in #367
- BFCL April 16th Release by @HuanzhiMao in #366
- Log error messages from API validation by @eitanturok in #369
- Update .gitignore by @eitanturok in #370
- BFCL April 18th Release (Pipeline only) by @HuanzhiMao in #375
- Add missing argument to OSSHandler's _format_prompt function by @eitanturok in #373
- Add FC + Prompt for Cohere command-r-plus by @harry-cohere in #350
- BFCL April 19th Release (Dataset & Pipeline) by @HuanzhiMao in #377
- Azure OpenAI support in raft.py by @cedricvidal in #381
- BFCL April 25th Release (New Models) by @HuanzhiMao in #386
- Colored logging configuration + displaying progress in logs by @cedricvidal in #384
- BFCL April 27th Release (Bug Fix in Cost/Latency Calculation) by @HuanzhiMao in #390
- BFCL April 28th Release (New Model: snowflake/arctic) by @Fanjia-Yan in #397
- RAFT Recovery Mode for interruptions by @kaiwen129 in #410
- Small corrections to possible_answers for simple test category by @aastroza in #405
- BFCL May 6th Release (Dataset Bug Fix) by @HuanzhiMao in #412
- RAFT DevContainer for GitHub Codespaces by @cedricvidal in #379
- RAFT Add support for configuring separate completion and embedding endpoints + pytest by @cedricvidal in #396
- RAFT Fix arbitrary code execution vulnerability in checkpoint feature by @cedricvidal in #415
- handle parallel function calls from gemini by @vandyxiaowei in #406
- RAFT Support for chat and completion model formats by @cedricvidal in #417
- [RAFT] Edit encode prompt to include <ANSWER>: tag in label by @kaiwen129 in #422
- [BFCL] Patch Gemini Handler by @HuanzhiMao in #421
- BFCL May 14th Release (GPT-4o and Gemini) by @Fanjia-Yan in #426
- [BFCL] update tree_sitter version in requirements.txt by @justinwangx in #433
- Fix indentation in leaderboard README by @polm-stability in #449
- Fix breaking changes due to updated Anthropic SDK by @eitanturok in #452
New Contributors
- @zuxin666 made their first contribution in #335
- @JasonZhu1313 made their first contribution in #364
- @ricklamers made their first contribution in #367
- @eitanturok made their first contribution in #369
- @harry-cohere made their first contribution in #350
- @cedricvidal made their first contribution in #381
- @aastroza made their first contribution in #405
- @vandyxiaowei made their first contribution in #406
- @justinwangx made their first contribution in #433
- @polm-stability made their first contribution in #449
Full Changelog: v0.2...v0.3
RAFT and Berkeley Function Calling Leaderboard Updates
😍 v0.2 release 🚀
Highlights
🎯 Berkeley Function Calling Leaderboard (BFCL): How do models stack up for function calling?
- Now includes latency and cost
- More open-source and closed-source models
- Bug fixes in dataset.
RAFT: Fine-tuning technique to improve LLMs for in-domain RAG!
What's Changed
- Adding APIs of 9 Google Service to API Zoo by @meenakshi-mittal in #204
- Github Actions to Maintain API Zoo Index by @ramanv0 in #188
- Adding Zoom API to API Zoo by @meenakshi-mittal in #221
- API Zoo Index Github Actions Fix by @ramanv0 in #261
- Added Google Forms API by @elva01 in #185
- RAFT + readme + small sample dataset by @kaiwen129 in #218
- Sample data for RAFT by @ShishirPatil in #264
- Docusign Additions by @dangeo773 in #194
- [Bug Fix] Fix Executable Exact Match Condition Did not Meet by @Fanjia-Yan in #251
- [Bug Fix] Fix Error in Parallel Function Possible Answer by @Fanjia-Yan in #252
- [Bug Fix] Restrict AST checker on Boolean Variable by @Fanjia-Yan in #256
- Adding 7 Oracle APIs to API Zoo by @meenakshi-mittal in #205
- Adding Datadog API to API Zoo by @meenakshi-mittal in #206
- Added Notion APIs (Block, Page, and Database) to APIZoo by @jennifer818 in #195
- removed testing code by @kaiwen129 in #281
- feat: more type annotations for the functions by @UponTheSky in #283
- [Fix] java, javascript parsers in openfunctions-v2 by @CharlieJCJ in #284
- Leaderboard Update April 1 by @HuanzhiMao in #299
- Remove Large File from ./inference by @CharlieJCJ in #297
- Typo in raft.py by @danielfleischer in #311
- Leaderboard April 3 release by @HuanzhiMao in #309
- Support OSS Evaluation for Leaderboard by @HuanzhiMao in #318
- Update README.md by @HuanzhiMao in #320
- Fix typos by @viniciuslazzari in #323
- Correction in BFCL README instruction, fixed path in instructions by @CharlieJCJ in #329
New Contributors
- @elva01 made their first contribution in #185
- @kaiwen129 made their first contribution in #218
- @jennifer818 made their first contribution in #195
- @UponTheSky made their first contribution in #283
- @danielfleischer made their first contribution in #311
Full Changelog: v0.1...v0.2
Gorilla v0.1: OpenFunctions-v2, Berkeley Function Calling Leaderboard, and more.
😍 v0.1 release 🚀
Highlights
- 🎯 Berkeley Function Calling Leaderboard (BFCL): How do models stack up for function calling? Evaluation code for the Berkeley Function Calling Leaderboard.
- 🏆 Gorilla OpenFunctions v2: Inference examples for OpenFunctions-v2 - SoTA open-source LLM for function calling. On-par with GPT-4 🙌 Supports more languages 👌.
- API Zoo Index: An accessible collection of API documentation for humans to search through, and for LLMs to use as tools 👀
We are excited about our long-overdue v0.1 release! Here's more:
What's Changed
- Adding BM25 and GPT retrievers by @ShishirPatil in #61
- update(anthropic): #63 to (0.3.x) by @AmirAflak in #64
- Add inference support for Macbook silicon chip by @benjaminhuo in #76
- Update README.md by @eltociear in #80
- PR for Gradio WebUI Feature ([feature] Gradio webui - #102) by @TanmayDoesAI in #105
- Update README.md by @abhi-databricks in #109
- Adds wandb to eval files by @morganmcg1 in #114
- Fix use_wandb in ast eval, responses file deletion, wandb artifacts renaming by @morganmcg1 in #115
- sentence optimization in docstring and examples by @rajveer43 in #117
- Gorilla OpenFunctions by @ShishirPatil in #142
- Example on running it locally with Hugging Face 🤗 Transformers by @Danielskry in #148
- Added Gmail api to api zoo by @saikolasani in #163
- Add Google Maps API (python client) by @felixzhu555 in #164
- Add support for the OpenWeatherMap API by @aryanvichare in #159
- Stripe Additions by @dangeo773 in #169
- Added Kubernetes Pod API and Pod Template API by @saikolasani in #170
- Quantized Gorilla by @CharlieJCJ in #160
- Add a guide on how to self-host the OpenFunctions model by @ramanv0 in #157
- Private Inference using Gorilla hosted endpoint on Replicate by @ramanv0 in #162
- added yfinance api to api zoo by @raywanb in #161
- Gorilla OpenFunctions run locally in Google Colab by @meenakshi-mittal in #166
- Fixed issue with Kubernetes Pod/Pod Template filename by @saikolasani in #198
- Create openfunctions-v2 issue template by @ShishirPatil in #203
- Add support for the ServiceNow REST API by @aryanvichare in #176
- Berkeley Function Calling Leaderboard evaluation scripts and OpenFunctions v2 inference by @ShishirPatil in #215
- [Berkeley-Function-Calling-Leaderboard] Refactor leaderboard result generation and checking by @Fanjia-Yan in #223
- Update openfunctions-v2 chatting format in README.md by @tianjunz in #239
- Update BFCL README.md by @CharlieJCJ in #241
- Local Inference script for openfunctions v2 by @ShishirPatil in #242
- [Update Gemini-1.0-Pro result checker] by @Fanjia-Yan in #245
- Update project roadmap and repository structure by @ShishirPatil in #257
New Contributors
- @AmirAflak made their first contribution in #64
- @benjaminhuo made their first contribution in #76
- @TanmayDoesAI made their first contribution in #105
- @abhi-databricks made their first contribution in #109
- @morganmcg1 made their first contribution in #114
- @rajveer43 made their first contribution in #117
- @Danielskry made their first contribution in #148
- @saikolasani made their first contribution in #163
- @felixzhu555 made their first contribution in #164
- @aryanvichare made their first contribution in #159
- @dangeo773 made their first contribution in #169
- @raywanb made their first contribution in #161
- @meenakshi-mittal made their first contribution in #166
Full Changelog: v0.0.1...v0.1
Gorilla release v0.0.1
🦍 Gorilla: An API store for LLMs 🚀
🚀 After 50,000 user requests through our hosted APIs, we are happy to cut the first release for Gorilla 💪
🤩 In this release:
💻 gorilla-cli, LLMs for your CLI!
🟢 Commercially usable, Apache 2.0 licensed Gorilla models
🚀 CLI interface to chat with Gorilla!
🚀 Torch Hub and TensorFlow Hub Models!
🚀 The first Gorilla model! Colab or 🤗!
🔥 APIZoo contribution guide for community API contributions!
🔥 APIBench dataset and the evaluation code of Gorilla!