Leaderboard April 3 release #309

HuanzhiMao · 2024-04-04T08:01:39Z

This PR is for the leaderboard April 3 release:

Fix possible answers in the evaluation dataset, including the issues reported in #301 (Data issues identified in Gorilla leaderboard test dataset during data sanity checks).
Implement string standardization for the AST evaluation pipeline, i.e. remove whitespace and a subset of punctuation (,./-_*^), to make the AST evaluation more robust and accurate.
Fix AST evaluation issue for type tuple.
Fix AST evaluation issue for Java and JavaScript.
Add 2 new models, meetkai/functionary-small-v2.4 (FC) and meetkai/functionary-medium-v2.4 (FC), to the leaderboard.
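
For illustration, the string standardization item above could look roughly like the sketch below. It is not the code from this PR: the helper names `standardize_string` and `strings_match` are hypothetical, and the sketch only strips whitespace and the listed punctuation (,./-_*^). Whether the real pipeline also normalizes case or quotes is not stated in this description, so neither is done here.

```python
import re
from typing import Iterable

# Characters removed before comparison: any whitespace plus , . / - _ * ^
# (the subset of punctuation named in the PR description).
_STRIP_PATTERN = re.compile(r"[\s,./\-_*^]")


def standardize_string(value: str) -> str:
    """Hypothetical helper: drop whitespace and the listed punctuation."""
    return _STRIP_PATTERN.sub("", value)


def strings_match(model_output: str, possible_answers: Iterable[str]) -> bool:
    """Hypothetical helper: compare a model's string argument against the
    accepted answers after standardizing both sides."""
    normalized = standardize_string(model_output)
    return any(normalized == standardize_string(answer) for answer in possible_answers)


if __name__ == "__main__":
    # Formatting-only differences no longer cause a mismatch.
    print(strings_match("San Francisco, CA", ["SanFrancisco CA"]))
    print(strings_match("2024-04-03", ["2024/04/03"]))
```

With this kind of normalization, answers that differ only in spacing or separator characters compare as equal, which is the robustness gain the item describes. The trade-off is that strings that differ only in the stripped characters are also treated as the same value.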

This PR DOES change the leaderboard score. We will update the leaderboard website shortly, in a different PR.

Co-authored-by: Charlie Cheng-Jie Ji
Co-authored-by: Fanjia Yan

berkeley-function-call-leaderboard/README.md

CharlieJCJ

LGTM

This PR updates the leaderboard data, as mentioned in #309. This PR **DOES** change the leaderboard value.

This PR updates the evaluation metric in our leaderboard blog to be in sync with #309, as the AST evaluation pipeline has been updated. This PR **does not** change the leaderboard value.

Co-authored-by: Shishir Patil

HuanzhiMao added 2 commits April 3, 2024 22:59

April 3 release

e49dc7d

update change log with correct PR number

b4e054f

ShishirPatil requested changes Apr 4, 2024


berkeley-function-call-leaderboard/README.md (outdated, resolved)

HuanzhiMao and others added 2 commits April 4, 2024 01:14

update readme

c1c22ac

add functionary setup instructions

416f644

HuanzhiMao mentioned this pull request Apr 4, 2024

Update leaderboard data with April 3 release #310

Merged

HuanzhiMao requested a review from ShishirPatil April 4, 2024 22:57

HuanzhiMao mentioned this pull request Apr 4, 2024

Data issues identified in Gorilla leaderboard test dataset during data sanity checks #301

Closed

CharlieJCJ approved these changes Apr 5, 2024


ShishirPatil approved these changes Apr 5, 2024


ShishirPatil merged commit 82f8fc5 into ShishirPatil:main Apr 5, 2024

ShishirPatil pushed a commit that referenced this pull request Apr 5, 2024

Update leaderboard data with April 3 release (#310)

1adf7d5

This PR updates the leaderboard data, as mentioned in #309. This PR **DOES** change the leaderboard value.

HuanzhiMao mentioned this pull request Apr 5, 2024

Update blog8 evaluation metric section #319

Merged
