Commit 7d9204c

Update blog8 evaluation metric section (#319)
This PR updates the evaluation metric in our leaderboard blog to be in sync with #309, as the AST evaluation pipeline has been updated. This PR **does not** change the leaderboard value.

---------

Co-authored-by: Shishir Patil <[email protected]>
1 parent 5838a21 commit 7d9204c

1 file changed, +7 -2 lines changed

blogs/8_berkeley_function_calling_leaderboard.html

Lines changed: 7 additions & 2 deletions
@@ -99,7 +99,7 @@ <h4 class="text-center" style="margin: 0px;">
 <p></p>
 </h4>
 </div>
-<b><i style="font-size: 1.0em;">Last updated: 2024-04-01</i></b>
+<b><i style="font-size: 1.0em;">Last updated: 2024-04-01 <a href="https://github.com/ShishirPatil/gorilla/tree/main/berkeley-function-call-leaderboard#changelog">[Change Log]</a></i></b>
 <br></br>
 
 <div class="preview">
@@ -702,6 +702,11 @@ <h5>Parameter Type & Value Matching</h5>
 <li style="margin-bottom: 5px;">For <code>String</code>:
 <ul>
 <li>The evaluation process is <i>case-insensitive</i>.</li>
+<li>All strings will be standardized before checking. This applies to both the model output and the possible answers. </li>
+<ul>
+<li>All white space is removed.</li>
+<li>A subset of punctuations <code>,./-_*^</code> are removed to make the evaluation more robust and accurate.</li>
+</ul>
 <li style="margin-bottom: 5px;">Possible date
 <code>["20th June", "2023-06-20", "06/20/2023", "Jun.20, 2023"]</code>
 </li>
@@ -1868,4 +1873,4 @@ <h4 id="citation">Citation</h4>
 subMenu.classList.toggle('expanded');
 parentItem.classList.toggle('expanded');
 }
-</script>
+</script>
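
The string rules added in the second hunk (case-insensitive comparison, all white space removed, and the punctuation subset `,./-_*^` removed, applied to both the model output and the possible answers) can be sketched roughly as follows. This is only an illustration of the rules as the blog text states them, not the leaderboard's actual checker: the names `standardize_string` and `strings_match` are hypothetical, and the regular expression assumes the removed characters are exactly those listed in the diff.

```python
import re

# Assumption: the removed characters are exactly white space plus the
# punctuation subset quoted in the blog text: , . / - _ * ^
_REMOVED_CHARS = re.compile(r"[\s,./\-_*^]")


def standardize_string(value: str) -> str:
    """Hypothetical helper: strip white space and the listed punctuation,
    then lowercase so the comparison is case-insensitive."""
    return _REMOVED_CHARS.sub("", value).lower()


def strings_match(model_output: str, possible_answers: list[str]) -> bool:
    """Hypothetical helper: True if the standardized model output equals
    any standardized possible answer."""
    target = standardize_string(model_output)
    return any(target == standardize_string(ans) for ans in possible_answers)


possible_dates = ["20th June", "2023-06-20", "06/20/2023", "Jun.20, 2023"]
print(strings_match("jun 20 2023", possible_dates))  # matches "Jun.20, 2023"
print(strings_match("June 20", possible_dates))      # no standardized match
```

Under these assumptions, standardization only makes formatting differences irrelevant; a differently worded answer such as `"June 20"` still has to appear among the possible answers to count as a match.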
