Releases: DaveCoDev/evaluate-ai

v0.5.1

02 Feb 16:07

v0.5.1 Pre-release
  • Added strip_thinking function to better support deepseek-r1
  • Added a probability meets criteria evaluation
  • Minor bug fixes and improvements
  • Updated dependencies
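
The release notes do not show how `strip_thinking` works, so the following is only a minimal sketch of the general idea, not the evaluate-ai implementation. It assumes the model wraps its reasoning in `<think>...</think>` tags, as deepseek-r1 typically does, and that the reasoning should be removed before the answer is evaluated. The name `strip_thinking_sketch` and its signature are hypothetical.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> tags.
# DOTALL lets "." match newlines; ".*?" is non-greedy so each block is matched separately.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_thinking_sketch(text: str) -> str:
    """Hypothetical stand-in: drop <think> blocks and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()


response = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
print(strip_thinking_sketch(response))  # prints "The answer is 4."
```

Text without any `<think>` block passes through unchanged apart from whitespace trimming, so a helper like this could be applied to responses from any model.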

v0.5.0

25 Jan 18:55

v0.5.0 Pre-release
  • Updated to not-again-ai v0.15
  • Updated to Poetry v2 and updated pyproject.toml accordingly

v0.4.0

18 Nov 01:44

v0.4.0 Pre-release
  • Added MMLU-Pro and IFEval evals
  • Many other improvements

v.0.3.0

08 Sep 00:14

v.0.3.0 Pre-release

Refactors the common abstractions to make creating evaluations more flexible, while still providing some common logic.

v.0.2.0

27 Jul 21:11

v.0.2.0 Pre-release
  • Added an LLM-based evaluation, meets criteria.
  • Other quality-of-life script improvements

v.0.1.0

27 Jul 21:09

v.0.1.0 Pre-release

Initial release