Releases: DaveCoDev/evaluate-ai
v0.5.1
- Added `strip_thinking` function to better support deepseek-r1
- Added a probability meets criteria evaluation
- Minor bug fixes and improvements
- Updated dependencies
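As a rough illustration of what a `strip_thinking` helper is for: deepseek-r1 emits its reasoning inside `<think>...</think>` tags before the final answer, and that reasoning usually needs to be removed before an evaluation scores the response. The sketch below is an assumption about the general approach, not the project's actual code; the name `strip_thinking_sketch` and the regular expression are hypothetical.

```python
import re

# Hypothetical sketch; not the actual evaluate-ai implementation.
# Matches a <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_thinking_sketch(text: str) -> str:
    """Remove <think>...</think> blocks and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    raw = "<think>\n2 + 2 is 4.\n</think>\n\nAnswer: 4"
    print(strip_thinking_sketch(raw))
```

Text without any `<think>` block passes through unchanged apart from whitespace trimming, so the same helper could be applied to responses from other models.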
v0.5.0
- Updated to not-again-ai v0.15
- Updated to Poetry v2 and updated pyproject.toml accordingly
v0.4.0
- Added MMLU-Pro and IFEval evals
- Many other improvements
v.0.3.0
Refactored the common abstractions to make creating evaluations more flexible, while still providing some common logic.
v.0.2.0
- Added an LLM-based evaluation, meets criteria.
- Other quality-of-life script improvements
v.0.1.0
Initial release