Pull requests: EleutherAI/lm-evaluation-harness
Pull requests list
feat: Add support for accelerate-wrapped models in simple_evaluate()
#3313 opened Sep 26, 2025 by DhruvaKashyap
Support empty response for Completions and ChatCompletions API
#3309 opened Sep 22, 2025 by tboerstad
Adding New Task SLR-Bench : Scalable Logical Reasoning Benchmark
#3305 opened Sep 20, 2025 by Ahmad21Omar
Add long-context evaluation benchmarks (LongBench v2, Babilong, InfiniteBench, Phonebook)
#3256 opened Aug 21, 2025 by Mariani-code
Trim thinking content from model output in IFEval
#3240 opened Aug 14, 2025 by davideguidobene
Adding support for evaluating with Mistral and Pixtral models
#3235 opened Aug 13, 2025 by LearnerSXH
Adding support for Structured Generation with XGrammar
#3232 opened Aug 12, 2025 by ceferisbarov
5 tasks
Fix: respect target_delimiter when using a gen_prefix on multiple-choice tasks
#3220 opened Aug 7, 2025 by karanikolopoulos