Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

[fix] add math and longbench to test dependencies
#3321 opened Sep 30, 2025 by jannalulu
Add support for Titulm Bangla MMLU dataset
#3317 opened Sep 29, 2025 by Ismail-Hossain-1
unpin datasets; update pre-commit
#3316 opened Sep 27, 2025 by baberabb
Add MATH500
#3311 opened Sep 26, 2025 by jannalulu
Support torchrun vllm DP
#3304 opened Sep 19, 2025 by luccafong
Fix: VLLM model when data_parallel_size>1
#3303 opened Sep 18, 2025 by Dornavineeth
Gemini evaluation support
#3300 opened Sep 15, 2025 by IsraelAbebe
Fix lambada_multilingual_stablelm
#3294 opened Sep 11, 2025 by jmichaelov
[feature] add support for Moore Threads GPU family
#3290 opened Sep 10, 2025 by houchen-li
Adding SPaRC to lm eval harness
#3262 opened Aug 25, 2025 by lkaesberg
fix gsm8k normalization
#3254 opened Aug 20, 2025 by huaanrui
Main
#3250 opened Aug 20, 2025 by seongtaehong
Adding 3LM to lm eval harness
#3241 opened Aug 14, 2025 by GeorgeSherif
Trim thinking content from model output in IFEval
#3240 opened Aug 14, 2025 by davideguidobene
Remove gen_prefix space and add warning
#3239 opened Aug 14, 2025 by bendboaz
Adding support for Structured Generation with XGrammar
#3232 opened Aug 12, 2025 by ceferisbarov
5 tasks
Pass dataset_kwargs for Unitxt tasks
#3230 opened Aug 11, 2025 by mprahl
Fewshot refactor
#3227 opened Aug 8, 2025 by baberabb
1 task
Fix the Unitxt init method to set the task name
#3225 opened Aug 8, 2025 by mprahl