Conversation

baberabb
Contributor

No description provided.

@baberabb baberabb requested a review from StellaAthena as a code owner July 14, 2025 13:58
@baberabb baberabb merged commit 51ede33 into main Jul 16, 2025
6 checks passed
@baberabb baberabb deleted the rmthink branch July 16, 2025 19:05
HelloJocelynLu added a commit to deepprinciple/lm-evaluation-harness that referenced this pull request Jul 17, 2025
* warning for "chat" pretrained; disable buggy evalita configs (EleutherAI#3127)

* check for chat for warning

* add test

* remove yaml extension from some evalita configs

* move unitxt to own test script

* fix CI test

* fix: remove warning (EleutherAI#3128)

* Adding EgyMMLU and EgyHellaSwag (EleutherAI#3063)

* add egy mmlu hellaswag

* add egymmlu egyhellaswag to tasks readme

* fix egymmlu config generation

* fix _generate_configs formating

* Added mixed_precision_dtype arg (EleutherAI#3138)

* Fix for hang due to mp.Pool in bootstrap_stderr (EleutherAI#3135)

* fix: vllm lora (EleutherAI#3132)

* truncate thinking tags in generations (EleutherAI#3145)

* feat: add postprocessing for generated text to strip stop sequences and thinking tokens

* nit

* fix: trim leading whitespace after stripping thinking tokens from generation

* feat: add think_end_token to model_args

* nit

* nit

* nit

* add to readme

* nit
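The thinking-token changes above can be sketched roughly as follows. This is a hypothetical illustration, not the harness's actual implementation; the function name `postprocess_generation` and its parameters are made up for this sketch, and only the behavior described in the commit messages (strip everything up to a configurable `think_end_token`, trim the leading whitespace, then truncate at stop sequences) is assumed.

```python
# Hypothetical sketch of post-processing generated text: strip a model's
# "thinking" segment via a configurable end token, then truncate at stop
# sequences. Names here are illustrative, not the harness's real API.
def postprocess_generation(text, stop_sequences=(), think_end_token="</think>"):
    # If the think-end token appears, keep only the text after it and
    # trim the leading whitespace that typically follows it.
    if think_end_token and think_end_token in text:
        text = text.split(think_end_token, 1)[1].lstrip()
    # Truncate at the first occurrence of any stop sequence.
    for stop in stop_sequences:
        if stop and stop in text:
            text = text.split(stop, 1)[0]
    return text

print(postprocess_generation(
    "<think>some reasoning...</think>  42\n\nQ:",
    stop_sequences=["\n\nQ:"],
))  # → 42
```

Exposing the end token as a model argument (per the `think_end_token` commit) lets users match whatever delimiter their model emits rather than hard-coding one tag.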

* `bbh_cot_fewshot`: Removed repeated "Let's think step by step." text from bbh cot prompts (EleutherAI#3140)

* Removed the "Let's think step by step." text from the start of the target entry in each of the samples, so that the phrase is not repeated twice in the few-shot prompts and the behavior matches the original bbh repository. This applied to 26 of the 27 subtasks; the only exception is boolean_expressions.yaml. For boolean_expressions.yaml there is arguably a separate error: the "Remember that (i) ..." text does not appear after the final "A: Let's think step by step." in the prompt. Models like EleutherAI/gpt-neo-125m tend to begin answers with that string anyway (copying what was done in the few-shot examples), but it arguably should be part of the prompt itself, much like "A: Let's think step by step." is included in the prompt for all of the cot tasks. However, the original bbh repo has the same issue, so it is kept this way for consistency.

* feat: remove extra space from answers; add changelog
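The deduplication described above can be sketched as a simple prefix strip. This is a hypothetical illustration of the idea, not the actual YAML edit the PR made; the helper name `dedupe_target` is made up, and the assumption is only that the prompt already ends with "A: Let's think step by step.", so the target should not begin with the same phrase.

```python
# Hypothetical sketch: since the few-shot prompt already ends with
# "A: Let's think step by step.", strip that phrase (and the extra space
# after it) from the start of each target so it is not repeated twice.
PREFIX = "Let's think step by step."

def dedupe_target(target: str) -> str:
    if target.startswith(PREFIX):
        target = target[len(PREFIX):].lstrip()
    return target

print(dedupe_target("Let's think step by step. Remember that (i) ..."))
# → Remember that (i) ...
```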

---------

Co-authored-by: Baber <[email protected]>

---------

Co-authored-by: Baber Abbasi <[email protected]>
Co-authored-by: Atou Houdaifa <[email protected]>
Co-authored-by: Avelina Asada Hadji-Kyriacou <[email protected]>
Co-authored-by: Ankit Gola <[email protected]>
Co-authored-by: MaYongQing <[email protected]>
Co-authored-by: philipdoldo <[email protected]>
Co-authored-by: Baber <[email protected]>