refactor: switch seq_length to max_seq_length #448

akoumpa · 2025-09-10T06:39:52Z

Padding is handled inside the collator, with the pad_seq_len_divisible option.
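
As a rough illustration of what padding to a divisible length could look like, here is a minimal sketch. The function name `pad_batch`, its signature, and the right-padding choice are assumptions made for this example; only the option name pad_seq_len_divisible comes from this PR, and the actual collator may differ.

```python
from typing import List, Optional


def pad_batch(
    batch: List[List[int]],
    pad_token_id: int = 0,
    pad_seq_len_divisible: Optional[int] = None,
) -> List[List[int]]:
    """Right-pad every sequence in the batch to one shared length.

    Hypothetical helper: if pad_seq_len_divisible is set, the shared length
    is rounded up to the next multiple of that value.
    """
    target = max(len(seq) for seq in batch)
    if pad_seq_len_divisible:
        # Ceiling division, then scale back up to the next multiple.
        target = -(-target // pad_seq_len_divisible) * pad_seq_len_divisible
    return [seq + [pad_token_id] * (target - len(seq)) for seq in batch]
```

For example, a batch whose longest sequence has 3 tokens would be padded to length 8 with `pad_seq_len_divisible=8`, and to length 3 when the option is left unset.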
The max_seq_length option enforces the maximum tokenized sequence length the model sees. If an example's context is longer than that, it is truncated (to a space). If the truncation removes all of the context, an exception is raised and fetching is retried up to 64 times. When fetching an item fails, the next item is selected at random.
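
The truncate-then-retry behavior described above can be sketched roughly as follows. This is not the code from this PR: the names `truncate_context` and `fetch_with_retry`, the (context, answer) item layout, the whitespace tokenizer used in the usage note, and the choice to cut the context from its end at the last space are all assumptions made for illustration. The sketch also counts 64 attempts in total, which is one reading of "retry up to 64 times".

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

MAX_ATTEMPTS = 64  # assumed to match the retry budget in the PR description


def truncate_context(
    context: str,
    answer: str,
    max_seq_length: int,
    tokenize: Callable[[str], List[str]],
) -> str:
    """Shorten the context, one space-delimited chunk at a time, until
    context + answer fits into max_seq_length tokens (hypothetical helper)."""
    while context and len(tokenize(context + " " + answer)) > max_seq_length:
        cut = context.rfind(" ")
        # No space left means nothing of the context can be kept.
        context = context[:cut] if cut != -1 else ""
    if not context:
        raise ValueError("truncation removed all of the context")
    return context


def fetch_with_retry(
    dataset: Sequence[Tuple[str, str]],
    idx: int,
    max_seq_length: int,
    tokenize: Callable[[str], List[str]],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the tokens of item idx; on failure, try a randomly chosen item."""
    rng = rng or random.Random()
    for _ in range(MAX_ATTEMPTS):
        try:
            context, answer = dataset[idx]
            context = truncate_context(context, answer, max_seq_length, tokenize)
            return tokenize(context + " " + answer)
        except ValueError:
            # Fetching failed: pick the next item at random.
            idx = rng.randrange(len(dataset))
    raise RuntimeError(f"no usable example after {MAX_ATTEMPTS} attempts")
```

With `tokenize=str.split` and `max_seq_length=4`, the item `("a b c d e", "f g")` would keep only `a b` of its context, while an item whose answer alone exceeds the limit raises and triggers a random re-draw.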

Signed-off-by: Alexandros Koumparoulis <[email protected]>

copy-pr-bot · 2025-09-10T06:39:55Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

akoumpa · 2025-09-10T06:40:03Z

/ok to test db74d1c

Signed-off-by: Alexandros Koumparoulis <[email protected]>

akoumpa · 2025-09-10T06:52:43Z

/ok to test bf62a61

Signed-off-by: Alexandros Koumparoulis <[email protected]>

akoumpa · 2025-09-10T18:45:55Z

/ok to test a315726

switch seq_length to max_seq_length

db74d1c

Signed-off-by: Alexandros Koumparoulis <[email protected]>

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 06:40 Inactive

copy-pr-bot bot temporarily deployed to test September 10, 2025 06:40 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 06:40 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 06:40 Error

akoumpa changed the title ~~switch seq_length to max_seq_length~~ refactor: switch seq_length to max_seq_length Sep 10, 2025

ruff

bf62a61

Signed-off-by: Alexandros Koumparoulis <[email protected]>

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 06:52 Inactive

copy-pr-bot bot temporarily deployed to test September 10, 2025 06:53 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 06:53 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 07:13 Error

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 07:59 Failure

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 07:59 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 07:59 Failure

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 16:19 Failure

akoumpa added 2 commits September 10, 2025 11:44

fix

4e13dcc

Signed-off-by: Alexandros Koumparoulis <[email protected]>

fix

a315726

Signed-off-by: Alexandros Koumparoulis <[email protected]>

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 18:46 Inactive

copy-pr-bot bot temporarily deployed to test September 10, 2025 18:46 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 18:46 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 19:10 Failure

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 19:10 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 19:10 Failure

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 19:10 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci September 10, 2025 19:10 Failure

copy-pr-bot bot temporarily deployed to nemo-ci September 10, 2025 19:10 Inactive
