Status: Open
Labels: 🏋 SFT (Related to SFT), 📚 documentation (Improvements or additions to documentation)
Description
Hi!
To my knowledge, there are two data collators available that can handle both (1) packed examples and (2) padding-free attention: `DataCollatorWithFlattening` from transformers, and the custom `DataCollatorForLanguageModeling` in `sft_trainer.py`:
trl/trl/trainer/sft_trainer.py (line 109 in 686cd35):
class DataCollatorForLanguageModeling(DataCollatorMixin):
Based on the documentation, `DataCollatorWithFlattening` has one nice feature: it prevents the last token of a packed example from predicting the first token of the next example. Beyond that, I'm not sure what the differences are from an initial reading of the code.
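To illustrate the boundary-masking behavior mentioned above, here is a minimal, hypothetical sketch of what a "flattening" collator does (this is not the actual transformers implementation, just the idea): examples are concatenated into one padding-free row, position ids restart at each example, and the first label of every example is set to -100 so the last token of one packed example never receives a training signal for predicting the first token of the next.

```python
# Hypothetical sketch of a flattening collator (not the real
# transformers.DataCollatorWithFlattening code).
def flatten_collate(examples):
    input_ids, position_ids, labels = [], [], []
    for ex in examples:
        ids = ex["input_ids"]
        input_ids += ids
        # Positions restart per example, which padding-free / flash-attention
        # kernels can use to keep attention within each example.
        position_ids += list(range(len(ids)))
        # Mask the boundary token with -100 so the previous example's last
        # token is never trained to predict this example's first token.
        labels += [-100] + ids[1:]
    return {
        "input_ids": [input_ids],
        "position_ids": [position_ids],
        "labels": [labels],
    }

batch = flatten_collate([{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5]}])
# batch["input_ids"]    → [[1, 2, 3, 4, 5]]
# batch["position_ids"] → [[0, 1, 2, 0, 1]]
# batch["labels"]       → [[-100, 2, 3, -100, 5]]
```

The masked labels at positions 0 and 3 are what give the "last token of example A never predicts the first token of example B" property.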