- [`LongT5ForConditionalGeneration`] is an extension of [`T5ForConditionalGeneration`], exchanging the traditional
encoder *self-attention* layer with either efficient *local* attention or *transient-global* (*tglobal*) attention.
- Unlike the T5 model, LongT5 does not use a task prefix. Furthermore, it uses a different pre-training objective
inspired by the pre-training of [`PegasusForConditionalGeneration`].
- The LongT5 model is designed to work efficiently on long-range *sequence-to-sequence* tasks where the
input sequence exceeds the commonly used 512 tokens. It can handle input sequences up to 16,384 tokens long.
- For *Local Attention*, the sparse sliding-window local attention operation allows a given token to attend only `r`
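The sliding-window pattern of local attention can be sketched as a boolean attention mask. This is a minimal, hypothetical illustration of the idea (plain Python, not the actual `transformers` implementation); `r` is the attention radius:

```python
# Illustrative sketch of sliding-window local attention (NOT the
# transformers implementation): each token may attend only to tokens
# within a radius r of its own position.
def local_attention_mask(seq_len, r):
    """Return a seq_len x seq_len boolean mask; True = may attend."""
    return [
        [abs(i - j) <= r for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = local_attention_mask(6, 1)
# With r = 1, token 2 attends only to positions 1, 2, and 3.
print([j for j, allowed in enumerate(mask[2]) if allowed])  # → [1, 2, 3]
```

Because each row of the mask has at most `2r + 1` true entries, the cost of attention grows linearly with sequence length instead of quadratically, which is what makes inputs of tens of thousands of tokens tractable.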