Commit b33ab4e

Add global_attention_mask to gen_kwargs (#16485)
If global_attention_mask is found in the model's inputs (used by certain models, like LED) in the prediction_step method of Seq2SeqTrainer, it is added to the gen_kwargs, which are passed to model.generate(). This allows the global attention to be set properly during generation.
1 parent 9fd5e6b commit b33ab4e

File tree

1 file changed (+3 lines, -1 line)


src/transformers/trainer_seq2seq.py

Lines changed: 3 additions & 1 deletion
@@ -163,9 +163,11 @@ def prediction_step(
 
         if "attention_mask" in inputs:
             gen_kwargs["attention_mask"] = inputs.get("attention_mask", None)
+        if "global_attention_mask" in inputs:
+            gen_kwargs["global_attention_mask"] = inputs.get("global_attention_mask", None)
 
         # prepare generation inputs
-        # some encoder-decoder models can have varying encder's and thus
+        # some encoder-decoder models can have varying encoder's and thus
         # varying model input names
         if hasattr(self.model, "encoder") and self.model.encoder.main_input_name != self.model.main_input_name:
             generation_inputs = inputs[self.model.encoder.main_input_name]
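
For context, below is an illustrative sketch (not part of the commit) of how a preprocessing function could attach a global_attention_mask to the model inputs, so that the updated prediction_step forwards it to model.generate() during evaluation. The checkpoint name and the "article"/"summary" column names are assumptions for the example.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

def preprocess(examples):
    # Tokenize long source documents for LED.
    model_inputs = tokenizer(examples["article"], max_length=4096, truncation=True)
    # LED attends globally only where this mask is 1; a common convention is
    # to place global attention on the first (<s>) token of each sequence.
    model_inputs["global_attention_mask"] = [
        [1] + [0] * (len(ids) - 1) for ids in model_inputs["input_ids"]
    ]
    # Tokenize the targets as labels (LED shares one vocabulary between
    # encoder and decoder, so the same tokenizer is used here).
    model_inputs["labels"] = tokenizer(
        examples["summary"], max_length=256, truncation=True
    )["input_ids"]
    return model_inputs

Because the collated evaluation batches then contain global_attention_mask, the new check in prediction_step picks it up, adds it to gen_kwargs, and it reaches model.generate().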
