Error when passing encoder_outputs as tuple to EncoderDecoder models #15536

@jsnfly

Description

Environment info

  • transformers version: 4.17.0.dev0
  • Platform: Linux-5.13.0-27-generic-x86_64-with-glibc2.34
  • Python version: 3.9.7
  • PyTorch version (GPU?): 1.10.1+cu102 (True)
  • Tensorflow version (GPU?): 2.7.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.3.6 (cpu)
  • Jax version: 0.2.26
  • JaxLib version: 0.1.75
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help

@patrickvonplaten

Information

In EncoderDecoder models one can pass encoder_outputs as a tuple of Tensors. However, if you do, this line fails with

AttributeError: 'tuple' object has no attribute 'last_hidden_state'

since the tuple isn't converted in the forward method.
So if encoder_outputs is a tuple, it could perhaps be wrapped in a ModelOutput class (or something similar), or the tuple could be handled explicitly.
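The suggested wrapping could look something like this sketch. To keep it self-contained and runnable without transformers, a plain dataclass stands in for transformers' BaseModelOutput; the field names mirror its real fields, but the helper function name is made up for illustration:

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

# Stand-in for transformers.modeling_outputs.BaseModelOutput (the real
# class is a ModelOutput with these fields, in this positional order).
@dataclass
class BaseModelOutput:
    last_hidden_state: Any = None
    hidden_states: Optional[Tuple] = None
    attentions: Optional[Tuple] = None

def normalize_encoder_outputs(encoder_outputs):
    """Hypothetical helper: wrap a plain tuple so that attribute access
    like encoder_outputs.last_hidden_state works downstream."""
    if isinstance(encoder_outputs, tuple):
        # Tuple order is (last_hidden_state, hidden_states, attentions),
        # matching the dataclass field order above.
        encoder_outputs = BaseModelOutput(*encoder_outputs)
    return encoder_outputs

# A tuple no longer raises AttributeError on .last_hidden_state:
outputs = normalize_encoder_outputs(("encoder_hidden", None, None))
print(outputs.last_hidden_state)  # prints "encoder_hidden"
```

The same isinstance check could live directly at the top of the forward method, which would keep the tuple-passing code path backwards compatible.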

On a slight tangent

I made a SpeechEncoderDecoderModel for the robust speech challenge: https://huggingface.co/jsnfly/wav2vec2-large-xlsr-53-german-gpt2. I found that adding the position embeddings of the decoder model to the outputs of the encoder model improved performance significantly (basically didn't work without it).
This needs small modifications to the __init__ and forward methods of the SpeechEncoderDecoderModel.
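The modification described above could be sketched roughly as follows. This is a toy illustration with made-up names and plain Python lists in place of tensors, not the actual SpeechEncoderDecoderModel code; the real change would touch __init__ (keeping a reference to the decoder's position embedding table) and forward (adding those embeddings to the encoder hidden states before decoding):

```python
class PositionAugmentedBridge:
    """Toy sketch: add the decoder's position embedding for step i to the
    encoder output at step i, before the decoder consumes it."""

    def __init__(self, decoder_pos_embeddings):
        # In the real model this might be e.g. the decoder's learned
        # position embedding matrix (assumption; names are illustrative).
        self.pos = decoder_pos_embeddings

    def __call__(self, encoder_hidden):
        # encoder_hidden: list of per-step vectors from the encoder.
        # Element-wise add position embedding i to encoder vector i.
        return [
            [h + p for h, p in zip(vec, self.pos[i])]
            for i, vec in enumerate(encoder_hidden)
        ]

bridge = PositionAugmentedBridge([[1.0, 0.0], [0.0, 1.0]])
print(bridge([[0.5, 0.5], [0.5, 0.5]]))  # [[1.5, 0.5], [0.5, 1.5]]
```

Note this assumes the encoder output sequence is no longer than the decoder's maximum position; a real implementation would need to handle the length mismatch.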

At the moment this seems to me too much of a "hack" to add to the SpeechEncoderDecoderModel class generally (for example, behind a flag), because it may differ across decoder models and probably needs more verification. @patrickvonplaten showed some interest in including this in Transformers nonetheless. What do you think?
