
XLA generation error with repetition_penalty #18630

@AlekseyKorshuk

Description


System Info

  • transformers version: 4.22.0.dev0
  • Platform: Linux-5.13.0-40-generic-x86_64-with-glibc2.29
  • Python version: 3.8.10
  • Huggingface_hub version: 0.8.1
  • PyTorch version (GPU?): 1.10.1+cu113 (True)
  • Tensorflow version (GPU?): 2.9.1 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

@gante
@Rocketknight1

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

To reproduce the error (code adapted from https://huggingface.co/blog/tf-xla-generate):

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForCausalLM

generation_kwargs = {
    "max_new_tokens": 64,
    "eos_token_id": 198,
    "do_sample": True,
    "temperature": 0.72,
    "top_k": 0,
    "top_p": 0.725,
    "repetition_penalty": 1.13,
}

tokenizer = AutoTokenizer.from_pretrained(
    "gpt2", padding_side="left", pad_token="</s>"
)
model = TFAutoModelForCausalLM.from_pretrained("gpt2")
model.config.pad_token_id = model.config.eos_token_id
input_text = "repetition_penalty error"

xla_generate = tf.function(model.generate, jit_compile=True)

tokenized_input = tokenizer(input_text, return_tensors="tf")

print("model.generate")
model.generate(**tokenized_input, **generation_kwargs)

print("xla_generate")
xla_generate(**tokenized_input, **generation_kwargs)    # error here

Error:

    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 604, in generate  *
        seed=model_kwargs.pop("seed", None),
    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 1651, in _generate  *
        input_ids,
    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 2475, in sample_body_fn  *
        next_tokens_scores = logits_processor(generated, next_token_logits, cur_len)
    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 94, in __call__  *
        scores = processor(input_ids, scores, cur_len)
    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 278, in __call__  *
        score_penalties = self._create_score_penalties(input_ids[:, :cur_len], scores)
    File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 265, in _create_score_penalties  *
        indexable_prev_input_ids = tf.concat(

    ValueError: None values not supported.

Setting repetition_penalty to 1.0, or removing the parameter entirely, makes generation work without errors.
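For context, here is a minimal sketch of what I believe triggers this class of error. This is an assumption based on the traceback, not the actual transformers code: inside tf.function, slicing a tensor with a traced length (as in input_ids[:, :cur_len]) leaves the static shape as None along that axis, and any op that consumes the static .shape then raises "ValueError: None values not supported". Plain tf.function tracing already reproduces it, without jit_compile:

```python
import tensorflow as tf

# Hypothetical minimal reproduction (not the transformers implementation):
# slicing with a tensor bound makes the static shape None along that axis.

@tf.function
def broken(input_ids, cur_len):
    sliced = input_ids[:, :cur_len]  # static shape: (2, None)
    # Reading the static shape fails at trace time because dim 1 is None:
    return tf.zeros((sliced.shape[0], sliced.shape[1]))

@tf.function
def fixed(input_ids, cur_len):
    sliced = input_ids[:, :cur_len]
    return tf.zeros(tf.shape(sliced))  # dynamic shape works

input_ids = tf.ones((2, 5), dtype=tf.int32)
cur_len = tf.constant(3)

try:
    broken(input_ids, cur_len)
except Exception as err:
    print("broken raised:", type(err).__name__)

print("fixed shape:", fixed(input_ids, cur_len).shape)  # (2, 3)
```

If this guess is right, replacing static-shape reads with tf.shape (or avoiding the slice with cur_len altogether, e.g. via masking) in the repetition-penalty processor would make it XLA-compatible.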

Expected behavior

Text generation with repetition_penalty should run without errors under XLA, producing the same behavior as eager model.generate.
