System Info
- transformers version: 4.22.0.dev0
- Platform: Linux-5.13.0-40-generic-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.10.1+cu113 (True)
- Tensorflow version (GPU?): 2.9.1 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
To reproduce the error (code adapted from https://huggingface.co/blog/tf-xla-generate):
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForCausalLM
generation_kwargs = {
    "max_new_tokens": 64,
    "eos_token_id": 198,
    "do_sample": True,
    "temperature": 0.72,
    "top_k": 0,
    "top_p": 0.725,
    "repetition_penalty": 1.13,
}
tokenizer = AutoTokenizer.from_pretrained(
"gpt2", padding_side="left", pad_token="</s>"
)
model = TFAutoModelForCausalLM.from_pretrained("gpt2")
model.config.pad_token_id = model.config.eos_token_id
input_text = "repetition_penalty error"
xla_generate = tf.function(model.generate, jit_compile=True)
tokenized_input = tokenizer(input_text, return_tensors="tf")
print("model.generate")
model.generate(**tokenized_input, **generation_kwargs)
print("xla_generate")
xla_generate(**tokenized_input, **generation_kwargs)  # error here
Error:
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 604, in generate *
seed=model_kwargs.pop("seed", None),
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 1651, in _generate *
input_ids,
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_utils.py", line 2475, in sample_body_fn *
next_tokens_scores = logits_processor(generated, next_token_logits, cur_len)
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 94, in __call__ *
scores = processor(input_ids, scores, cur_len)
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 278, in __call__ *
score_penalties = self._create_score_penalties(input_ids[:, :cur_len], scores)
File "/usr/local/lib/python3.8/dist-packages/transformers/generation_tf_logits_process.py", line 265, in _create_score_penalties *
indexable_prev_input_ids = tf.concat(
ValueError: None values not supported.
Setting repetition_penalty to 1.0, or removing the parameter entirely, makes everything work fine.
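As an interim measure, the workaround above can be applied mechanically: strip repetition_penalty from the kwargs only for the XLA-compiled call, keeping it for eager generation. This is just a sketch of that idea (the helper name and the split between the two kwarg dicts are illustrative, not part of the transformers API):

```python
# Sketch of the workaround: drop repetition_penalty before calling the
# XLA-compiled generate, since generation succeeds without it.
generation_kwargs = {
    "max_new_tokens": 64,
    "eos_token_id": 198,
    "do_sample": True,
    "temperature": 0.72,
    "top_k": 0,
    "top_p": 0.725,
    "repetition_penalty": 1.13,
}

def xla_safe(kwargs):
    # Remove the parameter that currently breaks XLA tracing.
    return {k: v for k, v in kwargs.items() if k != "repetition_penalty"}

xla_safe_kwargs = xla_safe(generation_kwargs)
# eager call: model.generate(**tokenized_input, **generation_kwargs)
# XLA call:   xla_generate(**tokenized_input, **xla_safe_kwargs)
```

The obvious downside is that the eager and XLA paths then sample from different distributions, so this is only a stopgap until the logits processor is fixed.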
Expected behavior
The expected behavior is that XLA-compiled text generation works with repetition_penalty set, without raising any errors.