System Info
- transformers version: 4.49.0.dev0 (315a9f4~1)
- Platform: Windows-10-10.0.20348-SP0
- Python version: 3.11.7
- Huggingface_hub version: 0.28.1
- Safetensors version: 0.5.2
- Accelerate version: 1.3.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.6.0+cu124 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): 0.10.2 (cpu)
- Jax version: 0.5.0
- JaxLib version: 0.5.0
- Using distributed or parallel set-up in script?: no
- Using GPU in script?: no
- GPU type: NVIDIA RTX A5000
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Use Phi-3 with any cache configuration, including the default (DynamicCache).
I think get_max_length() is probably declared on a mixin that hasn't been applied to the cache classes yet? A minimal way to trigger it is sketched below.
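Not in the original report: a standalone repro sketch that should hit the same code path. The checkpoint id and prompt are my assumptions; the traceback further down comes from a ComfyUI integration, but the failing call is the same prepare_inputs_for_generation path inside the checkpoint's remote code.

```python
# Minimal reproduction sketch. Assumptions (not from the report): the exact
# checkpoint id and prompt; any Phi-3 checkpoint whose Hub-hosted
# modeling_phi3.py calls past_key_values.get_max_length() should behave the same.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Hello", return_tensors="pt")
# On 4.49.0.dev0 this fails inside prepare_inputs_for_generation, where the
# checkpoint's custom code calls past_key_values.get_max_length().
output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```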
comfy_extras\nodes\nodes_language.py:361: in execute
return model.generate(tokens, max_new_tokens, repetition_penalty, seed, sampler),
comfy\language\transformers_model_management.py:228: in generate
output_ids = transformers_model.generate(
..\..\.venv\Lib\site-packages\torch\utils\_contextlib.py:116: in decorate_context
return func(*args, **kwargs)
..\..\.venv\Lib\site-packages\transformers\generation\utils.py:2224: in generate
result = self._sample(
..\..\.venv\Lib\site-packages\transformers\generation\utils.py:3198: in _sample
model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
C:\Users\bberman\.cache\huggingface\modules\transformers_modules\c1358f8a35e6d2af81890deffbbfa575b978c62f\modeling_phi3.py:1292: in prepare_inputs_for_generation
max_cache_length = past_key_values.get_max_length()
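Not from the original report: until the checkpoint's remote code catches up, a user-side shim like the following might unblock generation. It assumes get_max_cache_shape() is the drop-in replacement for the removed get_max_length(); treat it as a sketch, not a confirmed fix.

```python
# Hypothetical shim (my assumption, not a confirmed fix): re-expose the
# removed get_max_length() as an alias of its apparent replacement,
# get_max_cache_shape(), on the shared Cache base class so all cache
# subclasses (including DynamicCache) pick it up.
from transformers.cache_utils import Cache

if not hasattr(Cache, "get_max_length") and hasattr(Cache, "get_max_cache_shape"):
    Cache.get_max_length = Cache.get_max_cache_shape
```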
Expected behavior
Possibly related to #35168?
I'm not sure why this is only coming up with Phi-3 so far.
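One hedged guess, not confirmed: the traceback shows modeling_phi3.py being loaded from the Hub modules cache (i.e. trust_remote_code), so the checkpoint's custom code may still call past_key_values.get_max_length() after the installed transformers removed it in favor of get_max_cache_shape(), while the in-library model implementations were already migrated. A quick sanity check under that assumption:

```python
# Sketch: probe the cache API on the installed version. If the guess above
# is right, the first print is False and the second call still works.
from transformers.cache_utils import DynamicCache

cache = DynamicCache()
print(hasattr(cache, "get_max_length"))  # expected False if the method was removed
print(cache.get_max_cache_shape())       # None: a DynamicCache has no fixed max
```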