
[Bug]: AMD MultiStep Feature Issue. Missing argument: 'turn_prefills_into_decodes' in advance_step() #9111

@tjtanaa


Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

Model Input Dumps

No response

🐛 Describe the bug

Cause

PR #8378 changed the advance_step API. The change was applied to flash_attn.py and flashinfer.py but not to the other attention backends.

Code Snippet to highlight the difference in advance_step API

flash_attn.py

    def advance_step(self,
                     model_input: "ModelInputForGPUWithSamplingMetadata",
                     sampled_token_ids: Optional[torch.Tensor],
                     block_size: int,
                     num_seqs: int,
                     num_queries: int,
                     turn_prefills_into_decodes: bool = False):

rocm_flash_attn.py

    def advance_step(self, model_input: "ModelInputForGPUWithSamplingMetadata",
                     sampled_token_ids: Optional[torch.Tensor],
                     block_size: int, num_seqs: int, num_queries: int):
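The mismatch between the two signatures can be detected mechanically. The following is a minimal sketch, not vLLM code: the two stub classes and the `accepts_keyword` helper are hypothetical names that only mirror the signatures quoted above, and `inspect.signature` is used to list the backends whose `advance_step` cannot take the new keyword.

```python
import inspect
from typing import Optional


class FlashAttnMetadataStub:
    """Hypothetical stand-in mirroring the flash_attn.py signature."""

    def advance_step(self, model_input: object,
                     sampled_token_ids: Optional[object],
                     block_size: int, num_seqs: int, num_queries: int,
                     turn_prefills_into_decodes: bool = False) -> None:
        pass


class ROCmFlashAttnMetadataStub:
    """Hypothetical stand-in mirroring the rocm_flash_attn.py signature."""

    def advance_step(self, model_input: object,
                     sampled_token_ids: Optional[object],
                     block_size: int, num_seqs: int,
                     num_queries: int) -> None:
        pass


def accepts_keyword(method, name: str) -> bool:
    """Return True if `method` can be called with keyword argument `name`."""
    params = inspect.signature(method).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter would also absorb the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD
               for p in params.values())


backends = [FlashAttnMetadataStub, ROCmFlashAttnMetadataStub]
missing = [cls.__name__ for cls in backends
           if not accepts_keyword(cls.advance_step,
                                  "turn_prefills_into_decodes")]
print(missing)  # ['ROCmFlashAttnMetadataStub']
```

A check along these lines, run over the real backend classes, could serve as a unit test that catches a backend falling behind when the shared interface changes.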

Unclear whether this is a bug or whether the multi-step feature is unsupported on AMD

Other PRs, e.g. #8474, state that the multi-step feature works on AMD.

Logs

The error log and traceback are as follows:

ERROR 10-06 16:38:45 engine.py:157]
(VllmWorkerProcess pid=207827) ERROR 10-06 16:38:45 multiproc_worker_utils.py:231
rker_base.py", line 327, in execute_model
ERROR 10-06 16:38:45 engine.py:157] TypeError("advance_step() got an unexpected keyword argument 'turn_prefills_into_decodes'")
ERROR 10-06 16:38:45 engine.py:157] Traceback (most recent call last):
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/engine/multiprocessing/engine.py", line 155, in start
    self.run_engine_loop()
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/engine/multiprocessing/engine.py", line 218, in run_engi
    request_outputs = self.engine_step()
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/engine/multiprocessing/engine.py", line 236, in engine_s
    raise e
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/engine/multiprocessing/engine.py", line 227, in engine_s
    return self.engine.step()
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/engine/llm_engine.py", line 1404, in step
    outputs = self.model_executor.execute_model(
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/executor/distributed_gpu_executor.py", line 78, in execu
    driver_outputs = self._driver_execute_model(execute_model_req)
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/executor/multiproc_gpu_executor.py", line 155, in _drive
    return self.driver_worker.execute_model(execute_model_req)
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/worker/worker_base.py", line 327, in execute_model
    output = self.model_runner.execute_model(
  File "/home/aac/anaconda3/envs/rocm611-0929/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in
    return func(*args, **kwargs)
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/worker/multi_step_model_runner.py", line 507, in execute
    model_input = self._advance_step(
  File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/worker/multi_step_model_runner.py", line 634, in _advanc
    attn_metadata.advance_step(
TypeError: advance_step() got an unexpected keyword argument 'turn_prefills_into_decodes'

File "/home/aac/apps/rocm611-0929/vllm-fix-spec-amd/vllm/worker/wo
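The failure at the bottom of the traceback can be reproduced without vLLM. The sketch below uses hypothetical stub classes (`OldBackendStub`, `PatchedBackendStub`) and a hypothetical `call_advance_step` helper that passes the new keyword, as the caller in multi_step_model_runner.py appears to do. It shows that a backend with the old signature raises the same kind of TypeError, while one that declares the parameter with a default does not. It is an assumption that accepting the parameter is only part of a real fix: an actual backend would presumably also need to act on the flag.

```python
class OldBackendStub:
    """Hypothetical backend with the pre-change advance_step signature."""

    def advance_step(self, model_input, sampled_token_ids,
                     block_size, num_seqs, num_queries):
        return "advanced"


class PatchedBackendStub:
    """Hypothetical backend that also accepts the new keyword."""

    def advance_step(self, model_input, sampled_token_ids,
                     block_size, num_seqs, num_queries,
                     turn_prefills_into_decodes=False):
        # A real backend would use the flag here; this stub ignores it.
        return "advanced"


def call_advance_step(metadata):
    """Call advance_step with the new keyword, using placeholder values."""
    return metadata.advance_step(
        model_input=None,
        sampled_token_ids=None,
        block_size=16,
        num_seqs=1,
        num_queries=1,
        turn_prefills_into_decodes=True,
    )


error_message = None
try:
    call_advance_step(OldBackendStub())
except TypeError as exc:
    # The exact wording varies slightly between Python versions.
    error_message = str(exc)

print(error_message)
print(call_advance_step(PatchedBackendStub()))
```

Giving the new parameter a default value keeps existing callers that omit it working, which is how the flash_attn.py signature quoted earlier declares it.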

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
