KeyError: 'visual.patch_embed.proj.weight #18

@HumanZhong

Hi, thanks for sharing your great work. I ran into a KeyError after following your installation steps. The traceback is as follows:

(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/ray/_private/function_manager.py", line 689, in actor_method_executor
(WorkerDict pid=595461) [rank0]:     return method(__ray_actor, *args, **kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 463, in _resume_span
(WorkerDict pid=595461) [rank0]:     return method(self, *_args, **_kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/single_controller/ray/base.py", line 446, in func
(WorkerDict pid=595461) [rank0]:     return getattr(self.worker_dict[key], name)(*args, **kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/single_controller/base/decorator.py", line 413, in inner
(WorkerDict pid=595461) [rank0]:     return func(*args, **kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/workers/fsdp_workers.py", line 560, in generate_sequences
(WorkerDict pid=595461) [rank0]:     with self.rollout_sharding_manager:
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/utils/debug/performance.py", line 61, in f
(WorkerDict pid=595461) [rank0]:     return self.log(decorated_function, *args, **kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/utils/debug/performance.py", line 70, in log
(WorkerDict pid=595461) [rank0]:     output = func(*args, **kwargs)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/workers/sharding_manager/fsdp_vllm.py", line 119, in __enter__
(WorkerDict pid=595461) [rank0]:     self.update_params(params)
(WorkerDict pid=595461) [rank0]:   File "/mnt/DeepEyes/verl/workers/sharding_manager/fsdp_vllm.py", line 215, in update_params
(WorkerDict pid=595461) [rank0]:     loaded_params = model.load_weights(
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1098, in load_weights
(WorkerDict pid=595461) [rank0]:     return loader.load_weights(weights, mapper=self.hf_to_vllm_mapper)
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 235, in load_weights
(WorkerDict pid=595461) [rank0]:     autoloaded_weights = set(self._load_module("", self.module, weights))
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 196, in _load_module
(WorkerDict pid=595461) [rank0]:     yield from self._load_module(prefix,
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 173, in _load_module
(WorkerDict pid=595461) [rank0]:     loaded_params = module_load_weights(weights)
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 490, in load_weights
(WorkerDict pid=595461) [rank0]:     return loader.load_weights(weights)
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 235, in load_weights
(WorkerDict pid=595461) [rank0]:     autoloaded_weights = set(self._load_module("", self.module, weights))
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 196, in _load_module
(WorkerDict pid=595461) [rank0]:     yield from self._load_module(prefix,
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 173, in _load_module
(WorkerDict pid=595461) [rank0]:     loaded_params = module_load_weights(weights)
(WorkerDict pid=595461) [rank0]:   File "/mnt/anaconda3_new/envs/deepeyes/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 400, in load_weights
(WorkerDict pid=595461) [rank0]:     param = params_dict[name]
(WorkerDict pid=595461) [rank0]: KeyError: 'visual.patch_embed.proj.weight

The error seems to originate in verl/workers/sharding_manager/fsdp_vllm.py, line 119: self.update_params(params).
From there it reaches load_weights(self, weights) of Qwen2Model in vllm/model_executor/models/qwen2.py.
I compared params_dict with weights_dict and their keys do differ: 'visual.patch_embed.proj.weight' is passed in as a weight name but is not a key in params_dict.
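
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two sets of names can be diffed. The helper name diff_param_names and the toy name lists are assumptions made for illustration; only 'visual.patch_embed.proj.weight' comes from the traceback above. In a real run, the two inputs would be the names yielded by the incoming weights and the keys of params_dict.

```python
from typing import Dict, Iterable, List


def diff_param_names(
    weight_names: Iterable[str], param_names: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare incoming weight names with the names the model expects.

    Hypothetical helper for debugging; not part of verl or vLLM.
    """
    incoming = set(weight_names)
    expected = set(param_names)
    return {
        # Names passed to load_weights that the model has no parameter for.
        "unexpected": sorted(incoming - expected),
        # Parameters of the model that no incoming weight would fill.
        "missing": sorted(expected - incoming),
    }


# Toy inputs (illustrative only; real names come from the running model).
weight_names = ["visual.patch_embed.proj.weight", "model.embed_tokens.weight"]
param_names = ["embed_tokens.weight", "layers.0.self_attn.qkv_proj.weight"]

report = diff_param_names(weight_names, param_names)
print("unexpected:", report["unexpected"])
print("missing:", report["missing"])
```

Any name listed under "unexpected" would raise the same KeyError at param = params_dict[name] if it is passed through unchanged.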

I have tried the following:

  1. Replacing self.update_params with the call below, which was implemented in an older version of verl:

     from verl.third_party.vllm import load_dtensor_weights
     load_dtensor_weights(
         params, self.inference_engine.llm_engine.model_executor.driver_worker.worker.model_runner.model)

     With this change the KeyError no longer occurs, but the vLLM engine generates random, meaningless sequences, so I suspect the weights are still loaded incorrectly.

  2. Changing vllm==0.8.2 to vllm==0.8.0. This did not help.

Have you encountered a similar issue, and could you give some advice on how to resolve it?