Current MLP_ATTR_MAPPING dictionary is wrong. #2834

@hyunwoongko


I got this error while training a Mixtral model with the verl Megatron backend.

(TaskRunner pid=925038)   File "/data/ib-a100-cluster-a-pri-lmalign_942/personal/kevin/verl-lmalign/verl/workers/sharding_manager/megatron_vllm.py", line 148, in __enter__
(TaskRunner pid=925038)     patch_vllm_moe_model_weight_loader(model)
(TaskRunner pid=925038)   File "/data/ib-a100-cluster-a-pri-lmalign_942/personal/kevin/verl-lmalign/verl/utils/vllm_utils.py", line 100, in patch_vllm_moe_model_weight_loader
(TaskRunner pid=925038)     mlp = getattr(layer, mlp_attr)
(TaskRunner pid=925038)   File "/home/bc-user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1928, in __getattr__
(TaskRunner pid=925038)     raise AttributeError(
(TaskRunner pid=925038) AttributeError: 'MixtralDecoderLayer' object has no attribute 'mlp'

I think the problem comes from this code:

    MLP_ATTR_MAPPING = {
        MixtralForCausalLM: "block_sparse_moe",
    }
    DEFAULT_MLP_ATTR = "mlp"

    if not isinstance(model, tuple(SUPPORTED_MOE_MODELS)):
        return

    model = getattr(model, "model", None) or getattr(model, "language_model", None)
    if model is None:
        raise ValueError("The provided model does not have a valid 'model' or 'language_model' attribute.")

    for layer in model.layers:
        mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)
        mlp = getattr(layer, mlp_attr)

When this function is called, the model object is of type MixtralForCausalLM. After the line model = getattr(model, "model", None) or getattr(model, "language_model", None), the name model is rebound to the inner object, whose type is MixtralModel. That class isn't in the MLP_ATTR_MAPPING dictionary, so the lookup falls back to DEFAULT_MLP_ATTR ("mlp"), and getattr(layer, "mlp") raises the AttributeError shown above.
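
The key mismatch can be reproduced without vLLM installed. Below is a minimal sketch that assumes nothing about the real classes beyond the wrapper/inner relationship described above. FakeMixtralForCausalLM and FakeMixtralModel are hypothetical stand-ins, not vLLM classes.

```python
# Hypothetical stand-ins for the vLLM classes; they only mimic the
# "wrapper holds an inner .model" relationship from the issue.
class FakeMixtralModel:
    pass


class FakeMixtralForCausalLM:
    def __init__(self):
        self.model = FakeMixtralModel()


# Same shape as the mapping in the issue: only the wrapper type is a key.
MLP_ATTR_MAPPING = {FakeMixtralForCausalLM: "block_sparse_moe"}
DEFAULT_MLP_ATTR = "mlp"

outer = FakeMixtralForCausalLM()

# Lookup keyed on the wrapper type finds the MoE attribute name.
attr_before_unwrap = MLP_ATTR_MAPPING.get(type(outer), DEFAULT_MLP_ATTR)

# After unwrapping, the type is the inner model, which is not a key,
# so the lookup silently falls back to the default.
inner = getattr(outer, "model", None) or getattr(outer, "language_model", None)
attr_after_unwrap = MLP_ATTR_MAPPING.get(type(inner), DEFAULT_MLP_ATTR)

print(attr_before_unwrap)  # block_sparse_moe
print(attr_after_unwrap)   # mlp
```

The second lookup returning "mlp" is what leads to getattr(layer, "mlp") on a layer that only has block_sparse_moe.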

When I changed the code as follows, the error disappeared.

from vllm.model_executor.models.mixtral import MixtralForCausalLM, MixtralModel

... skip ...

    MLP_ATTR_MAPPING = {
        MixtralForCausalLM: "block_sparse_moe",
        MixtralModel: "block_sparse_moe",  # <--- added !
    }
    DEFAULT_MLP_ATTR = "mlp"

    if not isinstance(model, tuple(SUPPORTED_MOE_MODELS)):
        return

    model = getattr(model, "model", None) or getattr(model, "language_model", None)
    if model is None:
        raise ValueError("The provided model does not have a valid 'model' or 'language_model' attribute.")

    for layer in model.layers:
        mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)
        mlp = getattr(layer, mlp_attr)
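
Another possible fix, shown here only as a sketch and not as the verl implementation, would be to resolve the attribute name from the wrapper type before model is rebound, so that the mapping does not need an entry for the inner class. The classes and the collect_moe_blocks helper below are hypothetical stand-ins for illustration.

```python
from types import SimpleNamespace


# Hypothetical stand-ins; each fake layer exposes only "block_sparse_moe".
class FakeMixtralModel:
    def __init__(self, num_layers):
        self.layers = [
            SimpleNamespace(block_sparse_moe=f"moe-{i}") for i in range(num_layers)
        ]


class FakeMixtralForCausalLM:
    def __init__(self, num_layers=2):
        self.model = FakeMixtralModel(num_layers)


MLP_ATTR_MAPPING = {FakeMixtralForCausalLM: "block_sparse_moe"}
DEFAULT_MLP_ATTR = "mlp"


def collect_moe_blocks(model):
    """Return the MoE block of every layer (hypothetical helper)."""
    # Resolve the attribute name once, while `model` is still the wrapper.
    mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)

    # Unwrap into a separate name so the wrapper type is not lost.
    inner = getattr(model, "model", None) or getattr(model, "language_model", None)
    if inner is None:
        raise ValueError(
            "The provided model does not have a valid 'model' or 'language_model' attribute."
        )

    return [getattr(layer, mlp_attr) for layer in inner.layers]


blocks = collect_moe_blocks(FakeMixtralForCausalLM())
print(blocks)  # ['moe-0', 'moe-1']
```

Compared with adding MixtralModel to the mapping, this keeps a single key per architecture, at the cost of touching the lookup logic instead of only the dictionary.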
