I got this error while training a Mixtral model with the verl Megatron backend:
(TaskRunner pid=925038) File "/data/ib-a100-cluster-a-pri-lmalign_942/personal/kevin/verl-lmalign/verl/workers/sharding_manager/megatron_vllm.py", line 148, in __enter__
(TaskRunner pid=925038) patch_vllm_moe_model_weight_loader(model)
(TaskRunner pid=925038) File "/data/ib-a100-cluster-a-pri-lmalign_942/personal/kevin/verl-lmalign/verl/utils/vllm_utils.py", line 100, in patch_vllm_moe_model_weight_loader
(TaskRunner pid=925038) mlp = getattr(layer, mlp_attr)
(TaskRunner pid=925038) File "/home/bc-user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1928, in __getattr__
(TaskRunner pid=925038) raise AttributeError(
(TaskRunner pid=925038) AttributeError: 'MixtralDecoderLayer' object has no attribute 'mlp'
I think the problem comes from lines 75 to 89 of verl/utils/vllm_utils.py at bf89f61:
MLP_ATTR_MAPPING = {
    MixtralForCausalLM: "block_sparse_moe",
}
DEFAULT_MLP_ATTR = "mlp"

if not isinstance(model, tuple(SUPPORTED_MOE_MODELS)):
    return

model = getattr(model, "model", None) or getattr(model, "language_model", None)
if model is None:
    raise ValueError("The provided model does not have a valid 'model' or 'language_model' attribute.")

for layer in model.layers:
    mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)
    mlp = getattr(layer, mlp_attr)
When this function is called, the model object is of type MixtralForCausalLM. After the line model = getattr(model, "model", None) or getattr(model, "language_model", None), the model object's type becomes MixtralModel. That class isn't a key in the MLP_ATTR_MAPPING dictionary, so the lookup falls back to DEFAULT_MLP_ATTR ("mlp"), and MixtralDecoderLayer has no "mlp" attribute, which raises the AttributeError.
When I changed the code as follows, the error disappeared:
from vllm.model_executor.models.mixtral import MixtralForCausalLM, MixtralModel

... skip ...

MLP_ATTR_MAPPING = {
    MixtralForCausalLM: "block_sparse_moe",
    MixtralModel: "block_sparse_moe",  # <--- added !
}
DEFAULT_MLP_ATTR = "mlp"

if not isinstance(model, tuple(SUPPORTED_MOE_MODELS)):
    return

model = getattr(model, "model", None) or getattr(model, "language_model", None)
if model is None:
    raise ValueError("The provided model does not have a valid 'model' or 'language_model' attribute.")

for layer in model.layers:
    mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)
    mlp = getattr(layer, mlp_attr)
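An alternative that avoids duplicating a mapping entry for every inner model class would be to resolve mlp_attr from the outer class before model is rebound. This is only a sketch of that idea, again using stand-in classes rather than the real vLLM types, not the patch that was actually applied:

```python
# Sketch of an alternative fix: resolve the attribute name while `model`
# is still the outer (ForCausalLM-like) class, then rebind it.
# Layer/InnerModel/OuterModel are stand-ins for the real vLLM classes.

class Layer:
    def __init__(self):
        self.block_sparse_moe = object()  # dummy MoE block

class InnerModel:
    def __init__(self):
        self.layers = [Layer()]

class OuterModel:
    def __init__(self):
        self.model = InnerModel()

MLP_ATTR_MAPPING = {OuterModel: "block_sparse_moe"}
DEFAULT_MLP_ATTR = "mlp"

model = OuterModel()
# Look up the attribute name BEFORE rebinding `model`,
# so the mapping only ever needs the outer-class entry:
mlp_attr = MLP_ATTR_MAPPING.get(type(model), DEFAULT_MLP_ATTR)
model = getattr(model, "model", None) or getattr(model, "language_model", None)

for layer in model.layers:
    mlp = getattr(layer, mlp_attr)  # resolves "block_sparse_moe" correctly
```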