Commit 80cc859

[MoE] fix the bug when using 0-D tensor in MoE model (#5538)
1 parent dd1da39 commit 80cc859

1 file changed, +1 -1 lines changed

examples/language_model/moe/dygraph/run_moe_pretrain.py

Lines changed: 1 addition & 1 deletion
@@ -551,7 +551,7 @@ def do_train(args):

     if args.gate != "naive" and args.balance_loss_weight:
         aux_loss_list = [
-            l.moe_mlp.gate.get_loss(clear=False)
+            l.moe_mlp.gate.get_loss(clear=False).reshape([-1])
             for l in model.gpt.decoder.layers
             if hasattr(l.moe_mlp, "gate")
         ]
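
For context, here is a minimal sketch of why the added `.reshape([-1])` matters. It rests on two assumptions that this hunk does not confirm: that `get_loss` returns a 0-D (scalar) tensor, and that `aux_loss_list` is later concatenated and averaged. NumPy arrays stand in for Paddle tensors, and the names `scalar_losses` and `balance_loss` are hypothetical.

```python
import numpy as np

# Hypothetical stand-ins for the per-layer gate losses: 0-D (scalar) arrays.
scalar_losses = [np.array(0.25), np.array(0.5), np.array(0.75)]


def balance_loss(losses):
    """Concatenate per-layer losses and average them (assumed downstream step)."""
    return float(np.concatenate(losses).mean())


# Without the fix: 0-D arrays have no axis to concatenate along.
try:
    balance_loss(scalar_losses)
    concat_failed = False
except ValueError:
    concat_failed = True

# With the fix: reshape([-1]) turns each 0-D value into a shape-(1,) array,
# so the list can be concatenated into a single 1-D array.
fixed_losses = [loss.reshape([-1]) for loss in scalar_losses]
fixed_value = balance_loss(fixed_losses)
```

Reshaping each element at list-construction time keeps the change to a single line and leaves whatever consumes `aux_loss_list` untouched.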
