
trainable parameter of loramoe #29

@paulzyzy

Description


Hello, while training loramoe I would like to print the trainable parameters. Following the instructions, I generated the JSON file and ran it with moe_peft.py (or launch.py), but got the printed output shown below. What might be going wrong? Thanks for your help.

```shell
python moe_peft.py --base_model TinyLlama/TinyLlama_v1.1 --config moe_peft.json --bf16
```

```json
{
  "cutoff_len": 512,
  "save_step": 1000,
  "train_lora_candidate_num": 2,
  "train_lora_simultaneously_num": 2,
  "train_strategy": "optim",
  "lora": [
    {
      "name": "mrpc_0",
      "task_name": "glue:mrpc",
      "optim": "adamw",
      "scheduler_type": "constant",
      "warmup_steps": 0,
      "lr": 0.0002,
      "batch_size": 16,
      "micro_batch_size": 8,
      "evaluate_batch_size": 16,
      "num_epochs": 2,
      "r": 24,
      "lora_alpha": 48,
      "lora_dropout": 0.05,
      "target_modules": {
        "q_proj": false,
        "k_proj": false,
        "v_proj": false,
        "o_proj": false,
        "gate_proj": true,
        "down_proj": true,
        "up_proj": true
      },
      "routing_strategy": "loramoe",
      "num_experts": 6,
      "group_by_length": false
    }
  ]
}
```
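As a generic sketch, independent of MoE-PEFT's own logging, the trainable parameters of any PyTorch model can be listed by iterating over `named_parameters()` and checking `requires_grad`. The helper name `print_trainable_parameters` below is hypothetical (it is not asserting what MoE-PEFT exposes):

```python
# Hypothetical helper (not part of MoE-PEFT): list trainable vs. frozen
# parameters of any PyTorch model, e.g. after LoRA adapters are attached.
import torch.nn as nn


def print_trainable_parameters(model: nn.Module) -> tuple[int, int]:
    trainable, total = 0, 0
    for name, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
            print(f"trainable: {name} {tuple(param.shape)}")
    print(f"trainable params: {trainable} || all params: {total} "
          f"|| trainable%: {100 * trainable / total:.4f}")
    return trainable, total
```

For example, calling it on the model returned after loading the adapters should show only the LoRA expert weights in `gate_proj`/`down_proj`/`up_proj` as trainable, with the base TinyLlama weights frozen.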

[screenshot of the printed output]
