
Commit b0f506a

fix docs and a bug (modelscope#1023)
1 parent db9476f commit b0f506a

File tree

3 files changed: 4 additions, 3 deletions

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@
  - DeepSeek-VL model: `https://github.com/deepseek-ai/DeepSeek-VL`
  - YI-VL model: `https://github.com/01-ai/Yi`
  - LLAVA model: `https://github.com/haotian-liu/LLaVA.git`
- - `--sft_type`: the fine-tuning method, default `'lora'`. Available values: 'lora', 'full', 'longlora', 'qalora'. To use qlora, set `--sft_type lora --quantization_bit 4`.
+ - `--sft_type`: the fine-tuning method, default `'lora'`. Available values: 'lora', 'full', 'longlora', 'adalora', 'ia3', 'llamapro', 'adapter', 'vera', 'boft'. To use qlora, set `--sft_type lora --quantization_bit 4`.
  - `--packing`: pack the dataset to `max-length`, default `False`.
  - `--freeze_parameters`: when sft_type is 'full', freeze the bottommost parameters of the model. Range is 0. ~ 1., default `0.`. This provides a compromise between lora and full-parameter fine-tuning.
  - `--additional_trainable_parameters`: a supplement to freeze_parameters, only allowed when sft_type is 'full', default `[]`. For example, to train the embedding layer in addition to 50% of the parameters, set `--freeze_parameters 0.5 --additional_trainable_parameters transformer.wte`; all parameters starting with `transformer.wte` will be activated.
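
The `--freeze_parameters` / `--additional_trainable_parameters` pair described above boils down to prefix matching on parameter names. A minimal sketch of that idea, assuming a PyTorch model with Hugging Face-style parameter names (this illustrates the behaviour, it is not the repository's implementation, and it freezes by tensor count rather than by parameter count):

```python
import torch.nn as nn

def freeze_then_reactivate(model: nn.Module, freeze_ratio: float,
                           additional_trainable: list) -> None:
    """Sketch of `--freeze_parameters` plus `--additional_trainable_parameters`.

    Freezes the bottom `freeze_ratio` fraction of parameter tensors, then
    re-activates every parameter whose name starts with one of the prefixes.
    """
    params = list(model.named_parameters())
    n_freeze = int(len(params) * freeze_ratio)
    for i, (name, p) in enumerate(params):
        p.requires_grad = i >= n_freeze  # freeze the bottom slice of the model
    for name, p in params:
        if any(name.startswith(prefix) for prefix in additional_trainable):
            p.requires_grad = True  # e.g. everything under 'transformer.wte'
```

With `freeze_ratio=0.5` and `additional_trainable=['transformer.wte']`, roughly the bottom half of the model is frozen while the embedding weights stay trainable, which is the situation the docs' example describes.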

docs/source_en/LLM/Command-line-parameters.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
  - DeepSeek-VL model: `https://github.com/deepseek-ai/DeepSeek-VL`
  - YI-VL model: `https://github.com/01-ai/Yi`
  - LLAVA model: `https://github.com/haotian-liu/LLaVA.git`
- - `--sft_type`: Fine-tuning method, default is `'lora'`. Options include: 'lora', 'full', 'longlora', 'qalora'. If using qlora, you need to set `--sft_type lora --quantization_bit 4`.
+ - `--sft_type`: Fine-tuning method, default is `'lora'`. Options include: 'lora', 'full', 'longlora', 'adalora', 'ia3', 'llamapro', 'adapter', 'vera', 'boft'. If using qlora, you need to set `--sft_type lora --quantization_bit 4`.
  - `--packing`: pack the dataset to `max-length`, default `False`.
  - `--freeze_parameters`: When sft_type is set to 'full', freeze the bottommost parameters of the model. Range is 0. ~ 1., default is `0.`. This provides a compromise between lora and full fine-tuning.
  - `--additional_trainable_parameters`: In addition to freeze_parameters, only allowed when sft_type is 'full', default is `[]`. For example, if you want to train the embedding layer in addition to 50% of the parameters, you can set `--freeze_parameters 0.5 --additional_trainable_parameters transformer.wte`; all parameters starting with `transformer.wte` will be activated.
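
As context for the parameters touched above, the qlora recipe the docs mention (`--sft_type lora --quantization_bit 4`) can also be driven from Python. A minimal sketch, assuming swift's `SftArguments`/`sft_main` entry points; the model and dataset values are placeholders, not part of this commit:

```python
from swift.llm import SftArguments, sft_main

# Placeholder model/dataset choices -- substitute your own.
args = SftArguments(
    model_type='qwen-7b-chat',
    dataset=['alpaca-zh'],
    sft_type='lora',        # one of the values listed in the docs above
    quantization_bit=4,     # lora + 4-bit quantization is the qlora setup
)
sft_main(args)
```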

swift/llm/utils/argument.py

Lines changed: 2 additions & 1 deletion
@@ -467,7 +467,8 @@ class SftArguments(ArgumentsBase):
     lora_lr_ratio: float = None
     use_rslora: bool = False
     use_dora: bool = False
-    init_lora_weights: Literal['gaussian', 'pissa', 'pissa_niter_[number of iters]', 'loftq', 'true', 'false'] = 'true'
+    # Literal['gaussian', 'pissa', 'pissa_niter_[number of iters]', 'loftq', 'true', 'false']
+    init_lora_weights: str = 'true'

     # BOFT
     boft_block_size: int = 4
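
The code change above is motivated by the old annotation itself: `'pissa_niter_[number of iters]'` is a template (a concrete value looks like `'pissa_niter_16'`), so a closed `Literal` can never enumerate all valid inputs; a plain `str` with the allowed forms kept in a comment is the pragmatic fix. A minimal sketch of how such a string might be normalized before it reaches a LoRA config, assuming the 'true'/'false' strings are meant as booleans (the helper below is illustrative, not the repository's code):

```python
from typing import Union

def normalize_init_lora_weights(value: str) -> Union[bool, str]:
    """Map the CLI string onto the value a LoRA config expects.

    'true'/'false' become real booleans; anything else ('gaussian', 'pissa',
    'loftq', or a template instance such as 'pissa_niter_16') passes through.
    """
    lowered = value.lower()
    if lowered in ('true', 'false'):
        return lowered == 'true'
    return value

# Illustrative usage, e.g. with peft:
#   LoraConfig(init_lora_weights=normalize_init_lora_weights('pissa_niter_16'), ...)
```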
