Skip to content

fix args.json #4566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/source/Instruction/命令行参数.md
Original file line number Diff line number Diff line change
Expand Up @@ -352,7 +352,7 @@ Vera使用`target_modules`, `target_regex`, `modules_to_save`三个参数.
- check_model: 检查本地模型文件有损坏或修改并给出提示,默认为True。如果是断网环境,请设置为False。
- 🔥create_checkpoint_symlink: 额外创建checkpoint软链接,方便书写自动化训练脚本。best_model和last_model的软链接路径分别为f'{output_dir}/best'和f'{output_dir}/last'。
- loss_type: loss类型。默认为None,使用模型自带损失函数。
- channels : 数据集包含的channel集合。默认为None。结合`--loss_type channel_loss`使用,可参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh)。
- channels: 数据集包含的channel集合。默认为None。结合`--loss_type channel_loss`使用,可参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh)。
- 🔥packing: 是否使用序列packing提升计算效率,默认为False。当前支持`swift pt/sft`。
- 注意:使用packing请结合`--attn_impl flash_attn`使用且"transformers>=4.44",具体查看[该PR](https://github.com/huggingface/transformers/pull/31629)。
- 支持的多模态模型参考:https://github.com/modelscope/ms-swift/blob/main/examples/train/packing/qwen2_5_vl.sh
Expand Down
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Command-line-parameters.md
Original file line number Diff line number Diff line change
Expand Up @@ -361,7 +361,7 @@ Training arguments include the [base arguments](#base-arguments), [Seq2SeqTraine
- check_model: Check local model files for corruption or modification and give a prompt, default is True. If in an offline environment, please set to False.
- 🔥create_checkpoint_symlink: Creates additional checkpoint symlinks to facilitate writing automated training scripts. The symlink paths for `best_model` and `last_model` are `f'{output_dir}/best'` and `f'{output_dir}/last'` respectively.
- loss_type: Type of loss. Defaults to None, which uses the model's built-in loss function.
- channelsSet of channels included in the dataset. Defaults to None. Used in conjunction with `--loss_type channel_loss`. Refer to [this example](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh) for more details.
- channels: Set of channels included in the dataset. Defaults to None. Used in conjunction with `--loss_type channel_loss`. Refer to [this example](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh) for more details.
- 🔥packing: Whether to use sequence packing to improve computational efficiency. The default value is False. Currently supports `swift pt/sft`.
- Note: When using packing, please combine it with `--attn_impl flash_attn` and ensure "transformers>=4.44". For details, see [this PR](https://github.com/huggingface/transformers/pull/31629).
- Supported multimodal models reference: https://github.com/modelscope/ms-swift/blob/main/examples/train/packing/qwen2_5_vl.sh
Expand Down
4 changes: 3 additions & 1 deletion swift/llm/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -252,9 +252,11 @@ def save_checkpoint(model: Optional[PreTrainedModel],
if model and model.model_dir and model.model_dir not in model_dirs:
model_dirs.append(model.model_dir)
for src_file in (additional_saved_files or []) + ['preprocessor_config.json', 'args.json']:
tgt_path = os.path.join(output_dir, src_file)
if os.path.exists(tgt_path) and src_file == 'args.json':
continue
for model_dir in model_dirs:
src_path: str = os.path.join(model_dir, src_file)
tgt_path = os.path.join(output_dir, src_file)
if os.path.isfile(src_path):
shutil.copy(src_path, tgt_path)
break
Expand Down