Commit 694f991

support cogvlm2-video (modelscope#1318)

1 parent 4654c1b commit 694f991

File tree

12 files changed: +394 −12 lines changed

README.md

Lines changed: 2 additions & 1 deletion

@@ -47,6 +47,7 @@ SWIFT has rich documentations for users, please check [here](https://github.com/
SWIFT web-ui is available both on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary), please feel free to try!

## 🎉 News
+ - 2024.07.08: Support cogvlm2-video-13b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/cogvlm2-video-best-practice.md).
- 2024.07.08: Support internlm-xcomposer2_5-7b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/internlm-xcomposer2-best-practice.md).
- 2024.07.06: Support for the llava-next-video series models: llava-next-video-7b-instruct, llava-next-video-7b-32k-instruct, llava-next-video-7b-dpo-instruct, llava-next-video-34b-instruct. You can refer to [llava-video best practice](docs/source_en/Multi-Modal/llava-video-best-practice.md) for more information.
- 2024.07.06: Support internvl2 series: internvl2-2b, internvl2-4b, internvl2-8b, internvl2-26b.
@@ -558,7 +559,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
| XComposer2<br>XComposer2.5 | [Pujiang AI Lab InternLM vision model](https://github.com/InternLM/InternLM-XComposer) | Chinese<br>English | 7B | chat model |
| DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
| MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
- | CogVLM<br>CogVLM2<br>CogAgent<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
+ | CogVLM<br>CogAgent<br>CogVLM2<br>CogVLM2-Video<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
| Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
| Llava-Next<br>Llava-Next-Video | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 7B-110B | chat model |
| mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |

README_CN.md

Lines changed: 2 additions & 1 deletion

@@ -48,6 +48,7 @@ SWIFT has rich documentation; if you have any usage questions please check [here](https:
You can try the SWIFT web-ui on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary).

## 🎉 News
+ - 2024.07.08: Support cogvlm2-video-13b-chat. You can check the best practice [here](docs/source/Multi-Modal/cogvlm2-video最佳实践.md).
- 2024.07.08: Support internlm-xcomposer2_5-7b-chat. You can check the best practice [here](docs/source/Multi-Modal/internlm-xcomposer2最佳实践.md).
- 2024.07.06: Support the llava-next-video series models: llava-next-video-7b-instruct, llava-next-video-7b-32k-instruct, llava-next-video-7b-dpo-instruct, llava-next-video-34b-instruct. See the [llava-video best practice](docs/source/Multi-Modal/llava-video最佳实践.md) for more information.
- 2024.07.06: Support the internvl2 series: internvl2-2b, internvl2-4b, internvl2-8b, internvl2-26b.
@@ -555,7 +556,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
| XComposer2<br>XComposer2.5 | [Pujiang AI Lab InternLM vision model](https://github.com/InternLM/InternLM-XComposer) | Chinese<br>English | 7B | chat model |
| DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
| MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
- | CogVLM<br>CogVLM2<br>CogAgent<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
+ | CogVLM<br>CogAgent<br>CogVLM2<br>CogVLM2-Video<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
| Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
| Llava-Next<br>Llava-Next-Video | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 7B-110B | chat model |
| mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |

docs/source/LLM/支持的模型和数据集.md

Lines changed: 2 additions & 1 deletion

@@ -348,7 +348,7 @@
|yi-vl-34b-chat|[01ai/Yi-VL-34B](https://modelscope.cn/models/01ai/Yi-VL-34B/summary)|q_proj, k_proj, v_proj|yi-vl|&#x2714;|&#x2718;|transformers>=4.34|vision|[01-ai/Yi-VL-34B](https://huggingface.co/01-ai/Yi-VL-34B)|
|llava-llama-3-8b-v1_1|[AI-ModelScope/llava-llama-3-8b-v1_1-transformers](https://modelscope.cn/models/AI-ModelScope/llava-llama-3-8b-v1_1-transformers/summary)|q_proj, k_proj, v_proj|llava-llama-instruct|&#x2714;|&#x2718;|transformers>=4.36|vision|[xtuner/llava-llama-3-8b-v1_1-transformers](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers)|
|internlm-xcomposer2-7b-chat|[Shanghai_AI_Laboratory/internlm-xcomposer2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-7b/summary)|wqkv|internlm-xcomposer2|&#x2714;|&#x2718;||vision|[internlm/internlm-xcomposer2-7b](https://huggingface.co/internlm/internlm-xcomposer2-7b)|
- |internlm-xcomposer2_5-7b-chat|[Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b/summary)|wqkv|internlm-xcomposer2_5|&#x2714;|&#x2718;||vision, video|[internlm/internlm-xcomposer2d5-7b](https://huggingface.co/internlm/internlm-xcomposer2d5-7b)|
+ |internlm-xcomposer2_5-7b-chat|[Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b/summary)|wqkv|internlm-xcomposer2_5|&#x2714;|&#x2718;||vision|[internlm/internlm-xcomposer2d5-7b](https://huggingface.co/internlm/internlm-xcomposer2d5-7b)|
|internvl-chat-v1_5|[AI-ModelScope/InternVL-Chat-V1-5](https://modelscope.cn/models/AI-ModelScope/InternVL-Chat-V1-5/summary)|wqkv|internvl|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL-Chat-V1-5](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)|
|internvl-chat-v1_5-int8|[AI-ModelScope/InternVL-Chat-V1-5-int8](https://modelscope.cn/models/AI-ModelScope/InternVL-Chat-V1-5-int8/summary)|wqkv|internvl|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/InternVL-Chat-V1-5-int8](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-int8)|
|mini-internvl-chat-2b-v1_5|[OpenGVLab/Mini-InternVL-Chat-2B-V1-5](https://modelscope.cn/models/OpenGVLab/Mini-InternVL-Chat-2B-V1-5/summary)|wqkv|internvl|&#x2714;|&#x2718;|transformers>=4.35, timm|vision|[OpenGVLab/Mini-InternVL-Chat-2B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-2B-V1-5)|
@@ -373,6 +373,7 @@
|cogvlm-17b-chat|[ZhipuAI/cogvlm-chat](https://modelscope.cn/models/ZhipuAI/cogvlm-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense|cogvlm|&#x2718;|&#x2718;|transformers<4.42|vision|[THUDM/cogvlm-chat-hf](https://huggingface.co/THUDM/cogvlm-chat-hf)|
|cogvlm2-19b-chat|[ZhipuAI/cogvlm2-llama3-chinese-chat-19B](https://modelscope.cn/models/ZhipuAI/cogvlm2-llama3-chinese-chat-19B/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense|cogvlm|&#x2718;|&#x2718;|transformers<4.42|vision|[THUDM/cogvlm2-llama3-chinese-chat-19B](https://huggingface.co/THUDM/cogvlm2-llama3-chinese-chat-19B)|
|cogvlm2-en-19b-chat|[ZhipuAI/cogvlm2-llama3-chat-19B](https://modelscope.cn/models/ZhipuAI/cogvlm2-llama3-chat-19B/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense|cogvlm|&#x2718;|&#x2718;|transformers<4.42|vision|[THUDM/cogvlm2-llama3-chat-19B](https://huggingface.co/THUDM/cogvlm2-llama3-chat-19B)|
+ |cogvlm2-video-13b-chat|[ZhipuAI/cogvlm2-video-llama3-chat](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense|cogvlm2-video|&#x2718;|&#x2718;|transformers<4.42, decord, pytorchvideo|vision, video|[THUDM/cogvlm2-video-llama3-chat](https://huggingface.co/THUDM/cogvlm2-video-llama3-chat)|
|cogagent-18b-chat|[ZhipuAI/cogagent-chat](https://modelscope.cn/models/ZhipuAI/cogagent-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-chat|&#x2718;|&#x2718;|timm|vision|[THUDM/cogagent-chat-hf](https://huggingface.co/THUDM/cogagent-chat-hf)|
|cogagent-18b-instruct|[ZhipuAI/cogagent-vqa](https://modelscope.cn/models/ZhipuAI/cogagent-vqa/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-instruct|&#x2718;|&#x2718;|timm|vision|[THUDM/cogagent-vqa-hf](https://huggingface.co/THUDM/cogagent-vqa-hf)|
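
The requirements column of the new cogvlm2-video-13b-chat row (`transformers<4.42, decord, pytorchvideo`) is easy to overlook. Here is a minimal sketch for checking an existing environment against those constraints; it assumes the PyPI distribution names match the names in the table:

```python
# Sketch: check the requirements listed for cogvlm2-video-13b-chat
# (transformers<4.42, decord, pytorchvideo) in the current environment.
from importlib.metadata import PackageNotFoundError, version

def check_cogvlm2_video_requirements() -> None:
    for pkg in ("decord", "pytorchvideo"):
        try:
            print(f"{pkg}=={version(pkg)} found")
        except PackageNotFoundError:
            print(f"{pkg} is missing: pip install {pkg}")
    try:
        tf = version("transformers")
    except PackageNotFoundError:
        print("transformers is missing: pip install 'transformers<4.42'")
        return
    major, minor = (int(x) for x in tf.split(".")[:2])
    if (major, minor) >= (4, 42):
        print(f"transformers=={tf} does not satisfy transformers<4.42")
    else:
        print(f"transformers=={tf} satisfies transformers<4.42")

if __name__ == "__main__":
    check_cogvlm2_video_requirements()
```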

docs/source/LLM/自定义与拓展.md

Lines changed: 6 additions & 3 deletions

@@ -7,17 +7,20 @@
## Custom Datasets
We support three methods of using **custom datasets**.

- 1. [Recommended] Pass arguments directly on the command line, specifying `--dataset xxx.json yyy.jsonl zzz.csv`. **This is the most convenient way to use custom datasets**; it supports five dataset formats (i.e. using `SmartPreprocessor`; the supported formats are listed below) and supports `dataset_id` and `dataset_path`. No changes to `dataset_info.json` are needed.
+ 1. [Recommended] Pass arguments directly on the command line, specifying `--dataset xxx.json yyy.jsonl zzz.csv`. **This is the most convenient way to use custom datasets**; it supports five dataset formats (i.e. using `SmartPreprocessor`; the supported formats are listed below) and supports `dataset_id` and `dataset_path`. No changes to `dataset_info.json` are needed. This method suits users who are new to ms-swift; the following two methods suit developers who want to extend ms-swift.
2. Add the dataset to `dataset_info.json`. This is more flexible but more cumbersome than the first method; it supports applying two preprocessors to the dataset and specifying their parameters: `RenameColumnsPreprocessor` and `ConversationsPreprocessor` (`SmartPreprocessor` is used by default). You can modify the built-in `dataset_info.json` of swift directly, or pass an external json file via `--custom_dataset_info xxx.json` (convenient for users who pip install rather than git clone).
3. **Register the dataset**: more flexible but more cumbersome than methods 1 and 2; it supports preprocessing the dataset with a function. Methods 1 and 2 are implemented on top of method 3. You can extend it by modifying the source code directly, or pass a file via `--custom_register_path xxx.py` and the script will parse the py file (convenient for pip install users).

### 📌 [Recommended] Passing Command-Line Arguments Directly
You can directly pass a custom **dataset_id** (compatible with MS and HF) or **dataset_path**, as well as multiple custom datasets with their sampling counts; the script preprocesses and concatenates them automatically. If a `dataset_id` is passed, the 'default' subset of that dataset_id is used by default and the split is set to 'train'. If the dataset_id has already been registered, the subsets, split and preprocessing function provided at registration are used. If a `dataset_path` is passed, it can be a relative or an absolute path, where a relative path is relative to the current working directory.

+ Each dataset is specified in the following format: `[HF or MS::]{dataset_name} or {dataset_id} or {dataset_path}[:subset1/subset2/...][#dataset_sample]`; in the simplest case you only need to specify dataset_name, dataset_id or dataset_path.
+
```bash
- --dataset {dataset_id} {dataset_path}
+ # The modelscope dataset_id is used by default; huggingface dataset_ids are also supported
+ --dataset {dataset_id} {dataset_path} HF::{dataset_id}

- # Dataset mixing: the following takes the subset1 and subset2 subsets of dataset_id and samples 20000 entries
+ # Dataset mixing: the following takes the subset1 and subset2 subsets of dataset_id and samples 20000 entries. If `#{dataset_sample}` is not used, all samples in the dataset are used
--dataset {dataset_name}#20000 {dataset_id}:{subset1}/{subset2}#20000 {dataset_path}#10000
```
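
To make the specification grammar above concrete, here is a small sketch that composes `--dataset` values following `[HF or MS::]{dataset_name or dataset_id or dataset_path}[:subset1/subset2/...][#dataset_sample]`; the dataset names, subsets and sample counts below are placeholders, not real datasets:

```python
# Sketch: compose --dataset entries following the grammar described above.
# All names below are placeholders for illustration only.
from typing import Optional, Sequence

def dataset_spec(name: str, *, hub: Optional[str] = None,
                 subsets: Sequence[str] = (), sample: Optional[int] = None) -> str:
    """Build one --dataset entry: [HF or MS::]{name}[:sub1/sub2/...][#sample]."""
    spec = f"{hub}::{name}" if hub else name
    if subsets:
        spec += ":" + "/".join(subsets)
    if sample is not None:
        spec += f"#{sample}"
    return spec

specs = [
    dataset_spec("some-dataset-id"),                          # ModelScope id, default subset, all samples
    dataset_spec("some-dataset-id", hub="HF", sample=20000),  # HuggingFace id, sample 20000 rows
    dataset_spec("some-dataset-id", subsets=["subset1", "subset2"], sample=20000),
    dataset_spec("./data/train.jsonl", sample=10000),         # local dataset_path
]
print("--dataset " + " ".join(specs))
```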

docs/source/Multi-Modal/cogvlm2-video最佳实践.md

Lines changed: 144 additions & 0 deletions

@@ -0,0 +1,144 @@

# CogVLM2 Video Best Practice

## Table of Contents
- [Environment Setup](#environment-setup)
- [Inference](#inference)
- [Fine-tuning](#fine-tuning)
- [Inference After Fine-tuning](#inference-after-fine-tuning)


## Environment Setup
```shell
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e '.[llm]'

# https://github.com/facebookresearch/pytorchvideo/issues/258
# https://github.com/dmlc/decord/issues/177
pip install decord pytorchvideo
```

Model link:
- cogvlm2-video-13b-chat: [https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-chat](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-chat)


## Inference

Inference with cogvlm2-video-13b-chat:
```shell
# Experimental environment: A100
# 28GB GPU memory
CUDA_VISIBLE_DEVICES=0 swift infer --model_type cogvlm2-video-13b-chat
```

Output: (a local path or URL can be passed in)
```python
"""
<<< 描述这段视频
Input a video path or URL <<< https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4
In the video, a young child is seen sitting on a bed and reading a book. The child is wearing glasses and is dressed in a light blue top and pink pants. The room appears to be a bedroom with a crib in the background. The child is engrossed in the book, and the scene is captured in a series of frames showing the child's interaction with the book.
--------------------------------------------------
<<< clear
<<< Describe this video.
Input a video path or URL <<< https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/fire.mp4
In the video, a person is seen lighting a fire in a backyard setting. They start by holding a piece of food and then proceed to light a match to the food. The fire is then ignited, and the person continues to light more pieces of food, including a bag of chips and a piece of wood. The fire is seen burning brightly, and the person is seen standing over the fire, possibly enjoying the warmth. The video captures the process of starting a fire and the person's interaction with the flames, creating a cozy and inviting atmosphere.
--------------------------------------------------
<<< clear
<<< who are you
Input a video path or URL <<<
I am a person named John.
"""
```

**Single-sample inference**

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import (
    get_model_tokenizer, get_template, inference, ModelType,
    get_default_template_type, inference_stream
)
from swift.utils import seed_everything
import torch

model_type = ModelType.cogvlm2_video_13b_chat
template_type = get_default_template_type(model_type)
print(f'template_type: {template_type}')

model, tokenizer = get_model_tokenizer(model_type, torch.float16,
                                       model_kwargs={'device_map': 'auto'})
model.generation_config.max_new_tokens = 256
template = get_template(template_type, tokenizer)
seed_everything(42)

videos = ['https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4']
query = '描述这段视频'
response, history = inference(model, template, query, videos=videos)
print(f'query: {query}')
print(f'response: {response}')

# streaming
query = 'Describe this video.'
videos = ['https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/fire.mp4']
gen = inference_stream(model, template, query, history, videos=videos)
print_idx = 0
print(f'query: {query}\nresponse: ', end='')
for response, _ in gen:
    delta = response[print_idx:]
    print(delta, end='', flush=True)
    print_idx = len(response)
print()

"""
query: 描述这段视频
response: The video depicts a young child sitting on a bed and reading a book. The child is wearing glasses and is seen in various positions, such as sitting on the bed, sitting on a couch, and sitting on a bed with a blanket. The child's attire changes from a light blue top and pink pants to a light blue top and pink leggings. The room has a cozy and warm atmosphere with soft lighting, and there are personal items scattered around, such as a crib, a television, and a white garment.
query: Describe this video.
response: The video shows a person lighting a fire in a backyard setting. The person is seen holding a piece of food and a lighter, and then lighting the food on fire. The fire is then used to light other pieces of wood, and the person is seen standing over the fire, holding a bag of food. The video captures the process of starting a fire and the person's interaction with the fire.
"""
```
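
The `videos` argument also accepts local file paths instead of URLs. A brief sketch reusing the `model`, `template` and `inference` objects created above; `./baby.mp4` is a hypothetical local file used only for illustration:

```python
# Sketch: pass a local video file instead of a URL.
# Assumes `model`, `template` and `inference` from the example above;
# './baby.mp4' is a hypothetical local path.
local_videos = ['./baby.mp4']
query = 'Describe this video.'
response, _ = inference(model, template, query, videos=local_videos)
print(f'response: {response}')
```
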
## Fine-tuning

Fine-tuning of multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

(By default, LoRA fine-tuning is applied to the qkv projections of the LLM. To fine-tune all linear layers, specify `--lora_target_modules ALL`.)
```shell
# Experimental environment: A100
# 40GB GPU memory
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type cogvlm2-video-13b-chat \
    --dataset video-chatgpt
```

[Custom datasets](../LLM/自定义与拓展.md#-推荐命令行参数的形式) support json and jsonl formats. Here is an example of a custom dataset:

(Multi-turn conversations are supported, but the whole conversation can only contain one video; a local path or URL can be passed in.)

```jsonl
{"query": "55555", "response": "66666", "videos": ["video_path"]}
{"query": "eeeee", "response": "fffff", "history": [], "videos": ["video_path"]}
{"query": "EEEEE", "response": "FFFFF", "history": [["AAAAA", "BBBBB"], ["CCCCC", "DDDDD"]], "videos": ["video_path"]}
```
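
A short sketch of producing such a jsonl file programmatically and then pointing `--dataset` at it; the file name, queries and video paths are placeholders:

```python
# Sketch: write a custom video SFT dataset in the jsonl format shown above.
# 'my_video_data.jsonl' and the video paths are placeholders.
import json

rows = [
    {"query": "What happens in this clip?", "response": "A child reads a book on a bed.",
     "videos": ["./videos/baby.mp4"]},
    {"query": "Describe the scene.", "response": "Someone lights a fire in a backyard.",
     "history": [], "videos": ["./videos/fire.mp4"]},
]
with open("my_video_data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Fine-tune on it with:
#   CUDA_VISIBLE_DEVICES=0 swift sft \
#       --model_type cogvlm2-video-13b-chat \
#       --dataset my_video_data.jsonl
```
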
## Inference After Fine-tuning

Direct inference:
```shell
CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/cogvlm2-video-13b-chat/vx-xxx/checkpoint-xxx \
    --load_dataset_config true
```

**merge-lora** and inference:
```shell
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir output/cogvlm2-video-13b-chat/vx-xxx/checkpoint-xxx \
    --merge_lora true

CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/cogvlm2-video-13b-chat/vx-xxx/checkpoint-xxx-merged \
    --load_dataset_config true
```
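
If you prefer to drive this step from Python, recent ms-swift releases also expose programmatic entry points. The sketch below assumes `infer_main` and `InferArguments` are available in your installed version and accept the same options as the CLI above; the checkpoint path is a placeholder:

```python
# Sketch: post-fine-tuning inference from Python instead of the CLI.
# Assumes `infer_main` / `InferArguments` exist in the installed ms-swift;
# the checkpoint path is a placeholder.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import infer_main, InferArguments

infer_main(InferArguments(
    ckpt_dir='output/cogvlm2-video-13b-chat/vx-xxx/checkpoint-xxx-merged',
    load_dataset_config=True,
))
```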

docs/source/Multi-Modal/index.md

Lines changed: 1 addition & 1 deletion

@@ -22,6 +22,6 @@
4. [Florence Best Practice](florence最佳实践.md)

The whole conversation revolves around a single image (it may also contain no image):
- 1. [CogVLM Best Practice](cogvlm最佳实践.md), [CogVLM2 Best Practice](cogvlm2最佳实践.md), [glm4v Best Practice](glm4v最佳实践.md)
+ 1. [CogVLM Best Practice](cogvlm最佳实践.md), [CogVLM2 Best Practice](cogvlm2最佳实践.md), [glm4v Best Practice](glm4v最佳实践.md), [CogVLM2-Video Best Practice](cogvlm2-video最佳实践.md)
2. [MiniCPM-V Best Practice](minicpm-v最佳实践.md), [MiniCPM-V-2 Best Practice](minicpm-v-2最佳实践.md), [MiniCPM-V-2.5 Best Practice](minicpm-v-2.5最佳实践.md)
3. [InternVL-Chat-V1.5 Best Practice](internvl最佳实践.md)
