Commit a86acee

Revert "supports strategy 'qat' (PaddlePaddle#3271)"
This reverts commit 217a25c.
1 parent 51b0609 commit a86acee

5 files changed

+253
-505
lines changed

docs/compression.md

Lines changed: 35 additions & 51 deletions
@@ -6,7 +6,7 @@
66
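* [Step 1: Get the model compression arguments compression_args](#获取模型压缩参数compression_args)
77
* [Step 2: Instantiate Trainer and call compress()](#实例化Trainer并调用compress())
88
* [Trainer instantiation parameters](#Trainer实例化参数介绍)
9-
* [Step 3: Implement a custom evaluation function (optional, as needed)](#实现自定义评估函数(按需可选))
9+
* [Step 3: Implement custom evaluation and loss functions (optional, as needed)](#实现自定义评估函数和loss计算函数(按需可选))
1010
* [Step 4: Pass arguments and run the compression script](#传参并运行压缩脚本)
1111
* [CompressionArguments parameters](#CompressionArguments参数介绍)
1212
* [Model compression API usage examples for three major scenarios](#三大场景模型压缩API使用示例)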
@@ -118,45 +118,11 @@ compression_args = parser.parse_args_into_dataclasses()
118118
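#### Trainer instantiation parameters
119119

120120
- **--model** The model to compress. Currently supported are ERNIE, BERT, RoBERTa, ERNIE-M, ELECTRA, ERNIE-Gram, PP-MiniLM, TinyBERT and other models with a similar structure. It must be a model that has already been fine-tuned on the downstream task. When the pretrained model is ERNIE, it must inherit from `ErniePretrainedModel`. For a classification task, for example, it can be obtained via `AutoModelForSequenceClassification.from_pretrained(model_name_or_path)`, in which case the `model_name_or_path` directory must contain the files model_config.json and model_state.pdparams;
121-
- **--data_collator** All three task types can use PaddleNLP's predefined [DataCollator classes](../paddlenlp/data/data_collator.py). `data_collator` can apply operations such as `Pad` to the data. For usage, see the [example code](../model_zoo/ernie-3.0/compress_seq_cls.py);
121+
- **--data_collator** All three task types can use PaddleNLP's predefined [DataCollator classes](../../paddlenlp/data/data_collator.py). `data_collator` can apply operations such as `Pad` to the data. For usage, see the [example code](../model_zoo/ernie-3.0/compress_seq_cls.py);
122122
- **--train_dataset** The training set used for pruning training; it is task-specific data. For loading a custom dataset, see the [documentation](https://huggingface.co/docs/datasets/loading). May be None when pruning is not enabled;
123123
- **--eval_dataset** The evaluation set used for pruning training, which also serves as the calibration data for quantization; it is task-specific data. For loading a custom dataset, see the [documentation](https://huggingface.co/docs/datasets/loading). This is a required Trainer parameter;
124124
- **--tokenizer** The `tokenizer` corresponding to `model`, obtainable via `AutoTokenizer.from_pretrained(model_name_or_path)`.
125-
- **--criterion** The model's loss computation, either an nn.Layer object or a function. It is used in ofa_utils.py to compute the model's loss; the gradients of that loss determine the importance of each neuron.
126-
127-
An example definition of the `criterion` function: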
128-
129-
```python
130-
# Supported form 1:
131-
def criterion(logits, labels):
132-
loss_fct = paddle.nn.BCELoss()
133-
start_ids, end_ids = labels
134-
start_prob, end_prob = outputs
135-
start_ids = paddle.cast(start_ids, 'float32')
136-
end_ids = paddle.cast(end_ids, 'float32')
137-
loss_start = loss_fct(start_prob, start_ids)
138-
loss_end = loss_fct(end_prob, end_ids)
139-
loss = (loss_start + loss_end) / 2.0
140-
return loss
141-
142-
# Supported form 2:
143-
class CrossEntropyLossForSQuAD(paddle.nn.Layer):
144-
145-
def __init__(self):
146-
super(CrossEntropyLossForSQuAD, self).__init__()
147-
148-
def forward(self, y, label):
149-
start_logits, end_logits = y
150-
start_position, end_position = label
151-
start_position = paddle.unsqueeze(start_position, axis=-1)
152-
end_position = paddle.unsqueeze(end_position, axis=-1)
153-
start_loss = paddle.nn.functional.cross_entropy(input=start_logits,
154-
label=start_position)
155-
end_loss = paddle.nn.functional.cross_entropy(input=end_logits,
156-
label=end_position)
157-
loss = (start_loss + end_loss) / 2
158-
return loss
159-
```
125+
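- **--criterion** The model's loss object, an nn.Layer instance. It is used in ofa_utils.py to compute the model's loss; the gradients of that loss determine the importance of each neuron.
160126

161127
Instantiate a Trainer object with the parameters above, then call `compress()` directly. `compress()` enters a different branch depending on the selected strategy and performs pruning or quantization accordingly.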
162128
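The `criterion` entries above describe a scalar loss whose gradients are later used to rank neurons. As a rough illustration of the arithmetic in the removed question-answering example (the mean of a start-position and an end-position cross-entropy), here is a dependency-free sketch. `cross_entropy` and `qa_loss` are hypothetical helpers written in plain Python for a single example; they are not Paddle APIs and do not reproduce Paddle's batching or reduction behaviour.

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax probability of the true class (one example)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def qa_loss(start_logits, end_logits, start_position, end_position):
    """Average of the start and end cross-entropy terms."""
    start_loss = cross_entropy(start_logits, start_position)
    end_loss = cross_entropy(end_logits, end_position)
    return (start_loss + end_loss) / 2


# Toy logits over three token positions (made-up numbers).
loss = qa_loss([2.0, 0.5, -1.0], [0.1, 0.2, 3.0],
               start_position=0, end_position=2)
print(f"loss = {loss:.4f}")
```

With uniform logits over two positions, each term reduces to `log(2)`, which is a quick sanity check for an implementation of this kind.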

@@ -181,11 +147,11 @@ trainer = Trainer(
181147
trainer.compress()
182148
```
183149

184-
<a name="实现自定义评估函数(按需可选)"></a>
150+
<a name="实现自定义评估函数和loss计算函数(按需可选)"></a>
185151

186-
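### Step 3: Implement a custom evaluation function to fit custom compression tasks
152+
### Step 3: Implement custom evaluation and loss functions (optional, as needed) to fit custom compression tasks
187153

188-
When using DynaBERT pruning, if the model or metrics do not match the cases in the table below, the evaluation function in the model compression API needs to be customized
154+
When using DynaBERT pruning, if the model or metrics do not match the cases in the table below, the built-in evaluation function and the loss-computation parameter of the model compression API may need to be customized
189155

190156
Currently, DynaBERT pruning supports only three kinds of PaddleNLP built-in classes, such as SequenceClassification, and the corresponding built-in evaluators are Accuracy, F1 and Squad.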
191157

@@ -197,26 +163,33 @@ trainer.compress()
197163

198164
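- If the model is custom, it must inherit from `XXXPretrainedModel` (for example, `ErniePretrainedModel` when the pretrained model is ERNIE). The model must support loading via `from_pretrained()` with `pretrained_model_name_or_path` as its only required argument, and its `forward` function must return `logits` or a `tuple of logits`
199165

200-
- If the model is custom, or the dataset is unusual and the loss computation in the compression API does not fit, a custom `custom_evaluate` evaluation function is required. It must support both `paddleslim.nas.ofa.OFA` models and `paddle.nn.layer` models. See the example code below.
166+
- If the model is custom, or the dataset is unusual and the loss computation in the compression API does not fit, a custom `custom_dynabert_calc_loss` function is required. Once the loss is computed, its gradients are used to derive the importance of each neuron for pruning. See the example code below.
167+
- It takes each batch of data as input and returns the model's loss.
168+
- Pass this function to the `custom_dynabert_calc_loss` parameter of `compress()`;
169+
170+
- If the evaluator also falls outside the supported cases above, implement a custom `custom_dynabert_evaluate` evaluation function. It must support both `paddleslim.nas.ofa.OFA` models and `paddle.nn.layer` models. See the example code below.
201171
- It takes `model` and `dataloader` as input and returns the model's evaluation metric (a single float value).
202-
- Pass this function to the `custom_evaluate` parameter of `compress()`;
172+
- Pass this function to the `custom_dynabert_evaluate` parameter of `compress()`;
203173

204-
Example definition of `custom_evaluate()`:
174+
Example definition of `custom_dynabert_evaluate()`: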
205175

206176
```python
207177
import paddle
208178
from paddle.metric import Accuracy
179+
from paddleslim.nas.ofa import OFA
209180

210181
@paddle.no_grad()
211-
def evaluate_seq_cls(self, model, data_loader):
182+
def evaluate_seq_cls(model, data_loader):
212183
    metric = Accuracy()
213184
    model.eval()
214185
    metric.reset()
215186
    for batch in data_loader:
216187
        logits = model(input_ids=batch['input_ids'],
217-
                       token_type_ids=batch['token_type_ids'])
188+
                       token_type_ids=batch['token_type_ids'],
189+
                       # The following line is required
190+
                       attention_mask=[None, None])
218191
        # Supports paddleslim.nas.ofa.OFA model and nn.layer model.
219-
        if isinstance(model, paddleslim.nas.ofa.OFA):
192+
        if isinstance(model, OFA):
220193
            logits = logits[0]
221194
        correct = metric.compute(logits, batch['labels'])
222195
        metric.update(correct)
@@ -226,11 +199,22 @@ trainer.compress()
226199
    return res
227200
```
228201

202+
Example definition of `custom_dynabert_calc_loss`:
229203

230-
Pass this custom function when calling `compress()`:
204+
```python
205+
def calc_loss(loss_fct, model, batch, head_mask):
206+
    logits = model(input_ids=batch["input_ids"],
207+
                   token_type_ids=batch["token_type_ids"],
208+
                   # The line below is required
209+
                   attention_mask=[None, head_mask])
210+
    loss = loss_fct(logits, batch["labels"])
211+
    return loss
212+
```
213+
Pass these 2 custom functions when calling `compress()`:
231214

232215
```python
233-
trainer.compress(custom_evaluate=evaluate_seq_cls)
216+
trainer.compress(custom_dynabert_evaluate=evaluate_seq_cls,
217+
                 custom_dynabert_calc_loss=calc_loss)
234218
```
235219

236220
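The contract for the custom evaluation function in this hunk (take a model and a data loader, return a single float, and handle a wrapped model whose output is a tuple) can be exercised without Paddle installed. The sketch below uses stand-ins only: `PlainModel`, `WrappedModel`, the `"features"` batch key and the toy data are all hypothetical and exist purely for illustration. In the real API the check is `isinstance(model, OFA)` and the metric comes from `paddle.metric`.

```python
class PlainModel:
    """Stand-in for a plain model: returns logits directly."""

    def __call__(self, features):
        # Toy 2-class logits: class 1 scores the feature sum, class 0 scores 0.
        return [[0.0, sum(f)] for f in features]


class WrappedModel:
    """Stand-in for a wrapped model: returns a tuple, logits first."""

    def __init__(self, inner):
        self.inner = inner

    def __call__(self, features):
        return (self.inner(features), None)


def evaluate(model, data_loader):
    """Return accuracy as a single float for either kind of model."""
    correct = total = 0
    for batch in data_loader:
        logits = model(batch["features"])
        if isinstance(model, WrappedModel):
            logits = logits[0]  # unwrap the tuple output
        preds = [row.index(max(row)) for row in logits]
        correct += sum(int(p == y) for p, y in zip(preds, batch["labels"]))
        total += len(batch["labels"])
    return correct / total if total else 0.0


batches = [
    {"features": [[1.0, 2.0], [-3.0, 1.0]], "labels": [1, 0]},
    {"features": [[0.5, 0.5]], "labels": [0]},
]
print(evaluate(PlainModel(), batches))
print(evaluate(WrappedModel(PlainModel()), batches))
```

Both calls should report the same accuracy, which is the property the `isinstance` branch is there to guarantee.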

@@ -269,8 +253,8 @@ python compress.py \
269253

270254
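The common parameters are independent of the specific compression strategy.
271255

272-
- **--strategy** Model compression strategy. Currently supported: `'dynabert+ptq'`, `'dynabert'`, `'ptq'` and `'qat'`.
273-
Here `'dynabert'` denotes the DynaBERT-based width pruning strategy, `'ptq'` denotes static offline (post-training) quantization, and `'dynabert+ptq'` denotes pruning followed by quantization. `qat` denotes quantization-aware training. The default is `'dynabert+ptq'`
256+
- **--strategy** Model compression strategy. Currently supported: `'dynabert+ptq'`, `'dynabert'` and `'ptq'`.
257+
Here `'dynabert'` denotes the DynaBERT-based width pruning strategy, `'ptq'` denotes static offline (post-training) quantization, and `'dynabert+ptq'` denotes pruning followed by quantization. The default is `'dynabert+ptq'`
274258

275259
- **--output_dir** Directory where the compressed model is saved;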
276260
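To make the effect of the revert on `--strategy` concrete, the following is a minimal sketch of mapping a strategy string to an ordered list of stages. `STRATEGY_STAGES` and `plan_stages` are hypothetical names introduced here; how `compress()` actually dispatches is not shown in this diff, so treat this only as an illustration of the documented values.

```python
# Strategies documented after the revert, mapped to their ordered stages.
STRATEGY_STAGES = {
    "dynabert+ptq": ["dynabert", "ptq"],  # prune first, then quantize
    "dynabert": ["dynabert"],             # width pruning only
    "ptq": ["ptq"],                       # post-training quantization only
}


def plan_stages(strategy="dynabert+ptq"):
    """Return the ordered stages for a strategy, or raise on unknown input."""
    try:
        return list(STRATEGY_STAGES[strategy])
    except KeyError:
        supported = ", ".join(repr(s) for s in STRATEGY_STAGES)
        raise ValueError(
            f"Unsupported strategy {strategy!r}; expected one of {supported}"
        ) from None


print(plan_stages())       # default strategy
print(plan_stages("ptq"))
```

Under this sketch, `'qat'` is rejected with a `ValueError`, mirroring its removal from the supported list.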

@@ -403,7 +387,7 @@ python compress_qa.py \
403387

404388
### Paddle2ONNX deployment
405389

406-
For ONNX export and ONNXRuntime deployment, see the [ONNX export and ONNXRuntime deployment guide](../model_zoo/ernie-3.0/deploy/paddle2onnx/README.md)
390+
For ONNX export and ONNXRuntime deployment, see the [ONNX export and ONNXRuntime deployment guide](./deploy/paddle2onnx/README.md)
407391

408392

409393
### Paddle Lite mobile deployment

model_zoo/ernie-3.0/compress_qa.py

Lines changed: 1 addition & 13 deletions
@@ -17,7 +17,6 @@
1717
from functools import partial
1818

1919
import paddle
20-
import paddle.nn.functional as F
2120

2221
from paddlenlp.data import DataCollatorWithPadding
2322
from paddlenlp.trainer import PdArgumentParser, CompressionArguments, Trainer
@@ -115,16 +114,6 @@ def post_processing_function(examples, features, predictions, stage="eval"):
115114
} for ex in examples]
116115
return EvalPrediction(predictions=predictions, label_ids=references)
117116

118-
def criterion(outputs, label):
119-
start_logits, end_logits = outputs
120-
start_position, end_position = label
121-
start_position = paddle.unsqueeze(start_position, axis=-1)
122-
end_position = paddle.unsqueeze(end_position, axis=-1)
123-
start_loss = F.cross_entropy(input=start_logits, label=start_position)
124-
end_loss = F.cross_entropy(input=end_logits, label=end_position)
125-
loss = (start_loss + end_loss) / 2
126-
return loss
127-
128117
trainer = QuestionAnsweringTrainer(
129118
model=model,
130119
args=compression_args,
@@ -133,8 +122,7 @@ def criterion(outputs, label):
133122
eval_examples=eval_examples,
134123
data_collator=data_collator,
135124
post_process_function=post_processing_function,
136-
tokenizer=tokenizer,
137-
criterion=criterion)
125+
tokenizer=tokenizer)
138126

139127
compression_args.print_config()
140128

paddlenlp/trainer/compression_args.py

Lines changed: 13 additions & 61 deletions
@@ -46,7 +46,7 @@ class CompressionArguments(TrainingArguments):
4646
default="dynabert+ptq",
4747
metadata={
4848
"help":
49-
"Compression strategy. It supports 'dynabert+ptq', 'dynabert', 'ptq' and 'qat' now."
49+
"Compression strategy. It supports 'dynabert+ptq', 'dynabert' and 'ptq' now."
5050
},
5151
)
5252
# dynabert
@@ -69,25 +69,6 @@ class CompressionArguments(TrainingArguments):
6969
metadata={
7070
"help": "Linear warmup over warmup_ratio fraction of total steps."
7171
})
72-
# quant
73-
weight_quantize_type: Optional[str] = field(
74-
default='channel_wise_abs_max',
75-
metadata={
76-
"help":
77-
"Quantization type for weights. Supports 'abs_max' and 'channel_wise_abs_max'. " \
78-
"This param only specifies the fake ops in saving quantized model, and " \
79-
"we save the scale obtained by post training quantization in fake ops. " \
80-
"Compared to 'abs_max' the model accuracy is usually higher when it is " \
81-
"'channel_wise_abs_max'."
82-
}, )
83-
activation_quantize_type: Optional[str] = field(
84-
default=None,
85-
metadata={
86-
"help":
87-
"Support 'abs_max', 'range_abs_max' and 'moving_average_abs_max'. " \
88-
"In strategy 'ptq', it defaults to 'range_abs_max' and in strategy " \
89-
"'qat', it defaults to 'moving_average_abs_max'."
90-
}, )
9172
# ptq:
9273
algo_list: Optional[List[str]] = field(
9374
default=None,
@@ -119,7 +100,15 @@ class CompressionArguments(TrainingArguments):
119100
"List of batch_size. 'batch_size' is the batch of data loader."
120101
},
121102
)
122-
103+
weight_quantize_type: Optional[str] = field(
104+
default='channel_wise_abs_max',
105+
metadata={
106+
"help":
107+
"Support 'abs_max' and 'channel_wise_abs_max'. This param only specifies " \
108+
"the fake ops in saving quantized model, and we save the scale obtained " \
109+
"by post training quantization in fake ops. Compared to 'abs_max', " \
110+
"the model accuracy is usually higher when it is 'channel_wise_abs_max'."
111+
}, )
123112
round_type: Optional[str] = field(
124113
default='round',
125114
metadata={
@@ -146,45 +135,16 @@ class CompressionArguments(TrainingArguments):
146135
"is None."
147136
},
148137
)
149-
# qat
150-
activation_preprocess_type: Optional[str] = field(
151-
default=None,
152-
metadata={
153-
"help":
154-
"Method of preprocessing the activation value of the quantitative " \
155-
"model. Currently, PACT method is supported. If necessary, it can be " \
156-
"set to 'PACT'. The default value is None, which means that no " \
157-
"preprocessing is performed on the active value."
158-
},
159-
)
160-
weight_preprocess_type: Optional[str] = field(
161-
default=None,
162-
metadata={
163-
"help":
164-
"Method of preprocessing the weight parameters of the quantitative " \
165-
"model. Currently, method 'PACT' is supported. If necessary, it can " \
166-
"be set to 'PACT'. The default value is None, which means that " \
167-
"no preprocessing is performed on weights."
168-
},
169-
)
170-
moving_rate: Optional[float] = field(
171-
default=0.9,
172-
metadata={
173-
"help": "The decay coefficient of moving average. Defaults to 0.9."
174-
},
175-
)
176138

177139
def print_config(self, args=None, key=""):
178140
"""
179141
Prints all config values.
180142
"""
181143

182144
compression_arg_name = [
183-
'strategy', 'width_mult_list', 'batch_num_list', 'bias_correction',
145+
'width_mult_list', 'batch_num_list', 'bias_correction',
184146
'round_type', 'algo_list', 'batch_size_list', 'strategy',
185-
'weight_quantize_type', 'activation_quantize_type',
186-
'input_infer_model_path', 'activation_preprocess_type',
187-
'weight_preprocess_type', 'moving_rate'
147+
'weight_quantize_type', 'input_infer_model_path'
188148
]
189149
default_arg_dict = {
190150
"width_mult_list": ['3/4'],
@@ -203,9 +163,7 @@ def print_config(self, args=None, key=""):
203163
"'dynabert' and 'ptq'. `width_mult_list` is needed in " \
204164
"`dynabert`, and `algo_list`, `batch_num_list`, `batch_size_list`," \
205165
" `round_type`, `bias_correction`, `weight_quantize_type`, " \
206-
"`input_infer_model_path` are needed in 'ptq'. `activation_preprocess_type'`, " \
207-
"'weight_preprocess_type', 'moving_rate', 'weight_quantize_type', " \
208-
"and 'activation_quantize_type' are needed in 'qat'."
166+
"`input_infer_model_path` are needed in 'ptq'. "
209167
)
210168
logger.info('{:30}:{}'.format("paddle commit id",
211169
paddle.version.commit))
@@ -218,12 +176,6 @@ def print_config(self, args=None, key=""):
218176
if v is None and arg in default_arg_dict:
219177
v = default_arg_dict[arg]
220178
setattr(args, arg, v)
221-
elif v is None and arg == 'activation_quantize_type':
222-
if key == "Compression" and 'ptq' in args.strategy:
223-
setattr(args, arg, 'range_abs_max')
224-
elif key == "Compression" and 'qat' in args.strategy:
225-
setattr(args, arg, 'moving_average_abs_max')
226-
227179
if not isinstance(v, types.MethodType):
228180
logger.info('{:30}:{}'.format(arg, v))
229181
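The default-filling step that remains in `print_config` after this revert (replace a `None` value with the entry from `default_arg_dict`, if one exists) can be tried in isolation. The sketch below is an assumption-laden reduction: `fill_defaults` is a hypothetical helper, `SimpleNamespace` stands in for the real arguments object, and only the `width_mult_list` default visible in the diff is included.

```python
from types import SimpleNamespace


def fill_defaults(args, arg_names, default_arg_dict):
    """Replace None-valued arguments with their defaults, in place."""
    for arg in arg_names:
        v = getattr(args, arg)
        if v is None and arg in default_arg_dict:
            v = default_arg_dict[arg]
            setattr(args, arg, v)
    return args


# Only this default is visible in the diff; other entries are elided there.
default_arg_dict = {"width_mult_list": ["3/4"]}

args = SimpleNamespace(width_mult_list=None, round_type="round", algo_list=None)
fill_defaults(args, ["width_mult_list", "round_type", "algo_list"],
              default_arg_dict)
print(args)
```

Arguments with an explicit value are left alone, and arguments with no entry in the dictionary stay `None`, which is why the removed `activation_quantize_type` branch needed its own strategy-dependent handling.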
