
Commit 30fa8b9

[CI] Fix ci of small models (#9633)
1 parent bb103a3 · commit 30fa8b9

File tree: 4 files changed, +7 −6 lines

slm/examples/machine_reading_comprehension/SQuAD/run_squad.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -250,7 +250,7 @@ def run(args):
             partial(prepare_train_features, tokenizer=tokenizer, args=args),
             batched=True,
             remove_columns=column_names,
-            num_proc=4,
+            num_proc=1,
         )
         train_batch_sampler = paddle.io.DistributedBatchSampler(train_ds, batch_size=args.batch_size, shuffle=True)
         train_batchify_fn = DataCollatorWithPadding(tokenizer)
@@ -332,7 +332,7 @@ def run(args):
             partial(prepare_validation_features, tokenizer=tokenizer, args=args),
             batched=True,
             remove_columns=column_names,
-            num_proc=4,
+            num_proc=1,
         )
         dev_batch_sampler = paddle.io.BatchSampler(dev_ds, batch_size=args.batch_size, shuffle=False)
         dev_ds_for_model = dev_ds.remove_columns(["example_id", "offset_mapping"])
```
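The `map` calls touched above follow the Hugging Face `datasets` API. A minimal sketch of the same pattern, using a hypothetical toy dataset and a stand-in preprocessing function (not taken from the commit), illustrates what dropping `num_proc` to 1 changes: features are computed in the main process instead of a pool of worker subprocesses, which avoids multiprocessing start-up and pickling overhead on the tiny datasets used in CI.

```python
from datasets import Dataset

# Hypothetical toy data standing in for the SQuAD examples used in CI.
raw = Dataset.from_dict(
    {"question": ["Who?", "When?"], "context": ["Alice did it.", "In 1999."]}
)

def prepare_features(batch):
    # Stand-in for prepare_train_features: just concatenates the text fields.
    return {"input_text": [q + " " + c for q, c in zip(batch["question"], batch["context"])]}

# num_proc=1 runs the mapping in the current process (no worker pool);
# num_proc=4 would spawn four workers, which is wasteful and flaky for a handful of rows.
features = raw.map(
    prepare_features,
    batched=True,
    num_proc=1,
    remove_columns=raw.column_names,
)
print(features[0])
```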

slm/model_zoo/ernie-3.0/README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -1329,6 +1329,7 @@ with batch_size=32 and 1 and FP16 prediction precision, the accuracy-latency plots on GPU:
 - paddlepaddle >= 2.3
 - paddlenlp >= 2.4
 - paddleslim >= 2.4
+- evaluate
 
 ### Data Preparation
 The fine-tuning data is mainly drawn from the CLUE benchmark, which covers three categories of datasets: text classification, entity extraction, and question answering. The CLUE benchmark data is already integrated into PaddleNLP's datasets and can be loaded as follows
```
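The README paragraph above ends by pointing at PaddleNLP's built-in CLUE loaders. A minimal sketch, assuming the TNEWS classification subset as the example task (the subset name is illustrative, not quoted from this commit):

```python
from paddlenlp.datasets import load_dataset

# Load the TNEWS text-classification subset of the CLUE benchmark from
# PaddleNLP's built-in datasets; other CLUE tasks are loaded the same way.
train_ds, dev_ds = load_dataset("clue", "tnews", splits=("train", "dev"))

# Each example is a dict holding the raw text and its label.
print(train_ds[0])
```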

slm/model_zoo/ernie-3.0/run_qa.py

Lines changed: 3 additions & 3 deletions
```diff
@@ -105,7 +105,7 @@ def main():
         train_dataset = train_dataset.map(
             partial(prepare_train_features, tokenizer=tokenizer, args=data_args),
             batched=True,
-            num_proc=4,
+            num_proc=1,
             batch_size=4,
             remove_columns=column_names,
             load_from_cache_file=not data_args.overwrite_cache,
@@ -118,7 +118,7 @@ def main():
         eval_dataset = eval_examples.map(
             partial(prepare_validation_features, tokenizer=tokenizer, args=data_args),
             batched=True,
-            num_proc=4,
+            num_proc=1,
             batch_size=4,
             remove_columns=column_names,
             load_from_cache_file=not data_args.overwrite_cache,
@@ -132,7 +132,7 @@ def main():
         predict_dataset = predict_examples.map(
             partial(prepare_validation_features, tokenizer=tokenizer, args=data_args),
             batched=True,
-            num_proc=4,
+            num_proc=1,
             batch_size=4,
             remove_columns=column_names,
             load_from_cache_file=not data_args.overwrite_cache,
```

slm/model_zoo/ernie-3.0/run_token_cls.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -19,7 +19,7 @@
 import numpy as np
 import paddle
 import paddle.nn as nn
-from datasets import load_metric
+from evaluate import load as load_metric
 from utils import DataArguments, ModelArguments, load_config, token_convert_example
 
 import paddlenlp
```
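The import swap above replaces the removed `datasets.load_metric` with the standalone `evaluate` package, which is why `evaluate` is added to the README requirements. A minimal sketch of how the aliased import is typically used, assuming the seqeval sequence-labelling metric (the metric actually loaded by run_token_cls.py is not shown in this diff):

```python
from evaluate import load as load_metric  # same alias as in the patched script

# Assumption: a token-classification metric; requires the seqeval package.
metric = load_metric("seqeval")

result = metric.compute(
    predictions=[["B-PER", "O", "B-LOC"]],
    references=[["B-PER", "O", "O"]],
)
print(result["overall_f1"])
```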
