@lugimzzz lugimzzz commented Feb 23, 2023

PR types

New features

PR changes

APIs

Description

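Add a predict function

  • Rewrite _override_hp() so that it no longer artificially adds prefixes such as TrainingArguments. With the prefixes, a setup containing both a prompt model and a fine-tuned model has to define both "TrainingArguments.max_steps": 5 and "PromptTuningArguments.max_steps" in override_hp.
  • Add a predict function and modify the evaluate function so that both align with Trainer.
  • The PromptTrainer evaluate function was not aligned with Trainer, so get_eval_dataloader was rewritten.
    - The data preprocessing function is extracted into a separate function.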
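One way to read the first bullet is that a single un-prefixed key should apply to whichever argument class owns a field of that name. The sketch below illustrates that idea only; `TrainingArgs`, `PromptTuningArgs`, and `override_hp` are hypothetical stand-ins written for this example and are not the PaddleNLP classes or the `_override_hp()` implementation from this PR.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Dict


@dataclass
class TrainingArgs:
    """Hypothetical stand-in for a fine-tuning argument class."""
    max_steps: int = 100
    learning_rate: float = 5e-5


@dataclass
class PromptTuningArgs(TrainingArgs):
    """Hypothetical stand-in for a prompt-tuning argument class."""
    ppt_learning_rate: float = 1e-4


def override_hp(args: Any, overrides: Dict[str, Any]) -> Any:
    """Return a copy of `args` with every matching un-prefixed key applied.

    Keys that do not name a field of `args` are ignored, so one override
    dict can be shared by several argument classes.
    """
    valid = {f.name for f in fields(args)}
    applicable = {k: v for k, v in overrides.items() if k in valid}
    return replace(args, **applicable)


# A single un-prefixed key covers both kinds of model.
overrides = {"max_steps": 5}
finetune_args = override_hp(TrainingArgs(), overrides)
prompt_args = override_hp(PromptTuningArgs(), overrides)
```

With this shape, the caller no longer needs one key per argument class; whether unknown keys should be ignored or rejected is a design choice this sketch does not settle.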

paddle-bot bot commented Feb 23, 2023

Thanks for your contribution!

codecov bot commented Feb 23, 2023

Codecov Report

Merging #4967 (f0da8a1) into develop (f354fe6) will increase coverage by 1.31%.
The diff coverage is 95.91%.

@@             Coverage Diff             @@
##           develop    #4967      +/-   ##
===========================================
+ Coverage    46.32%   47.63%   +1.31%     
===========================================
  Files          448      453       +5     
  Lines        64694    65455     +761     
===========================================
+ Hits         29967    31177    +1210     
+ Misses       34727    34278     -449     
| Impacted Files | Coverage Δ |
|---|---|
| ...addlenlp/experimental/autonlp/auto_trainer_base.py | 89.60% <83.33%> (-0.57%) ⬇️ |
| ...dlenlp/experimental/autonlp/text_classification.py | 97.09% <96.55%> (-0.12%) ⬇️ |
| paddlenlp/prompt/prompt_trainer.py | 68.78% <100.00%> (+2.11%) ⬆️ |
| paddlenlp/transformers/chineseclip/modeling.py | 82.94% <0.00%> (-2.54%) ⬇️ |
| paddlenlp/transformers/ernie_vil/modeling.py | 76.36% <0.00%> (-0.94%) ⬇️ |
| paddlenlp/transformers/bert/modeling.py | 89.71% <0.00%> (-0.58%) ⬇️ |
| paddlenlp/utils/doc_parser.py | 11.26% <0.00%> (-0.04%) ⬇️ |
| paddlenlp/transformers/__init__.py | 100.00% <0.00%> (ø) |
| paddlenlp/transformers/auto/modeling.py | 82.74% <0.00%> (ø) |
| paddlenlp/transformers/auto/tokenizer.py | 84.17% <0.00%> (ø) |
| ... and 18 more | |


        test_dataset = self._map_dataset(test_dataset)
        return super(PromptTrainer, self).get_test_dataloader(test_dataset)

    def get_eval_dataloader(self, eval_dataset: Optional[Dataset] = None) -> DataLoader:
Contributor Author

@LemonNoel, please review the prompt changes and check whether there are any problems.

Contributor

Calling evaluate multiple times in the same program may cause problems; the do_eval=True case needs to be verified.

Contributor Author

Verified both by running it and by checking the Trainer internals: there is no problem. The prompt trainer currently does not support eval_dataset being a dictionary (that is, passing multiple eval datasets), so the code is unaffected for now. If dictionary eval_dataset support is added later, the get_eval_dataloader logic will need to be updated accordingly.
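To make the dictionary case concrete, here is a minimal sketch of how a preprocessing step could handle either a single dataset or a dictionary of named datasets. It uses plain Python lists as datasets; `map_eval_dataset` and `add_template` are hypothetical names made up for this illustration and are not part of PromptTrainer or of this PR.

```python
from typing import Callable, Dict, List, Union

Example = Dict[str, str]
DatasetLike = List[Example]


def map_eval_dataset(
    eval_dataset: Union[DatasetLike, Dict[str, DatasetLike]],
    preprocess: Callable[[Example], Example],
) -> Union[DatasetLike, Dict[str, DatasetLike]]:
    """Apply `preprocess` to one dataset, or to each dataset in a dict."""
    if isinstance(eval_dataset, dict):
        # Several named eval datasets: preprocess each one separately.
        return {
            name: [preprocess(example) for example in dataset]
            for name, dataset in eval_dataset.items()
        }
    # Single eval dataset: preprocess it directly.
    return [preprocess(example) for example in eval_dataset]


def add_template(example: Example) -> Example:
    """Hypothetical preprocessing step that wraps the text in a prompt."""
    return {**example, "text": "Review: " + example["text"]}


single = map_eval_dataset([{"text": "good"}], add_template)
multiple = map_eval_dataset(
    {"dev": [{"text": "good"}], "test": [{"text": "bad"}]}, add_template
)
```

Both branches build new lists instead of modifying the input, so calling the helper repeatedly on the same raw dataset does not apply the template twice.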

@sijunhe sijunhe left a comment

lgtm!

@sijunhe sijunhe merged commit f38a255 into PaddlePaddle:develop Feb 24, 2023
@lugimzzz lugimzzz deleted the PREDICT branch February 24, 2023 07:24