add sentence & character level data augmentation api #4194
Thanks for your contribution!
Codecov Report
@@            Coverage Diff             @@
##           develop    #4194     +/-   ##
===========================================
+ Coverage    40.10%   41.54%   +1.44%
===========================================
  Files          439      438       -1
  Lines        61568    62142     +574
===========================================
+ Hits         24689    25816    +1127
+ Misses       36879    36326     -553
`tests/dataaug` needs to be added to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pyproject.toml#L15 ; otherwise the unit tests will not run.
tests/dataaug/test_word_aug.py
Outdated
@@ -0,0 +1,103 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
2022->2023
Fixed.
paddlenlp/dataaug/char.py
Outdated
@@ -0,0 +1,560 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
2022->2023
Same as above.
Added.
LGTM
PR types
New features
PR changes
APIs
Description
Adds sentence-level and character-level data augmentation strategies, and adds a word-level vocabulary.
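
For readers unfamiliar with the term, character-level augmentation means perturbing individual characters of an input text to produce extra training samples. The snippet below is a minimal, generic sketch of that idea in plain Python. It is not the API added by this PR: the function names `char_delete` and `char_swap` and their parameters are hypothetical and chosen only for illustration.

```python
import random


def char_delete(text, n=1, seed=None):
    """Return a copy of `text` with up to `n` randomly chosen characters removed.

    Hypothetical helper for illustration; not part of paddlenlp.dataaug.
    """
    rng = random.Random(seed)
    chars = list(text)
    # Never delete everything: keep at least one character of non-empty input.
    n = min(n, max(len(chars) - 1, 0))
    # Delete from the highest index down so earlier indices stay valid.
    for index in sorted(rng.sample(range(len(chars)), n), reverse=True):
        del chars[index]
    return "".join(chars)


def char_swap(text, n=1, seed=None):
    """Return a copy of `text` after `n` random swaps of adjacent characters.

    Hypothetical helper for illustration; not part of paddlenlp.dataaug.
    """
    rng = random.Random(seed)
    chars = list(text)
    if len(chars) < 2:
        return text  # nothing to swap
    for _ in range(n):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


if __name__ == "__main__":
    sample = "data augmentation"
    print(char_delete(sample, n=2, seed=0))
    print(char_swap(sample, n=2, seed=0))
```

Passing a fixed `seed` makes each call reproducible, which is convenient when such helpers are exercised in unit tests.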