Add round_to feature to BERT&XLNet finetuning scripts #1133
Conversation
deleting trailing white space
merge from master
merge from master
Codecov Report
@@            Coverage Diff            @@
##           master   #1133    +/-    ##
=========================================
- Coverage   88.39%   87.39%    -1%
=========================================
  Files          71       71
  Lines        6703     6703
=========================================
- Hits         5925     5858    -67
- Misses        778      845    +67
@TaoLv Sorry for the late update. Does this round_to feature meet your needs?
Job PR-1133/1 is complete.
LGTM from my side. I think padding to a multiple of 8 makes vectorization easier and reduces the number of specialized kernels.
So if we want each batch to have the same sentence length of 128, we need to set round_to to 128, right? Also, please be more specific about whether it rounds up or rounds down.
@TaoLv Yes, you can set round_to to 128 if you set max_len <= 128. It rounds up. Thank you for pointing this out; I will make the description clearer.
Thank you @zburning . Could you please also clarify what will happen if the value of |
So the final length can be larger than the max_len given on the command line? Did I make any mistake in the table below?
Job PR-1133/2 is complete.
@TaoLv Yes, you are right.
Got it. Thank you for the explanation. So even |
Yes, it is a bit confusing. The clearest approach is to always set round_to=max_len, but keeping round_to as a separate argument makes it more flexible for other requirements.
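To make the interaction between max_len and round_to concrete, here is a minimal sketch in plain Python. It assumes that sequences are first truncated to max_len and that the batch is then padded to the longest remaining length, rounded up to a multiple of round_to. The helper names `round_up` and `padded_batch_length` are hypothetical and are not taken from the finetuning scripts.

```python
def round_up(length, round_to):
    """Return the smallest multiple of round_to that is >= length."""
    return ((length + round_to - 1) // round_to) * round_to


def padded_batch_length(seq_lengths, max_len, round_to=None):
    """Padded length of one batch (hypothetical helper, assumed semantics).

    Assumption: each sequence is truncated to max_len first, then the batch
    is padded to its longest sequence, rounded up to a multiple of round_to.
    """
    longest = min(max(seq_lengths), max_len)
    if round_to is None:
        return longest
    return round_up(longest, round_to)


# round_to == max_len: every batch ends up with the same length.
same_a = padded_batch_length([37, 50], max_len=128, round_to=128)
same_b = padded_batch_length([90, 128], max_len=128, round_to=128)

# round_to == 8: the length varies per batch but stays a multiple of 8.
bucketed = padded_batch_length([37, 50], max_len=128, round_to=8)

# max_len not a multiple of round_to: the result can exceed max_len.
exceeds = padded_batch_length([100, 64], max_len=100, round_to=8)
```

Under these assumptions, setting round_to=max_len yields a single fixed length, while a max_len that is not a multiple of round_to can produce a padded length above max_len, which matches the behaviour discussed above.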
Please add a test case; then LGTM
@leezu For batchify.Pad(), there are already test cases. Do you mean checking the model outputs with round_to versus the outputs without round_to?
Job PR-1133/3 is complete.
Job PR-1133/4 is complete.
Job PR-1133/5 is complete.
Job PR-1133/6 is complete.
Job PR-1133/7 is complete.
Job PR-1133/8 is complete.
Job PR-1133/9 is complete.
Job PR-1133/10 is complete.
Description
Add a round_to feature so that the padded dimension is rounded up to a multiple of this argument.
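As an illustration of the described behaviour, the following is a minimal sketch in plain Python, not the code changed in this PR. The function `pad_batch` and its parameters are hypothetical stand-ins that mimic padding a batch along one dimension with an optional round_to.

```python
def pad_batch(sequences, pad_val=0, round_to=None):
    """Pad all sequences to a common length (hypothetical sketch).

    The target length is the longest sequence in the batch. If round_to is
    given, the target is rounded up to the next multiple of round_to.
    """
    target = max(len(seq) for seq in sequences)
    if round_to is not None:
        # Ceiling division, then scale back up to a multiple of round_to.
        target = -(-target // round_to) * round_to
    return [list(seq) + [pad_val] * (target - len(seq)) for seq in sequences]


tokens = [[1, 2, 3], [4, 5, 6, 7, 8]]

plain = pad_batch(tokens)                # padded to the longest length, 5
rounded = pad_batch(tokens, round_to=8)  # padded to 8, the next multiple of 8
```

Without round_to, the padded length follows the longest sequence in each batch; with round_to=8, it is always a multiple of 8, so fewer distinct shapes reach the model.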