Add canine in documentation_tests_file #18225
The documentation is not available anymore as the PR was closed or merged.
@ydshieh Requesting your review.
Sorry, I missed this PR. Looking at it now.
@oneraghavan, the doctest would fail for … Would you like to follow the changes in PR #16441 for … Don't hesitate if you have any questions. Thank you!
@ydshieh I will add those changes. Could you please reopen this PR?
@oneraghavan Thank you 🤗. I reopened the PR. Before continuing the work, don't forget to update your local …
3c69d1f to c20b2c7
@ydshieh Could you please reopen the PR again? I have fixed the checkpoints; the tests should pass now.
@ydshieh I think this is good to merge.
@@ -1467,9 +1469,64 @@ def __init__(self, config):
@add_start_docstrings_to_model_forward(CANINE_INPUTS_DOCSTRING.format("batch_size, sequence_length"))
@add_code_sample_docstrings(
processor_class=_TOKENIZER_FOR_DOC,
checkpoint=_CHECKPOINT_FOR_DOC,
checkpoint="aliosm/sha3bor-poetry-diacritizer-canine-s",
This checkpoint is for Arabic text and was trained for diacritization.
"LABEL_3",
"LABEL_3",
],
expected_loss=0.46,
This list is too long and doesn't really provide meaningful information, as the model is character-based. If the list is short, we could accept it (as done in a few other doc examples).
@oneraghavan The doctest for this model indeed passes!
However, I have some concerns about `CanineForTokenClassification`.
- The checkpoint is for Arabic text and was trained for diacritization.
- The model is character-based, and the expected output contains a very long list of `LABEL_X`.
I would suggest not using `add_code_sample_docstrings` for `CanineForTokenClassification`, but instead overwriting the code sample in the modeling file without checking expected outputs.
@sgugger Any opinion on this?
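The suggestion above (a hand-written code sample whose outputs are not checked) can be illustrated with the standard-library `doctest` module. This is only a sketch: `classify_characters` and its labels are hypothetical stand-ins, not the CANINE model or the library's tooling. The point is that a doctest line which assigns its result prints nothing, so there is no expected output to compare.

```python
import doctest


def classify_characters(text):
    """Hypothetical stand-in for a character-level token classifier.

    Example (hand-written, with no expected outputs to verify):

    >>> predicted = classify_characters("hello")
    >>> n_labels = len(predicted)
    """
    # One placeholder label per character, as a character-based model would emit.
    return [f"LABEL_{ord(ch) % 4}" for ch in text]


def run_docstring_examples(func):
    """Run the doctest examples in func's docstring and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring_examples(classify_characters)
print(results)
```

Both examples still execute, so the sample is checked for runtime errors even though no output is asserted.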
Yes, agreed @ydshieh. In general, any result that is LABEL_0 or a list of those should really not be included.
@ydshieh I agree that LABEL_X is not very meaningful. Duplicating the function will make later debugging hard, so I will remove the test for token classification. @sgugger Can we make the add_code_sample_docstrings decorator treat the expected output as optional? For example, if no expected output is given, it would simply not validate the output.
I don't think we have an easy way to ignore the doctest in this case. The >>> predicted_tokens_classes
{expected_output}
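A minimal sketch of why leaving the placeholder empty does not skip the check, using the standard-library `doctest` module. The template here is a simplified, hypothetical stand-in; only its last two lines mirror the fragment quoted above. An expression line always prints its value, and doctest compares that against whatever text follows, including an empty string.

```python
import doctest

# Simplified, hypothetical template; only the last two lines mirror the quoted fragment.
TEMPLATE = """
>>> predicted_tokens_classes = ["LABEL_0", "LABEL_1"]
>>> predicted_tokens_classes
{expected_output}
"""


def run_sample(expected_output):
    """Fill the template and run it as a doctest, returning (failed, attempted)."""
    source = TEMPLATE.format(expected_output=expected_output)
    test = doctest.DocTestParser().get_doctest(source, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter for this illustration.
    return runner.run(test, out=lambda text: None)


with_output = run_sample("['LABEL_0', 'LABEL_1']")
without_output = run_sample("")
print(with_output, without_output)
```

With the expected output filled in, both examples pass. With an empty placeholder, the second example fails because the printed list is compared against nothing.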
@ydshieh @sgugger Can we add a parameter to the add_code_sample_docstrings function and leave its default as None? Then, in places where we need a custom sample, we can pass it from there. The function definition will look like this: def add_code_sample_docstrings( … Inside, I can use code_sample if it has been passed, or otherwise look up the code sample from the templates. Let me know if this is okay.
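The proposal in this comment can be sketched roughly as follows. Everything here is hypothetical: `add_code_sample_docstrings_sketch`, `SAMPLE_TEMPLATES`, and the placeholder checkpoint name are illustrative stand-ins and do not reproduce the library's real decorator or its signature. Note that the maintainers' replies that follow decline this approach.

```python
# Hypothetical stand-in for the library's per-task sample templates.
SAMPLE_TEMPLATES = {
    "TokenClassification": 'Example:\n\n>>> checkpoint = "{checkpoint}"\n',
}


def add_code_sample_docstrings_sketch(task, checkpoint, code_sample=None):
    """Append a code sample to the decorated function's docstring.

    If code_sample is given it is used verbatim; otherwise the sample is
    looked up in SAMPLE_TEMPLATES and filled in with the checkpoint name.
    """

    def decorator(fn):
        if code_sample is not None:
            sample = code_sample
        else:
            sample = SAMPLE_TEMPLATES[task].format(checkpoint=checkpoint)
        fn.__doc__ = (fn.__doc__ or "") + "\n" + sample
        return fn

    return decorator


CUSTOM_SAMPLE = "Example:\n\n>>> outputs = None  # custom, hand-written sample\n"


@add_code_sample_docstrings_sketch("TokenClassification", "some-org/some-checkpoint")
def forward_with_template():
    """Forward pass."""


@add_code_sample_docstrings_sketch(
    "TokenClassification", "some-org/some-checkpoint", code_sample=CUSTOM_SAMPLE
)
def forward_with_custom_sample():
    """Forward pass."""


print(forward_with_template.__doc__)
print(forward_with_custom_sample.__doc__)
```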
I don't see how the latest change is better than just putting the docstring under … I will leave @sgugger to give his opinion.
We don't need any other tooling here. Either the model falls in the "automatic docstring" category or it does not. If it does not, we just write the docstring (with the replace return decorator).
What does this PR do?
modeling_canine has doctests set up but is not included in documentation_tests.txt; this PR adds it.
Fixes #16292
Who can review?
@ArthurZucker
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.