Fix custom tokenizers test #19052
Conversation
  elif is_tf_available():
      returned_tensor = "tf"
- else:
+ elif is_flax_available():
This test was failing when no framework (PyTorch, TF, or Flax) was installed. It now returns early instead of failing.
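A minimal sketch of the fixed selection logic, assuming the availability helpers exported by transformers (is_torch_available, is_tf_available, is_flax_available); the test body is paraphrased here for illustration, not the exact file contents of the PR:

# Sketch only: paraphrases the fixed framework-selection logic.
from transformers.utils import is_flax_available, is_tf_available, is_torch_available

def pick_returned_tensor():
    if is_torch_available():
        return "pt"
    elif is_tf_available():
        return "tf"
    elif is_flax_available():
        return "jax"
    # No framework installed: signal the caller to skip rather than fail.
    return None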
The documentation is not available anymore as the PR was closed or merged.
LGTM, thanks.
But I don't understand this test well: why do we test tokenization for these 3 models only? Also, from a quick search I can see the @custom_tokenizers decorator is applied to cpm and BertJapanese, but here we run tests for openai and clip? I must be missing some context.
@ydshieh Those three files contain tests that require specific extra dependencies to be installed (ftfy for openai and clip), so in the other test jobs those tests are never run.
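For context, a hedged sketch of how an opt-in decorator like @custom_tokenizers can be built with unittest; the RUN_CUSTOM_TOKENIZERS variable name follows the pattern transformers uses for gated tests, but the implementation below is illustrative, not the library's exact code:

import os
import unittest

# Assumed opt-in switch; set RUN_CUSTOM_TOKENIZERS=1 to enable these tests.
_run_custom_tokenizers = os.environ.get("RUN_CUSTOM_TOKENIZERS", "0").upper() in ("1", "TRUE", "YES")

def custom_tokenizers(test_case):
    """Skip the decorated test unless custom tokenizer tests are enabled."""
    return unittest.skipUnless(_run_custom_tokenizers, "test of custom tokenizers")(test_case)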
LGTM
What does this PR do?
The custom tokenizers tests were never run because the test fetcher was not executed before we looked at its output to decide whether or not to run the tests. This PR fixes that and also adds the missing tests to the nightly suite.
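To illustrate the ordering bug, here is a hedged sketch of what the fix implies: the fetcher must run before its output is consulted. The script and output file names (utils/tests_fetcher.py, test_list.txt) are assumptions for illustration, not the exact CI configuration:

import subprocess
from pathlib import Path

# Run the test fetcher first so its output file actually exists...
subprocess.run(["python", "utils/tests_fetcher.py"], check=True)

# ...then read the output to decide which tests to launch.
test_list = Path("test_list.txt")
if test_list.exists() and test_list.read_text().strip():
    subprocess.run(["python", "-m", "pytest", *test_list.read_text().split()], check=True)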