Fix train_step, test_step and tests for CLIP #18684
Conversation
The documentation is not available anymore as the PR was closed or merged.
The tests passed. I'm as surprised as everyone else. It's ready for review!
Thanks @Rocketknight1 for the quick fix. I am wondering if it makes sense to wrap the loss computation for TFCLIP into a […]. I also see […] for […]. If doable, maybe it's good to add […]. Just want to hear some opinions. cc @gante @amyeroberts
@ydshieh I don't think that's necessary - the new check we use means we don't need to rely on […].
Just realized that using […]
Looks good 👍 And I agree, relying on a loss output is safer!
Will re-review after fixing errors (I see the latest commit, which replaces the return, broke some tests :D)
Further updates: Now that we're no longer incorrectly skipping tests, this turned up quite a few bugs! The main source of issues is that some more recent models are returning scalar losses, but both our […].
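(As an aside for readers: the comment above is truncated, so the sketch below is only a generic illustration of why a scalar loss can break code that assumes a per-sample loss with a batch dimension. It is not the actual fix from this PR.)

```python
import tensorflow as tf

per_sample_loss = tf.constant([0.2, 0.8])  # shape (2,): one value per example
scalar_loss = tf.constant(0.5)             # shape (): already reduced inside the model

# Code that assumes a batch dimension fails on the scalar case:
print(per_sample_loss[0])  # works
# print(scalar_loss[0])    # raises: index out of range for a 0-d tensor

# One defensive pattern is to give the loss at least one dimension before reducing it:
def normalize_loss(loss: tf.Tensor) -> tf.Tensor:
    if loss.shape.rank == 0:
        loss = tf.expand_dims(loss, 0)
    return tf.reduce_mean(loss)

print(normalize_loss(scalar_loss))      # 0.5
print(normalize_loss(per_sample_loss))  # 0.5
```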
Overall structure looks good to me. Leaving some comments as I was 👀, but just nits.
```python
added_label_names = sorted(list(prepared_for_class.keys() - inputs_dict.keys()), reverse=True)
if not added_label_names:
    continue  # This test is only for models with easily-separable labels
added_label = prepared_for_class[added_label_names[0]]
```
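A toy illustration of what this snippet does (the dicts below are invented for the example, not the real test fixtures): the label columns are the keys present in the inputs prepared with labels but absent from the plain inputs.

```python
# Invented stand-ins for the test fixtures:
inputs_dict = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]]}
prepared_for_class = {**inputs_dict, "labels": [[0, 1, 0]]}

# The set difference isolates the added label columns.
added_label_names = sorted(list(prepared_for_class.keys() - inputs_dict.keys()), reverse=True)
print(added_label_names)  # ['labels']
added_label = prepared_for_class[added_label_names[0]]
print(added_label)        # [[0, 1, 0]]
```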
Can we assume there's only one added label? Not sure if it matters, but if we can and it does, can we add an assert above?
Left a nit, but LGTM! Thanks for working on this!
Co-authored-by: amyeroberts <[email protected]>
CLIP models were not being tested correctly with `fit()` because the test skipped models without a `hf_compute_loss` method. This skip was added to skip base models like `TFBERTModel` that do not have specific output heads and losses. However, it also skips models like CLIP that do not use `compute_loss`/`hf_compute_loss` methods.

The new test checks whether the model's return type dataclass has a `loss` key, which is a more reliable check. Enabling this reveals the bug in `fit()` for TFClip, so this PR also includes fixes to `train_step` and `test_step` for CLIP and models like it that require `return_loss=True` to be passed, but do not set it by default.

Draft for now because this will likely flush out other bugs or cause other problems!
Fixes #18670.
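For illustration, here is a minimal, self-contained sketch of both ideas described above. The names (`ToyOutput`, `ToyContrastiveModel`, `output_class_has_loss`) are made up for the example and are not the actual `transformers` code.

```python
import dataclasses
from typing import Any, Optional

import tensorflow as tf


# The new test inspects the model's output dataclass for a `loss` field instead of
# checking for an `hf_compute_loss` method. `ToyOutput` is a stand-in for an output
# class like TFCLIPOutput, not the real class.
@dataclasses.dataclass
class ToyOutput:
    loss: Optional[Any] = None
    logits: Optional[Any] = None


def output_class_has_loss(output_cls) -> bool:
    return any(field.name == "loss" for field in dataclasses.fields(output_cls))


print(output_class_has_loss(ToyOutput))  # True -> run the fit() test for this model


# A CLIP-like model only computes its loss when `return_loss=True`, so the fix is
# to pass that flag explicitly inside train_step/test_step. The toy model below
# returns a plain dict to keep the example short.
class ToyContrastiveModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.proj = tf.keras.layers.Dense(4)

    def call(self, inputs, return_loss=False, training=False):
        logits = self.proj(inputs)
        outputs = {"logits": logits}
        if return_loss:
            # Stand-in for the contrastive loss the real model would compute.
            outputs["loss"] = tf.reduce_mean(tf.square(logits))
        return outputs

    def train_step(self, data):
        with tf.GradientTape() as tape:
            # The key fix: request the loss explicitly, since it is off by default.
            outputs = self(data, return_loss=True, training=True)
        grads = tape.gradient(outputs["loss"], self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": outputs["loss"]}

    def test_step(self, data):
        outputs = self(data, return_loss=True, training=False)
        return {"loss": outputs["loss"]}


model = ToyContrastiveModel()
model.compile(optimizer="adam")
model.fit(tf.random.normal((8, 3)), epochs=1, verbose=0)
```

The real TF models return `ModelOutput` dataclasses rather than plain dicts, but the two mechanics shown here (inspecting the output class for a `loss` field, and passing `return_loss=True` inside `train_step`/`test_step`) mirror what the PR description outlines.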