Add missing ckpt in config docs #16900
Conversation
The documentation is not available anymore as the PR was closed or merged.
```diff
@@ -31,7 +31,7 @@ class Speech2Text2Config(PretrainedConfig):
     This is the configuration class to store the configuration of a [`Speech2Text2ForCausalLM`]. It is used to
     instantiate an Speech2Text2 model according to the specified arguments, defining the model architecture.
     Instantiating a configuration with the defaults will yield a similar configuration to that of the Speech2Text2
-    [facebook/s2t-small-librispeech-asr](https://huggingface.co/facebook/s2t-small-librispeech-asr) architecture.
+    [facebook/s2t-wav2vec2-large-en-de](https://huggingface.co/facebook/s2t-wav2vec2-large-en-de) architecture.
```
`facebook/s2t-small-librispeech-asr` has `"model_type": "speech_to_text"`, but this docstring is for the Speech2Text2 model (i.e. `speech_to_text_2`).
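The mismatch above can be expressed as a small check. A minimal sketch, assuming the checkpoints' `config.json` contents are available locally (stubbed here as a dict; `checkpoint_matches` is a hypothetical helper, not part of transformers, and only the stubbed entry comes from this discussion):

```python
import json

# Stubbed config.json payload. In practice this would be downloaded
# from the Hugging Face Hub rather than hard-coded.
CONFIGS = {
    "facebook/s2t-small-librispeech-asr": '{"model_type": "speech_to_text"}',
}

def checkpoint_matches(checkpoint: str, expected_model_type: str) -> bool:
    """Return True if the checkpoint's config declares the expected model_type."""
    config = json.loads(CONFIGS[checkpoint])
    return config.get("model_type") == expected_model_type
```

For `Speech2Text2Config` (model type `speech_to_text_2`), the old checkpoint fails this check, which is exactly the bug being fixed.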
Great catch!
Thanks for fixing all of those! It would be awesome to have some kind of quality script to check we don't introduce new faulty checkpoints.
Co-authored-by: Sylvain Gugger <[email protected]>
Yes, I do have a (draft) check locally. I plan to add it in another PR (unless it's necessary to do so in this PR).
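Such a check might start by extracting the Hub links from each config docstring. A minimal sketch of that first step (the regex and helper are assumptions, not the actual draft script mentioned above):

```python
import re

# Example docstring fragment, taken from the diff in this PR.
DOCSTRING = (
    "Instantiating a configuration with the defaults will yield a similar "
    "configuration to that of the Speech2Text2 "
    "[facebook/s2t-wav2vec2-large-en-de]"
    "(https://huggingface.co/facebook/s2t-wav2vec2-large-en-de) architecture."
)

# Matches markdown links of the form [name](https://huggingface.co/repo_id).
CHECKPOINT_RE = re.compile(r"\[([^\]]+)\]\(https://huggingface\.co/([^)\s]+)\)")

def find_checkpoints(docstring: str):
    """Return (link_text, repo_id) pairs for every Hub link in a docstring."""
    return CHECKPOINT_RE.findall(docstring)

# A first sanity check: the link text should equal the repo id.
for text, repo_id in find_checkpoints(DOCSTRING):
    assert text == repo_id, f"link text {text!r} != repo id {repo_id!r}"
```

From there, a full check could fetch each extracted repo's `config.json` and compare its `model_type` against the config class being documented.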
src/transformers/models/squeezebert/configuration_squeezebert.py
src/transformers/models/xlm_roberta/configuration_xlm_roberta.py
Thank you @NielsRogge. I should try to use the correct names, as defined in
Thanks a lot for this PR, awesome that this gets improved.
Left some comments. Just for consistency, I would always use the template:
"will yield a similar configuration to that of the [snake-cased model name] [checkpoint name] architecture".
I will add this to the check I currently have (locally, but will push to another PR), thanks!
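The suggested template could be encoded as a format string that such a check compares docstrings against. A sketch under that assumption (both helpers are hypothetical, not part of the repo):

```python
def expected_doc_sentence(model_name: str, checkpoint: str) -> str:
    # Hypothetical helper encoding the template suggested in the review.
    return (
        f"Instantiating a configuration with the defaults will yield a similar "
        f"configuration to that of the {model_name} "
        f"[{checkpoint}](https://huggingface.co/{checkpoint}) architecture."
    )

def docstring_follows_template(docstring: str, model_name: str, checkpoint: str) -> bool:
    """True if the docstring contains the templated checkpoint sentence."""
    # Collapse whitespace so line-wrapped docstrings still match.
    expected = " ".join(expected_doc_sentence(model_name, checkpoint).split())
    return expected in " ".join(docstring.split())
```

This would flag both missing checkpoint sentences and ones that drift from the agreed wording.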
Co-authored-by: NielsRogge <[email protected]>
Looked at all the speech models - looks good to me!
Merging now. Thanks for the review. With this PR, all configs are good except the following (which is expected, since those composite models don't have full default config arguments; they rely on the encoder and decoder configs).
* add missing ckpt in config docs
* add more missing ckpt in config docs
* fix wrong ckpts
* fix realm ckpt
* fix s2t2
* fix xlm_roberta ckpt
* Fix for deberta v2
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* use only one checkpoint for DPR
* Apply suggestions from code review

Co-authored-by: NielsRogge <[email protected]>

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
What does this PR do?
As discussed on Slack, I worked on the `Config` files to add missing information about checkpoints, or to correct it. The docstrings state, e.g. for Speech2Text2: "Instantiating a configuration with the defaults will yield a similar configuration to that of the [mentioned checkpoint]". Some arguments (e.g. `hidden_dim`, `num_layers`) might differ from those of the mentioned checkpoint, but the docstring only claims "similar", so I think it is fine (..?).

@patrickvonplaten Could you take a look at the speech models?
@NielsRogge Could you take a look at the vision models?