
Conversation

@ydshieh ydshieh (Collaborator) commented Apr 22, 2022

What does this PR do?

As discussed on Slack, I worked on the config files to add missing information about checkpoints, or to correct wrong ones.

  • I tried to check that the mentioned checkpoints actually exist on the Hub (a sketch of such a check follows this list)
  • I also tried to make sure the checkpoints are for the target architecture
  • I didn't verify the statement "Instantiating a configuration with the defaults will yield a similar configuration to that of the Speech2Text2 [mentioned checkpoint]"
    • in particular, hyperparameters like hidden_dim and num_layers might differ between the defaults and the checkpoint
    • it says similar, so I think it is fine (..?)
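For reference, here is a minimal sketch of what such a check could look like (an illustration, not the actual script used for this PR; `check_checkpoint` is a hypothetical helper name):

```python
# Hypothetical sketch: verify that a checkpoint exists on the Hub and that its
# model_type matches the config class whose docstring mentions it.
import json

from huggingface_hub import hf_hub_download
from transformers import Speech2Text2Config


def check_checkpoint(checkpoint: str, config_class) -> bool:
    try:
        # Download only the checkpoint's config.json (cheap existence check).
        config_path = hf_hub_download(repo_id=checkpoint, filename="config.json")
    except Exception:
        return False  # repo or its config.json is not reachable on the Hub
    with open(config_path) as f:
        cfg = json.load(f)
    # Composite checkpoints (e.g. speech-encoder-decoder) nest the relevant
    # model_type one level down, under "encoder"/"decoder".
    candidates = [cfg.get("model_type")]
    for key in ("encoder", "decoder"):
        if isinstance(cfg.get(key), dict):
            candidates.append(cfg[key].get("model_type"))
    return config_class.model_type in candidates


# The exact mismatch fixed in the diff below:
print(check_checkpoint("facebook/s2t-small-librispeech-asr", Speech2Text2Config))  # False
print(check_checkpoint("facebook/s2t-wav2vec2-large-en-de", Speech2Text2Config))   # True
```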

@patrickvonplaten Could you take a look at the speech models?
@NielsRogge Could you take a look at the vision models?

@HuggingFaceDocBuilderDev commented Apr 22, 2022

The documentation is not available anymore as the PR was closed or merged.

```diff
@@ -31,7 +31,7 @@ class Speech2Text2Config(PretrainedConfig):
     This is the configuration class to store the configuration of a [`Speech2Text2ForCausalLM`]. It is used to
     instantiate an Speech2Text2 model according to the specified arguments, defining the model architecture.
     Instantiating a configuration with the defaults will yield a similar configuration to that of the Speech2Text2
-    [facebook/s2t-small-librispeech-asr](https://huggingface.co/facebook/s2t-small-librispeech-asr) architecture.
+    [facebook/s2t-wav2vec2-large-en-de](https://huggingface.co/facebook/s2t-wav2vec2-large-en-de) architecture.
```
ydshieh (Collaborator, Author) commented:

facebook/s2t-small-librispeech-asr has "model_type": "speech_to_text", but this config is for the model Speech2Text2 (i.e. speech_to_text_2).
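For illustration, the mismatch is visible directly from the configs (a minimal sketch, not part of the PR):

```python
from transformers import AutoConfig, Speech2Text2Config

remote = AutoConfig.from_pretrained("facebook/s2t-small-librispeech-asr")
print(remote.model_type)                # speech_to_text   (Speech2Text model)
print(Speech2Text2Config().model_type)  # speech_to_text_2 (Speech2Text2 model)
```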

Contributor commented:

Great catch!

@ydshieh ydshieh marked this pull request as ready for review April 25, 2022 09:05
@sgugger sgugger (Collaborator) left a comment:

Thanks for fixing all of those! It would be awesome to have some kind of quality script to check we don't introduce new faulty checkpoints.

@ydshieh ydshieh (Collaborator, Author) commented Apr 25, 2022

> Thanks for fixing all of those! It would be awesome to have some kind of quality script to check we don't introduce new faulty checkpoints.

Yes, I do have a (draft) check locally. I plan to add it in another PR (unless it's necessary to do it in this one).
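For the record, a hypothetical sketch of what such a quality script might look like (not the draft mentioned above): scan every config class docstring for Hub links and flag unreachable checkpoints.

```python
# Hypothetical sketch of a docstring quality check: find Hub checkpoint links
# in every config class docstring and flag repos that are not reachable.
import re

import requests
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

# Matches markdown links of the form [name](https://huggingface.co/repo)
CHECKPOINT_LINK = re.compile(r"\[([^\]]+)\]\(https://huggingface\.co/([^)]+)\)")

for model_type, config_class in CONFIG_MAPPING.items():
    for name, repo in CHECKPOINT_LINK.findall(config_class.__doc__ or ""):
        if name != repo:
            print(f"{config_class.__name__}: link text {name!r} != repo {repo!r}")
        # A HEAD request on the model page is a cheap existence check.
        if requests.head(f"https://huggingface.co/{repo}", allow_redirects=True).status_code != 200:
            print(f"{config_class.__name__}: checkpoint {repo} not found on the Hub")
```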

@ydshieh ydshieh (Collaborator, Author) commented Apr 25, 2022

Thank you @NielsRogge, I should try to use the correct names as defined in MODEL_NAMES_MAPPING.

@NielsRogge NielsRogge (Contributor) left a comment:

Thanks a lot for this PR, awesome that this gets improved.

Left some comments; just for consistency, I would always use the template:

"will yield a similar configuration to that of the [snake-cased model name] [checkpoint name] architecture".

@ydshieh ydshieh (Collaborator, Author) commented Apr 25, 2022

> Thanks a lot for this PR, awesome that this gets improved.
>
> Left some comments; just for consistency, I would always use the template:
>
> "will yield a similar configuration to that of the [snake-cased model name] [checkpoint name] architecture".

I will add this to the check I currently have locally (it will be pushed in another PR), thanks!
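A possible sketch of the template check (hypothetical; the regex and the `follows_template` helper are mine, not from the draft script):

```python
# Hypothetical sketch: check that a config docstring contains the agreed
# template sentence with a checkpoint link.
import re

TEMPLATE = re.compile(
    r"will yield a similar configuration to that of the .*?"
    r"\[[^\]]+\]\(https://huggingface\.co/[^)]+\) architecture\."
)


def follows_template(config_class) -> bool:
    # Collapse line breaks so the sentence can be matched in one pass.
    doc = " ".join((config_class.__doc__ or "").split())
    return bool(TEMPLATE.search(doc))
```

With the docstring from the diff above, `follows_template(Speech2Text2Config)` would return True.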

@patrickvonplaten patrickvonplaten (Contributor) left a comment:

Looked at all the speech models - looks good to me!

@ydshieh ydshieh (Collaborator, Author) commented Apr 25, 2022

Merging now. Thanks for the reviews.

With this PR, all configs are good except the following, which is expected, since these composite models don't have full default config arguments; they rely on the encoder and decoder configs (a short sketch follows the list):

  • DecisionTransformerConfig
  • VisionEncoderDecoderConfig
  • VisionTextDualEncoderConfig
  • CLIPConfig
  • SpeechEncoderDecoderConfig
  • EncoderDecoderConfig
  • RagConfig
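
For context, a minimal sketch of why these composite configs have no single default checkpoint to point at (EncoderDecoderConfig as an example, assuming a BERT encoder/decoder pair):

```python
# Sketch: a composite config is assembled from its sub-model configs, so the
# "defaults will yield a similar configuration to ..." sentence has no single
# reference checkpoint.
from transformers import BertConfig, EncoderDecoderConfig

encoder_config = BertConfig()
decoder_config = BertConfig(is_decoder=True, add_cross_attention=True)

config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
print(config.model_type)                                     # encoder-decoder
print(config.encoder.model_type, config.decoder.model_type)  # bert bert
```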

@ydshieh ydshieh merged commit 3e47d19 into huggingface:main Apr 25, 2022
@ydshieh ydshieh deleted the add_missing_ckpt_in_config_class_doc branch April 25, 2022 15:31
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* add missing ckpt in config docs

* add more missing ckpt in config docs

* fix wrong ckpts

* fix realm ckpt

* fix s2t2

* fix xlm_roberta ckpt

* Fix for deberta v2

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* use only one checkpoint for DPR

* Apply suggestions from code review

Co-authored-by: NielsRogge <[email protected]>

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: NielsRogge <[email protected]>