pretrained_model setting in hparams.yaml has no effect  #156

@djangodesmet

Description

🐛 Bug

When calling the load_from_checkpoint function to load a model from a checkpoint, the hparams.yaml file located in the checkpoint's parent folder is not taken into account. For example, the pretrained_model setting in hparams.yaml has no effect.

To Reproduce

Contents of hparams.yaml:

activations: Tanh
batch_size: 4
class_identifier: regression_metric
dropout: 0.1
encoder_learning_rate: 1.0e-06
encoder_model: XLM-RoBERTa
final_activation: null
hidden_sizes:
  - 3072
  - 1024
keep_embeddings_frozen: true
layer: mix
layer_norm: false
layer_transformation: sparsemax
layerwise_decay: 0.95
learning_rate: 1.5e-05
loss: mse
nr_frozen_epochs: 0.3
optimizer: AdamW
pool: avg
pretrained_model: /home/jovyan/nllb-gpu/models/xlm-roberta-large
train_data:
  - data/1720-da.csv
validation_data:
  - data/wmt-ende-newstest2021.csv
  - data/wmt-enru-newstest2021.csv
  - data/wmt-zhen-newstest2021.csv

Loading the checkpoint:

comet = load_from_checkpoint(os.path.join(model_path, "checkpoint", "model.ckpt"))

The RoBERTa model is located in models/xlm-roberta-large, as indicated by pretrained_model, but loading fails because the loader still uses the original xlm-roberta-large hub identifier stored in the checkpoint instead of the local path from hparams.yaml. This produces the following error message:

OSError: Can't load tokenizer for 'xlm-roberta-large'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'xlm-roberta-large' is the correct path to a directory containing all relevant files for a XLMRobertaTokenizerFast tokenizer.
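To make the mechanics concrete: load_from_checkpoint restores the hyperparameters that were pickled into model.ckpt at training time and never reads the hparams.yaml next to it. In plain-dict terms (the dicts below are illustrative, not COMET's actual internals), the intended precedence would be:

```python
# Hyperparameters baked into model.ckpt at training time (stale hub id):
ckpt_hparams = {"pretrained_model": "xlm-roberta-large"}

# Hyperparameters from the hparams.yaml shipped next to the checkpoint:
yaml_hparams = {"pretrained_model": "/home/jovyan/nllb-gpu/models/xlm-roberta-large"}

# Current (buggy) behaviour: yaml_hparams is ignored entirely, so the
# tokenizer lookup falls back to the hub id and fails offline.
effective = dict(ckpt_hparams)
print(effective["pretrained_model"])  # xlm-roberta-large

# Intended behaviour: values from hparams.yaml override the checkpoint copy,
# so the tokenizer is loaded from the local directory.
effective = {**ckpt_hparams, **yaml_hparams}
print(effective["pretrained_model"])  # /home/jovyan/nllb-gpu/models/xlm-roberta-large
```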

Expected behaviour

I would expect that the pretrained_model parameter is used to determine the location of the model.

This could be achieved by passing hparams_file as an argument to the model_class.load_from_checkpoint call in models/__init__.py:

from pathlib import Path

import yaml

# Path and yaml are imported here for completeness; str2model and CometModel
# are already defined/imported in models/__init__.py.
def load_from_checkpoint(checkpoint_path: str) -> CometModel:
    """Loads a model from a checkpoint path.

    Args:
        checkpoint_path (str): Path to a model checkpoint.

    Return:
        COMET model.
    """
    checkpoint_path = Path(checkpoint_path)

    if not checkpoint_path.is_file():
        raise Exception(f"Invalid checkpoint path: {checkpoint_path}")

    parent_folder = checkpoint_path.parents[1]  # .parent.parent
    hparams_file = parent_folder / "hparams.yaml"

    if hparams_file.is_file():
        with open(hparams_file) as yaml_file:
            hparams = yaml.load(yaml_file.read(), Loader=yaml.FullLoader)
        model_class = str2model[hparams["class_identifier"]]
        model = model_class.load_from_checkpoint(
            checkpoint_path, load_pretrained_weights=False, hparams_file=hparams_file
        )
        return model
    else:
        raise Exception(f"hparams.yaml file is missing from {parent_folder}!")
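
The parents[1] lookup resolves hparams.yaml relative to the checkpoint as follows (a minimal sketch; the directory name is illustrative, the <model_dir>/checkpoint/model.ckpt layout matches the repro above):

```python
from pathlib import Path


def resolve_hparams_file(checkpoint_path: str) -> Path:
    """Locate hparams.yaml two directory levels above model.ckpt,
    i.e. in the folder that also contains the checkpoint/ subfolder."""
    return Path(checkpoint_path).parents[1] / "hparams.yaml"


print(resolve_hparams_file("comet-model/checkpoint/model.ckpt"))
# comet-model/hparams.yaml
```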

Environment

OS: Linux
Packaging: pip
Version: 2.0.1
