NielsRogge
Contributor

What does this PR do?

This PR simplifies the VideoMAE code examples and adds a seed so that the video classifier always predicts "eating spaghetti" on the example video. Because the frames are sampled randomly, the model can otherwise predict another class, such as "eating ice cream":

    >>> inputs = feature_extractor(list(video), return_tensors="pt")

    >>> with torch.no_grad():
    ...     outputs = model(**inputs)
    ...     logits = outputs.logits

    >>> # model predicts one of the 400 Kinetics-400 classes
    >>> predicted_label = logits.argmax(-1).item()
    >>> print(model.config.id2label[predicted_label])
Expected:
    eating spaghetti
Got:
    eating ice cream

Weirdly, this wasn't caught by the doc test CI. It could be related to the addition of `import numpy as np` to the code snippet.
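
To illustrate why the seed matters, here is a minimal sketch; it is not the code from this PR. The helper `sample_frame_indices`, its parameters, and the `seed` argument are hypothetical names chosen for illustration, and the sketch uses Python's standard `random` module instead of NumPy so that it runs without extra dependencies. The idea is that a random window of frames is chosen from the video, so fixing the seed fixes the clip the model sees:

```python
import random


def sample_frame_indices(clip_len, frame_sample_rate, seg_len, seed=None):
    """Hypothetical helper: pick `clip_len` frame indices from a video of `seg_len` frames.

    A window of `clip_len * frame_sample_rate` frames is placed at a random
    position, and every `frame_sample_rate`-th frame inside it is kept.
    """
    # A dedicated generator keeps the example independent of global RNG state.
    rng = random.Random(seed)
    window = clip_len * frame_sample_rate
    if window >= seg_len:
        raise ValueError("video is too short for the requested clip")
    # Random end of the window (inclusive bounds), from which the start follows.
    end_idx = rng.randint(window, seg_len - 1)
    start_idx = end_idx - window
    return [start_idx + i * frame_sample_rate for i in range(clip_len)]


# Same seed -> same frames -> same model input on every run.
first = sample_frame_indices(clip_len=16, frame_sample_rate=4, seg_len=300, seed=0)
second = sample_frame_indices(clip_len=16, frame_sample_rate=4, seg_len=300, seed=0)
print(first == second)
```

Without a seed, each run may select a different window, which is one plausible way the predicted class could drift between "eating spaghetti" and "eating ice cream". With NumPy-based sampling, calling `np.random.seed(...)` before sampling plays the same role.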

@NielsRogge NielsRogge requested a review from ydshieh September 7, 2022 09:43

Collaborator

@ydshieh ydshieh left a comment

LGTM, thanks @NielsRogge.

I don't know why, but so far I always get "eating spaghetti", despite the random indices (I removed the seed).

Member

@LysandreJik LysandreJik left a comment

Clean!

@NielsRogge NielsRogge merged commit c25f27f into huggingface:main Sep 7, 2022