I tested how robust the VIMA model is to changes in the wording of the task prompt.
For example, I changed this task prompt:
Put the {dragged_texture} object in {scene} into the {base_texture} object.
into:
jfasfo jdfjs {dragged_texture} aosdj sdfj {scene} asoads jsidf {base_texture} aidfoads.
which makes no sense to a human.
I expected the model to perform poorly, but the success rate was still almost 100%.
I need to investigate further, but I suspect the model relies only on the images and has overfitted to them.
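For anyone who wants to reproduce the perturbation, here is a minimal sketch of how the prompt can be scrambled. It is not code from the VIMA repository. `scramble_prompt` and `placeholders` are hypothetical helper names, and the sketch assumes the prompt template is a plain string with `{name}` placeholders. Every word that is not a placeholder is replaced by random lowercase letters, and the placeholders stay in their original order.

```python
import random
import re
import string

# Matches template placeholders such as {dragged_texture} or {scene}.
PLACEHOLDER = re.compile(r"\{[^{}]+\}")


def placeholders(template: str) -> list[str]:
    """Return the placeholders of a template in order of appearance."""
    return PLACEHOLDER.findall(template)


def scramble_prompt(template: str, seed: int = 0) -> str:
    """Replace every non-placeholder word with a random nonsense word.

    Hypothetical helper: assumes words are separated by whitespace and
    that each whitespace-delimited token holds at most one placeholder.
    """
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    scrambled = []
    for token in template.split():
        match = PLACEHOLDER.search(token)
        if match:
            # Keep the placeholder itself, drop any attached punctuation.
            scrambled.append(match.group(0))
        else:
            length = rng.randint(3, 8)
            word = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
            scrambled.append(word)
    return " ".join(scrambled)


original = "Put the {dragged_texture} object in {scene} into the {base_texture} object."
scrambled = scramble_prompt(original)
print(scrambled)
```

Running the same evaluation once with `original` and once with `scrambled` gives a direct comparison of success rates with a meaningful and a meaningless instruction.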