I tried shuffling the object tokens with a fixed order for some tasks, and the result is interesting.

At https://github.com/vimalabs/VIMA/blob/8449837aa453f8ec9ba229cb956e3bbef5c796ea/scripts/example.py#L438 I added these lines:

```
cropped_imgs = [cropped_imgs[i] for i in [0, 2, 1]]
bboxes = [bboxes[i] for i in [0, 2, 1]]
```

The robot now tries to pick up the distractor instead of the dragged object. I didn't make any changes to the prompt.
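
In case it helps anyone reproduce this, here is a minimal standalone sketch of the same reordering. `permute_together` is a hypothetical helper name of mine, not something from the VIMA repo, and the strings and tuples below are placeholders for the real crops and boxes. It only shows that applying one permutation to both lists keeps each crop paired with its own bounding box, so the edit changes the order of the object tokens and nothing else.

```python
# Hypothetical helper for illustration; not part of the VIMA codebase.
def permute_together(order, *sequences):
    """Apply the same index permutation to several parallel lists."""
    n = len(order)
    # Reject anything that is not a true permutation of 0..n-1.
    if sorted(order) != list(range(n)):
        raise ValueError(f"{order} is not a permutation of 0..{n - 1}")
    for seq in sequences:
        if len(seq) != n:
            raise ValueError("every sequence must have len(order) items")
    return tuple([seq[i] for i in order] for seq in sequences)


# Placeholder data standing in for the real cropped images and boxes.
cropped_imgs = ["crop_a", "crop_b", "crop_c"]
bboxes = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]

# Same fixed order as in my edit: keep index 0, swap indices 1 and 2.
shuffled_imgs, shuffled_bboxes = permute_together([0, 2, 1], cropped_imgs, bboxes)

# The (crop, box) pairs are the same set before and after; only order differs.
pairs_before = set(zip(cropped_imgs, bboxes))
pairs_after = set(zip(shuffled_imgs, shuffled_bboxes))
```

Using one helper for both lists avoids accidentally permuting the crops and the boxes with different orders, which would be a different experiment from the one described here.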