As shown below, during training with DPOTrainer the loss goes down to 0.0, yet at the end the reported train_loss is 0.06. In another attempt with a different dataset, the training loss again drops to 0.0 before the end of the epoch, and the eval loss follows at 2.14e-6.
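To make the two numbers easier to compare, here is a minimal sketch of one possible reading. It assumes the default sigmoid DPO loss, `-log(sigmoid(beta * margin))`, and it assumes the final train_loss is an average over all training steps rather than the last logged value. The margins and `beta=0.1` are made-up illustration values, not taken from my run.

```python
import math

def dpo_sigmoid_loss(margin, beta=0.1):
    """Sigmoid DPO loss -log(sigmoid(beta * margin)), computed stably."""
    x = beta * margin
    if x > 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))

# Hypothetical reward margins (chosen minus rejected) growing over training steps.
margins = [0.0, 5.0, 20.0, 80.0, 200.0]
step_losses = [dpo_sigmoid_loss(m) for m in margins]

last_logged = step_losses[-1]                      # what the final log line would show
run_average = sum(step_losses) / len(step_losses)  # assumed meaning of train_loss

print("last logged loss:", round(last_logged, 4))
print("average over run:", round(run_average, 4))
```

Under these assumptions the last logged loss rounds to 0.0 once the margin is large, while the run average stays above it because the early steps start near log(2). That would account for a nonzero train_loss, but not for whether a loss this close to zero is healthy.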