fix: learn the stop tokens when training. #3063
Conversation
Hi @christobill, could you help test with your models?
@congchan yes, it is working with my models. Thank you 🙏
I think llama2 also has the same can't-stop situation. So for llama2
Yes, the stop tokens for llama2 are
Hi @congchan, actually not swapped. I did not clearly set the system prompt, so I just set Llama2's stop token to
My question is that in your listed Llama2 example, the tokenized string ends with
Hi, I mean the official stop tokens are indeed, per the
Yeah, I see, but I still think
Hi @infwinston, could you have a look? 🙏
Hi @infwinston, this PR is ready to be merged. Could you help with a final review and merge, as well as the associated doc PR #3139?
@congchan Do you think it should add
Hi, thanks for the reminder, I did not notice that the original
Why are these changes needed?
Some models need to explicitly learn to generate their stop tokens during training; otherwise the trained models will not stop when serving. This is model-specific behavior and not all models need it, but for compatibility I think it is better to make this setting the default (a minimal sketch of the idea follows the model list below).
Currently, different models behave differently: some need to explicitly learn to emit `</s>`, as we guessed in "Using train_with_template on mistral end up in a model with a loop" (#3055).
Testing with the models below:
Mistral-7B-Instruct-v0.2
Llama2
Qwen1.5-14b
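For illustration, here is a minimal sketch of the idea (not the actual train.py change in this PR): append the template's stop token to each assistant reply before tokenization, so its id ends up in the supervised labels and the fine-tuned model learns to emit it. The helper name `build_training_text` and the "role: text" layout are made up for this example; only the stop-token handling is the point.

```python
# Minimal sketch, not the actual FastChat code: append the conversation
# template's stop token to every assistant reply before tokenization so
# the model is trained to generate it and actually stops at inference time.
def build_training_text(turns, stop_token="</s>"):
    """turns: list of (role, text) pairs; stop_token: template-specific
    stop string (e.g. "</s>" for Mistral-7B-Instruct-v0.2)."""
    pieces = []
    for role, text in turns:
        if role == "assistant":
            # Without the stop token here, the target never contains it,
            # so the trained model may loop instead of stopping (see #3055).
            pieces.append(f"{role}: {text}{stop_token}")
        else:
            pieces.append(f"{role}: {text}")
    return "\n".join(pieces)


if __name__ == "__main__":
    sample = [("user", "Hi"), ("assistant", "Hello!")]
    print(build_training_text(sample))
    # user: Hi
    # assistant: Hello!</s>
```

The tokenizer then sees the stop token as part of the training target, which is the behavior this PR turns on by default.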
Related issue number (if applicable)
#3055
Checks
I've run format.sh to lint the changes in this PR.