
Conversation

Oscarjia

Why are these changes needed?

I created an example file, example_finetune_environment.md. Additionally, I incorporated the newly introduced features into train_with_template.py, aligning with our discussion in pull request #2998.

Related issue number (if applicable)

issues/3054

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

@Oscarjia
Author

Oscarjia commented Feb 20, 2024

@congchan @merrymercy would you mind reviewing the latest "enhance the prompt format and mask function" commit? I think I have fixed some bugs.

For example, the assistant's answer is:

I'm built by researchers

and the mask is:

<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>'m built by researchers

i.e., the "I" was missing (it was wrongly masked).

I have a question regarding the training process. Does the input comprise the formatted prompts along with the assistant's answer?
And are the targets specifically the masked versions of the assistant's answers only? That is, do we calculate the loss by comparing the model-generated responses to the masked target answers? Is my understanding correct?
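
For reference, here is a minimal sketch of how such prompt-masked targets are commonly built for causal-LM fine-tuning. The helper name build_example, the IGNORE_TOKEN_ID value of -100, the tokenizer calls, and the model id are assumptions for illustration only; they are not taken from train_with_template.py:

```python
# A sketch of prompt-masked label construction for causal-LM fine-tuning.
# IGNORE_TOKEN_ID = -100, the tokenizer calls, and the model id below are
# assumptions for illustration; they are not taken from train_with_template.py.
from transformers import AutoTokenizer

IGNORE_TOKEN_ID = -100  # positions carrying this label are skipped by the loss


def build_example(prompt: str, answer: str, tokenizer):
    # Input = formatted prompt followed by the assistant's answer.
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    answer_ids = tokenizer(answer + tokenizer.eos_token, add_special_tokens=False).input_ids
    input_ids = prompt_ids + answer_ids

    # Labels = a copy of the input with every prompt token masked out,
    # so the cross-entropy loss is computed only over the answer tokens.
    labels = [IGNORE_TOKEN_ID] * len(prompt_ids) + answer_ids
    return input_ids, labels


# Example usage (assumed, possibly gated, model id):
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
input_ids, labels = build_example(
    "[INST] Who built you? [/INST] ", "I'm built by researchers", tokenizer
)
```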

I've reviewed the def get_prompt(self) -> str: function in the train_with_template.py file and have a question regarding the input to the language model (LLM) during training. Why isn't the input limited to the user's instructions, without including the assistant's answers? I'm concerned that if the correct answers are also provided, it might lead the LLM to simply copy those responses. Could this approach potentially hinder the model's ability to independently generate or infer answers based on the instructions alone?
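
As a side note on the copying concern, here is a toy illustration (pure Python, no model; the token split is hypothetical) of the standard causal-LM objective: position t only sees tokens up to t and is trained to predict token t+1, so the answer tokens act as targets rather than as visible input for their own prediction:

```python
# Toy illustration of next-token targets with the prompt portion masked out.
# The token split and the IGNORE placeholder are hypothetical, for illustration only.
tokens = ["[INST]", "Who", "built", "you", "?", "[/INST]", "I", "'m", "built", "by", "researchers"]
IGNORE = None  # stand-in for the ignore index (-100 in most trainers)

# Labels are the input shifted left by one; positions before the answer are masked.
answer_start = tokens.index("[/INST]") + 1
shifted = tokens[1:] + ["</s>"]
labels = [tok if i + 1 >= answer_start else IGNORE for i, tok in enumerate(shifted)]

for i, lab in enumerate(labels):
    # At step i the model has only seen tokens[:i + 1] and must predict `lab`.
    print(f"step {i:2d}: sees up to {tokens[i]!r:15} -> target {lab}")
```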

```python
elif self.sep_style == SeparatorStyle.LLAMA2:
    # sep = " ", sep2 = " </s><s>"
    seps = [self.sep, self.sep2]
    if self.system_message:
        ret = system_prompt
    else:
        ret = "[INST] "
    for i, (role, message) in enumerate(self.messages):
        # i % 2 is the remainder of dividing i by 2, so the tag alternates
        # between self.roles[0] ("[INST]") and self.roles[1] ("[/INST]").
        tag = self.roles[i % 2]
        if message:
            if i == 0:
                # First user message: append right after the system block.
                ret += str(message).strip() + " "
            elif len(self.messages) == 2:
                # Single-turn conversation: no trailing separator.
                ret += tag + " " + str(message).strip()
            else:
                ret += tag + " " + str(message).strip() + seps[i % 2]
        else:
            # Empty message: emit only the role tag (generation prompt).
            ret += tag
    # print(f"formatted prompt: {ret}")
    return ret
```
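
To make the branch behavior concrete, here is a small, self-contained paraphrase of that loop with hard-coded roles and separators; the helper name format_llama2 and the system prompt string are placeholders, not FastChat's defaults. Running it shows the string a single-turn conversation is formatted into:

```python
# Standalone sketch of the LLAMA2 formatting loop above, with hard-coded
# roles/separators so the resulting prompt string can be inspected directly.
def format_llama2(messages, system_prompt="[INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\n\n"):
    roles = ("[INST]", "[/INST]")
    seps = [" ", " </s><s>"]  # sep, sep2
    ret = system_prompt if system_prompt else "[INST] "
    for i, (_role, message) in enumerate(messages):
        tag = roles[i % 2]  # alternate [INST] / [/INST]
        if message:
            if i == 0:
                ret += str(message).strip() + " "  # first user turn follows the system block
            elif len(messages) == 2:
                ret += tag + " " + str(message).strip()  # single-turn: no trailing separator
            else:
                ret += tag + " " + str(message).strip() + seps[i % 2]
        else:
            ret += tag  # empty message: emit the tag only (generation prompt)
    return ret


# Single-turn example: user question followed by the assistant's answer.
print(format_llama2([("[INST]", "Who built you?"), ("[/INST]", "I'm built by researchers")]))
# [INST] <<SYS>>
# You are a helpful assistant.
# <</SYS>>
#
# Who built you? [/INST] I'm built by researchers
```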

@congchan
Contributor

Hi, I believe the masking for the Llama 2 tokenizer has been fixed in #3063.
Could you help to test it on your models and data?
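
For anyone reproducing the check, one way to eyeball the mask is to swap the ignored positions for the unk token and decode, then confirm the assistant answers survive intact and end with </s><s>. This is a sketch, assuming the preprocessing yields HuggingFace-style input_ids/labels tensors and that ignored positions use -100; the helper name show_mask is hypothetical:

```python
import torch

def show_mask(input_ids, labels, tokenizer, ignore_id=-100):
    # Replace ignored label positions with <unk> so the mask is visible after decoding.
    shown = torch.where(labels == ignore_id,
                        torch.tensor(tokenizer.unk_token_id), labels)
    print("input :", tokenizer.decode(input_ids))
    print("target:", tokenizer.decode(shown))
```
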

@Oscarjia
Author

> Hi, I believe the masking for the Llama 2 tokenizer has been fixed in #3063. Could you help to test it on your models and data?

Great, thank you for your great work! Let me try it on Llama 2 7B and I will tell you the results.

@Oscarjia
Copy link
Author

@congchan I tested it on Llama 2 7B, and I think the mask is now right; the mask also ends with </s><s>. Thank you!

@Oscarjia closed this Apr 3, 2024