
Conversation

glerzing
Contributor

It was simpler to solve both issues in the same PR.

peft is not completely stable; there are still a few bugs, especially on seq2seq (e.g. flan-t5-small does not work). LoRA looks pretty stable, though. I mostly relied on unit tests, but I will keep testing a few things in the coming days, and I may add a use-case example in a follow-up PR if you think that's useful. I removed `ref_model` because it didn't seem to be actually used.

Feel free to ask questions or make remarks, for example on the coding style; I don't know whether there are any guidelines. Also, if you want the automated tests to run faster, I can reduce the number of models or configs being tested.

@glerzing
Contributor Author

Some things I could do next; tell me if you agree:

  • See if I can make it possible to combine multiple methods (e.g. prefix tuning + LoRA)
  • Make it possible to retrain a peft fine-tuned model (currently trlx can create but cannot load a peft model)

@glerzing
Contributor Author

Is it OK if we save only the peft adapter and not the value head?

@LouisCastricato
Contributor

No, there are many situations where you would need to save the value head. Also, yes, we should have loading, especially for resuming from checkpoints.

I am going to mark this as a draft, as it seems not ready for review. In the interim, you should resolve the loading/saving issues and add more unit tests.

@LouisCastricato LouisCastricato marked this pull request as draft May 14, 2023 13:21
@glerzing
Contributor Author

> No, there are many situations where you would need to save the value head.

But then should we save the whole model instead of just the peft adapter? Or add a boolean option for whether to save the whole model or just the adapter? Or write some more sophisticated code to save just the peft adapter plus the value head? (I'm not sure how important it is to limit the size of the saved model.)
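For what it's worth, a minimal sketch of the third option could look like the following: filter the model's state dict down to just the adapter parameters and the value head before saving. This is only an illustration; the key substrings `"lora_"` and `"v_head."` are assumptions based on typical peft/trlx parameter naming, not trlx's actual implementation, and the fake state dict stands in for `model.state_dict()`.

```python
# Hypothetical sketch: keep only the peft adapter weights plus the value
# head when saving, instead of the full model state dict.
# The substrings "lora_" and "v_head." are assumed naming conventions.

def filter_checkpoint(state_dict, keep_substrings=("lora_", "v_head.")):
    """Return only the parameters whose names mention the adapter or value head."""
    return {
        name: tensor
        for name, tensor in state_dict.items()
        if any(sub in name for sub in keep_substrings)
    }

# A fake state dict standing in for model.state_dict() (values would be tensors).
full = {
    "base_model.layer.0.weight": [0.0],           # frozen base weight, dropped
    "base_model.layer.0.lora_A.weight": [0.1],    # LoRA matrix, kept
    "base_model.layer.0.lora_B.weight": [0.2],    # LoRA matrix, kept
    "v_head.summary.weight": [0.3],               # value head, kept
}
small = filter_checkpoint(full)
# `small` now holds only the LoRA matrices and the value head; the result
# could be passed to torch.save, keeping checkpoints a fraction of full size.
```

On load, the filtered dict would be applied on top of the reconstructed base model with something like `load_state_dict(small, strict=False)`, so the frozen base weights come from the pretrained checkpoint rather than the saved file.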

@glerzing glerzing force-pushed the peft_migration branch 7 times, most recently from 984001f to 3430029 Compare May 23, 2023 22:45