FEAT: Decoupled CLIP ratio (DAPO Trick-I) #285
Hi @ZiyiTsang, thank you for the contribution! Could you please format the code with pre-commit? Please refer to the code quality section in the contribution guide: https://inclusionai.github.io/AReaL/contrib.html

@ZiyiTsang Hi Ziyi, I don't know why this PR has grown so large... pre-commit should only format the files you changed, not every file in the repo. Did you trigger something that formats all the code in this repo? If possible, you can add me to your forked repo with write permissions. I'll push to this

FYI I've made these changes in the

This time it's finally OK... it took me a long time. Please check this version.
…pendent device meshes (#297)

* PullRequest: 805 fix: remove `set_gradient_sync` call in fsdp engine to reduce memory usage
  Merge branch fw/fix-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
  https://code.alipay.com/inclusionAI/AReaL/pull_requests/805
  Reviewed-by: 晓雷 <[email protected]>
* .
* PullRequest: 810 refactor: remove unused code inside the examples folder
  Merge branch fw/refactor-examples of [email protected]:inclusionAI/AReaL.git into gh
  https://code.alipay.com/inclusionAI/AReaL/pull_requests/810?tab=comment
  Reviewed-by: 晓雷 <[email protected]>
* FEAT: Decoupled CLIP ratio (DAPO Trick-I) (#285)
* Add agent-related logging logic in ppo actor & Update notebook example (#290)
* refactor examples
* PullRequest: 778 Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup
  Merge branch zwt/dp_head of [email protected]:inclusionAI/AReaL.git into gh
  https://code.alipay.com/inclusionAI/AReaL/pull_requests/778?tab=diff
  Reviewed-by: 博惟 <[email protected]>
* Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup
* WIP
* WIP
* WIP
* PullRequest: 814 [FIX] fix megatron grad sync for replicas
  Merge branch sxj/fix-megatron-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
  https://code.alipay.com/inclusionAI/AReaL/pull_requests/814
  Reviewed-by: 晓雷 <[email protected]>
* [FIX] fix megatron grad sync for replicas
* fix
* fix

---------

Co-authored-by: 巧林 <[email protected]>
Co-authored-by: 冰临 <[email protected]>
* FEAT: add CLIP_higher (DAPO Trick-I)
* Change default value for eps_clip_higher
* rewrite logic in functional (CLIP higher)
* try formatting
* try formatting
* modify formula
In this PR, I decouple the clip ratio used to compute the PPO loss, which is one of the tricks from DAPO.
To preserve backward compatibility, I did not change the `eps_clip` key. I only added `eps_clip_higher`; the upper clip bound is decoupled only when that key has a value. Please review.
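
For reviewers, here is a minimal sketch of the behaviour described above. It is not the code from this PR: the function names `clip_ratio` and `ppo_token_loss` are hypothetical, it works on plain floats rather than tensors, and it assumes `eps_clip_higher` is the epsilon for the upper bound (so the ratio is clipped to `[1 - eps_clip, 1 + eps_clip_higher]`), falling back to the symmetric `eps_clip` when it is unset.

```python
from typing import Optional


def clip_ratio(ratio: float, eps_clip: float,
               eps_clip_higher: Optional[float] = None) -> float:
    """Clip an importance ratio to [1 - eps_clip, 1 + upper_eps].

    Hypothetical helper: upper_eps is eps_clip_higher when it is set,
    otherwise eps_clip (the usual symmetric PPO clipping).
    """
    upper_eps = eps_clip if eps_clip_higher is None else eps_clip_higher
    lower, upper = 1.0 - eps_clip, 1.0 + upper_eps
    return min(max(ratio, lower), upper)


def ppo_token_loss(ratio: float, advantage: float, eps_clip: float,
                   eps_clip_higher: Optional[float] = None) -> float:
    """Per-token clipped surrogate loss: -min(r * A, clip(r) * A)."""
    unclipped = ratio * advantage
    clipped = clip_ratio(ratio, eps_clip, eps_clip_higher) * advantage
    return -min(unclipped, clipped)


if __name__ == "__main__":
    # Symmetric clipping: eps_clip_higher is left unset.
    print(ppo_token_loss(ratio=1.5, advantage=1.0, eps_clip=0.2))
    # Decoupled clipping: a larger upper bound for positive advantages.
    print(ppo_token_loss(ratio=1.5, advantage=1.0, eps_clip=0.2,
                         eps_clip_higher=0.28))
```

With a positive advantage, a larger upper bound lets the ratio rise further before the surrogate is clipped; the lower bound, which matters for negative advantages, still comes from `eps_clip`.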