@ZiyiTsang
Collaborator

In this PR, I decouple the clip ratio used to compute the PPO loss, which is one of the tricks from DAPO.

To preserve backward compatibility, I did not change the eps_clip key. I only added eps_clip_higher; the upper clip bound is decoupled from eps_clip only when eps_clip_higher has a value.

Please review.
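
For readers unfamiliar with the trick, here is a minimal sketch of decoupled clipping in plain Python. It is not the code in this PR: the function name `ppo_clip_loss` and its signature are hypothetical, and it assumes that `eps_clip_higher` replaces `eps_clip` only on the upper bound `1 + eps`, with `None` falling back to the symmetric PPO clip.

```python
import math
from typing import Optional, Sequence


def ppo_clip_loss(
    logprobs: Sequence[float],
    old_logprobs: Sequence[float],
    advantages: Sequence[float],
    eps_clip: float,
    eps_clip_higher: Optional[float] = None,  # assumed semantics: upper bound only
) -> float:
    """Mean clipped PPO policy loss over tokens (illustrative sketch)."""
    # Fall back to the symmetric clip range when eps_clip_higher is unset.
    upper_eps = eps_clip if eps_clip_higher is None else eps_clip_higher
    losses = []
    for lp, old_lp, adv in zip(logprobs, old_logprobs, advantages):
        # Importance ratio between the current and the behavior policy.
        ratio = math.exp(lp - old_lp)
        # Lower bound uses eps_clip, upper bound uses the decoupled value.
        clipped = min(max(ratio, 1.0 - eps_clip), 1.0 + upper_eps)
        # PPO maximizes the pessimistic surrogate, so the loss is its negation.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)


# A token whose probability rose by 25% with a positive advantage.
logp = [math.log(1.25)]
symmetric = ppo_clip_loss(logp, [0.0], [1.0], eps_clip=0.2)
decoupled = ppo_clip_loss(logp, [0.0], [1.0], eps_clip=0.2, eps_clip_higher=0.28)
```

With the symmetric range the ratio is clipped at 1.2, so the token stops contributing gradient; with the larger upper bound it stays inside the range, which is the effect the decoupled clip is meant to have on tokens with positive advantage.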

@garrett4wade
Collaborator

Hi @ZiyiTsang, thank you for the contribution! Could you please format the code with pre-commit? Please refer to the code quality section of the contribution guide: https://inclusionai.github.io/AReaL/contrib.html

@garrett4wade
Collaborator

@ZiyiTsang Hi Ziyi, I'm not sure why this PR has grown so large... pre-commit should format only the files you changed, not every file in the repo. Did you trigger something that reformats all the code in this repo? If possible, please add me to your forked repo with write permissions. I'll push to this dapo branch to revert the formatting changes and merge the latest main.

@garrett4wade
Collaborator

FYI, I've made these changes in the fw/dapo branch in this repo. You can check out the last commit before the formatting changes and then pull this branch.

@ZiyiTsang
Collaborator Author

It finally works this time... that took me a long time. Please check this version.

@garrett4wade garrett4wade merged commit 8ffa750 into inclusionAI:main Sep 3, 2025
1 of 4 checks passed
garrett4wade added a commit that referenced this pull request Sep 5, 2025
Merge branch fw/refactor-examples of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/810?tab=comment

Reviewed-by: 晓雷 <[email protected]>


* FEAT: Decoupled CLIP ratio (DAPO Trick-I) (#285)
* Add agent-related logging logic in ppo actor & Update notebook example (#290)
* refactor examples
garrett4wade added a commit that referenced this pull request Sep 5, 2025
…pendent device meshes (#297)

* PullRequest: 805 fix: remove `set_gradient_sync` call in fsdp engine to reduce memory usage

Merge branch fw/fix-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/805

Reviewed-by: 晓雷 <[email protected]>


* .

* PullRequest: 810 refactor: remove unused code inside the examples folder

Merge branch fw/refactor-examples of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/810?tab=comment

Reviewed-by: 晓雷 <[email protected]>


* FEAT: Decoupled CLIP ratio (DAPO Trick-I) (#285)
* Add agent-related logging logic in ppo actor & Update notebook example (#290)
* refactor examples

* PullRequest: 778 Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup

Merge branch zwt/dp_head of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/778?tab=diff

Reviewed-by: 博惟 <[email protected]>


* Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup
* WIP
* WIP
* WIP

* PullRequest: 814 [FIX] fix megatron grad sync for replicas

Merge branch sxj/fix-megatron-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/814

Reviewed-by: 晓雷 <[email protected]>


* [FIX] fix megatron grad sync for replicas
* fix
* fix

---------

Co-authored-by: 巧林 <[email protected]>
Co-authored-by: 冰临 <[email protected]>
@garrett4wade garrett4wade mentioned this pull request Sep 20, 2025
29 tasks
mjbmjb pushed a commit to mjbmjb/AReaL that referenced this pull request Sep 22, 2025
* FEAT: add CLIP_higher (DAPO Trick-I)

* Change default value for eps_clip_higher

* rewrite logic in functional (CLIP higher)

* try formatting

* try formatting

* modify formula
mjbmjb pushed a commit to mjbmjb/AReaL that referenced this pull request Sep 22, 2025
…pendent device meshes (inclusionAI#297)

* PullRequest: 805 fix: remove `set_gradient_sync` call in fsdp engine to reduce memory usage

Merge branch fw/fix-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/805

Reviewed-by: 晓雷 <[email protected]>


* .

* PullRequest: 810 refactor: remove unused code inside the examples folder

Merge branch fw/refactor-examples of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/810?tab=comment

Reviewed-by: 晓雷 <[email protected]>


* FEAT: Decoupled CLIP ratio (DAPO Trick-I) (inclusionAI#285)
* Add agent-related logging logic in ppo actor & Update notebook example (inclusionAI#290)
* refactor examples

* PullRequest: 778 Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup

Merge branch zwt/dp_head of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/778?tab=diff

Reviewed-by: 博惟 <[email protected]>


* Enhance FSDP and Ulysses integration with improved data parallel handling and cleanup
* WIP
* WIP
* WIP

* PullRequest: 814 [FIX] fix megatron grad sync for replicas

Merge branch sxj/fix-megatron-grad-sync of [email protected]:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/814

Reviewed-by: 晓雷 <[email protected]>


* [FIX] fix megatron grad sync for replicas
* fix
* fix

---------

Co-authored-by: 巧林 <[email protected]>
Co-authored-by: 冰临 <[email protected]>
2 participants