[trainer/algorithm] Implement DAPO and Polaris style dynamic sampling + add DAPO docs + example #130
Merged
Commits
54 commits
69fd7e0 fix bug (erictang000)
f59abaa remove fsdp from fsdp2 hf save model architecture (erictang000)
51119e5 merge (erictang000)
5659f9b x (erictang000)
fc9e355 thanks gemini (erictang000)
6810779 remove extra ray.shutdown (erictang000)
c4bde2a deepspeed + fsdp add configs to checkpoint folder (erictang000)
ac018fd Merge branch 'main' of https://github.com/erictang000/SkyRL into conf… (erictang000)
0e8facc pull to parent function for shared logic (erictang000)
9a865f5 x (erictang000)
7202d21 docs (erictang000)
4445e42 x (erictang000)
bec693e x (erictang000)
9b7c7d2 address gemini comments (erictang000)
119d9cd x (erictang000)
f32ffa9 Merge branch 'main' of https://github.com/erictang000/SkyRL (erictang000)
35db88e Merge branch 'config_checkpointing' of https://github.com/erictang000… (erictang000)
3cce025 Merge branch 'main' of https://github.com/erictang000/SkyRL (erictang000)
f5267b1 Merge branch 'main' of https://github.com/erictang000/SkyRL (erictang000)
e11db0a x (erictang000)
ad7b045 unit tests passing - need to test both e2e (erictang000)
5a909f5 x (erictang000)
615133d Merge branch 'main' of https://github.com/erictang000/SkyRL into dyna… (erictang000)
43edec4 x (erictang000)
19f5816 fixes (erictang000)
8419051 fixes (erictang000)
7bf7e54 x (erictang000)
2b2e326 Merge branch 'main' of https://github.com/erictang000/SkyRL into dyna… (erictang000)
40f26e8 fix weight manager logic (erictang000)
4822d52 x (erictang000)
5232569 thanks gemini (erictang000)
81a3819 x (erictang000)
3706f07 x (erictang000)
0c40fed x (erictang000)
13574c9 x (erictang000)
b9f03d7 x (erictang000)
bcd53eb Apply suggestions from code review (erictang000)
c063aee address comments (erictang000)
f0890d2 Merge branch 'dynamic_sampling' of https://github.com/erictang000/Sky… (erictang000)
46ddda9 fix tests (erictang000)
8118f55 add soft overlong punishment (erictang000)
3b2d007 x (erictang000)
a2ac205 thanks gemini (erictang000)
8fa7e42 x (erictang000)
0ba0d52 change to overriding trainer (erictang000)
1c24673 x (erictang000)
e8f7be4 x (erictang000)
87d6add x (erictang000)
5c91bcc x (erictang000)
eac1c77 x (erictang000)
c777948 add more docs for custom trainer (erictang000)
401fd72 add ref to dapo example (erictang000)
e7c5616 x (erictang000)
59bb246 thanks gemini (erictang000)