@EduardDurech (Collaborator) commented Sep 30, 2025

Abstracts the optimizer so it can be used with whatever module and method a user wants. This should be backwards compatible, since the default remains torch.optim.AdamW. Adds {actor_rollout_ref.actor,critic}.optim.{optimizer,optimizer_impl,override_optimizer_config}:

# Default
optimizer_impl: torch.optim
optimizer: AdamW
# Example
optimizer_impl: torchao.optim
optimizer: _AdamW
override_optimizer_config:
  bf16_stochastic_round: true

Important: the fsdp_sft_trainer optim config is now aligned with the FSDP optim config: optim.warmup_steps_ratio -> optim.lr_warmup_steps_ratio.
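For illustration, resolving such a config at runtime could look roughly like this. This is a minimal sketch, not the committed implementation: the helper name `build_optimizer` and the exact config keys passed to it are assumptions based on the description above.

```python
import importlib

import torch
import torch.nn as nn


def build_optimizer(params, optim_config):
    """Hypothetical sketch: resolve an optimizer class from config.

    Assumes optim_config carries:
      - "optimizer_impl": module path, e.g. "torch.optim" or "torchao.optim"
      - "optimizer": class name inside that module, e.g. "AdamW"
      - "lr" plus optional "override_optimizer_config" extra kwargs
    """
    module = importlib.import_module(optim_config["optimizer_impl"])
    optim_cls = getattr(module, optim_config["optimizer"])
    kwargs = {"lr": optim_config["lr"]}
    kwargs.update(optim_config.get("override_optimizer_config") or {})
    return optim_cls(params, **kwargs)


# Default case from the PR description: resolves to torch.optim.AdamW.
model = nn.Linear(4, 2)
opt = build_optimizer(
    model.parameters(),
    {"optimizer_impl": "torch.optim", "optimizer": "AdamW", "lr": 1e-3},
)
print(type(opt).__name__)  # AdamW
```

The torchao example in the config above would work the same way, with `optimizer_impl: torchao.optim`, `optimizer: _AdamW`, and `bf16_stochastic_round: true` forwarded through `override_optimizer_config`.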

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a flexible optimizer abstraction, allowing users to specify any optimizer via configuration. This is a great enhancement for modularity. My review focuses on the implementation of the new build_optimizer function. I've identified a critical issue where the argument handling for the dynamically loaded optimizer is not robust and can lead to runtime TypeError exceptions. My suggestion involves using Python's inspect module to build the arguments dictionary safely, ensuring only valid parameters are passed to the optimizer's constructor. This will make the implementation more generic and prevent unexpected crashes.
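The reviewer's suggestion could be sketched as follows. This is an illustrative example of the `inspect`-based approach, not the reviewer's actual patch: the helper name `filter_optimizer_kwargs` is hypothetical.

```python
import inspect

import torch


def filter_optimizer_kwargs(optim_cls, kwargs):
    """Hypothetical helper: keep only kwargs accepted by optim_cls.__init__.

    If the constructor declares **kwargs, everything is passed through;
    otherwise unknown keys are dropped instead of raising TypeError.
    """
    params = inspect.signature(optim_cls.__init__).parameters.values()
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return dict(kwargs)
    accepted = {p.name for p in params}
    return {k: v for k, v in kwargs.items() if k in accepted}


# torch.optim.SGD takes no "betas" argument, so it is silently dropped
# rather than crashing the dynamically built constructor call.
safe = filter_optimizer_kwargs(torch.optim.SGD, {"lr": 0.1, "betas": (0.9, 0.999)})
print(safe)  # {'lr': 0.1}
```

This makes a config written for one optimizer degrade gracefully when pointed at another, at the cost of silently ignoring misspelled keys, so logging the dropped keys would be a sensible addition.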

@EduardDurech EduardDurech marked this pull request as draft October 6, 2025 21:04
@EduardDurech EduardDurech marked this pull request as ready for review October 7, 2025 00:25
@wuxibin89 (Collaborator)

@EduardDurech Please resolve conflicts with main branch.

@EduardDurech (Collaborator, Author)

@wuxibin89 I overwrote PR #3692, since extra parameters should now be defined in {actor_rollout_ref.actor,critic}.optim.override_optimizer_config; I'll comment on that PR.

The CI failures are unrelated to this PR; tests passed at #cf4cc6a6c60b2a21b1765825b83158ae6bea101b:

cpu_unit_tests

huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

sgl

RuntimeError: Maybe you called ray.init twice by accident? This error can be suppressed by passing in 'ignore_reinit_error=True' or by calling 'ray.shutdown()' prior to 'ray.init()'.

e2e_ascend

File "/usr/local/python3.10.17/lib/python3.10/site-packages/requests/models.py", line 978, in json
(TaskRunner pid=213242)     raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
(TaskRunner pid=213242) requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
