🚀 Feature
Change the param_groups handling in the state dict so that it follows the default PyTorch assumptions more closely
https://pytorch.org/docs/stable/optim.html#torch.optim.Optimizer.state_dict
Motivation
- Some users may assume that fairscale/OSS respects the default PyTorch optimizer interface with respect to the state dict
- People familiar with PyTorch optimizers would have an easier learning curve when peeking into OSS
Pitch
Rewrite the exposed state dict so that it returns "state" and "param_groups" in accordance with PyTorch expectations, without duplication
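For reference, a minimal sketch of the state-dict layout that PyTorch's built-in optimizers produce and that OSS would mirror; the keys come from the linked documentation, while the model and optimizer below are only illustrative:

```python
import torch

# Illustrative model and optimizer; any torch.optim optimizer follows this contract.
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One forward/backward/step pass so that per-parameter state (momentum buffers) exists.
model(torch.randn(3, 4)).sum().backward()
opt.step()

sd = opt.state_dict()

# Per the linked docs, the state dict has exactly two top-level keys:
#   "state"        - per-parameter optimization state, keyed by parameter index
#   "param_groups" - list of dicts holding hyperparameters plus the "params" indices
assert set(sd.keys()) == {"state", "param_groups"}
print(sd["param_groups"])        # e.g. [{'lr': 0.1, 'momentum': 0.9, ..., 'params': [0, 1]}]
print(list(sd["state"].keys()))  # parameter indices, e.g. [0, 1]
```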
Alternatives
- rely on the Python/PyTorch memory model to remove duplicates in memory and while serializing
- add wrappers on the user side
Additional context