Support broadcast_buffers in OssDdp #68

@myleott

Description

🚀 Feature

We should add support for the broadcast_buffers flag to OssDdp.

Motivation

Distributed training with BatchNorm requires it: the running mean and variance are registered buffers that are updated locally on each worker, so without a periodic broadcast they drift apart across ranks. We removed it from the fairseq implementation because it slows things down a bit, but for the generalized implementation here we should add it back (as a configurable option).
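For reference, a minimal sketch of what this could look like inside OssDdp's forward pass. The `sync_buffers` helper, the `OssDdpSketch` wrapper, and the placement of the `broadcast_buffers` flag are illustrative assumptions, not the actual OssDdp internals; only `torch.distributed.broadcast` and `Module.buffers()` are real PyTorch APIs:

```python
import torch
import torch.distributed as dist


def sync_buffers(module: torch.nn.Module, src_rank: int = 0) -> None:
    # Broadcast every registered buffer (e.g. BatchNorm's running_mean and
    # running_var) from src_rank so all workers use identical statistics.
    for buf in module.buffers():
        dist.broadcast(buf, src=src_rank)


class OssDdpSketch(torch.nn.Module):
    # Hypothetical wrapper illustrating the flag; not the real OssDdp class.
    def __init__(self, module: torch.nn.Module, broadcast_buffers: bool = True):
        super().__init__()
        self.module = module
        self.broadcast_buffers = broadcast_buffers

    def forward(self, *inputs, **kwargs):
        # Mirror main DDP's behavior: sync buffers at the start of each
        # forward pass when the flag is enabled.
        if self.broadcast_buffers:
            sync_buffers(self.module)
        return self.module(*inputs, **kwargs)
```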

Additional context

See documentation for broadcast_buffers in the main DDP module: https://pytorch.org/docs/master/generated/torch.nn.parallel.DistributedDataParallel.html
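If we mirror the main DDP signature, enabling it would be a one-keyword change for users (hypothetical call, assuming OssDdp's other constructor arguments stay as-is and are omitted here):

```python
# Hypothetical usage, mirroring the broadcast_buffers keyword documented
# for torch.nn.parallel.DistributedDataParallel linked above.
model = OssDdp(my_module, broadcast_buffers=True)
```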
