Support broadcast_buffers in OssDdp #68

@myleott

Description

🚀 Feature

We should add support for the broadcast_buffers flag to OssDdp.

Motivation

Distributed training with BatchNorm requires it: the running mean and variance are registered buffers that are updated locally on each worker, so without a periodic broadcast they drift apart across ranks. We removed it from the fairseq implementation because it slows things down a bit, but for the generalized implementation here we should add it back (as a configurable option).
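For reference, a minimal sketch of what this could look like inside OssDdp's forward pass. The `sync_buffers` helper, the `OssDdpSketch` wrapper, and the placement of the `broadcast_buffers` flag are illustrative assumptions, not the actual OssDdp internals; only `torch.distributed.broadcast` and `Module.buffers()` are real PyTorch APIs:

```python
import torch
import torch.distributed as dist


def sync_buffers(module: torch.nn.Module, src_rank: int = 0) -> None:
    # Broadcast every registered buffer (e.g. BatchNorm's running_mean and
    # running_var) from src_rank so all workers use identical statistics.
    for buf in module.buffers():
        dist.broadcast(buf, src=src_rank)


class OssDdpSketch(torch.nn.Module):
    # Hypothetical wrapper illustrating the flag; not the real OssDdp class.
    def __init__(self, module: torch.nn.Module, broadcast_buffers: bool = True):
        super().__init__()
        self.module = module
        self.broadcast_buffers = broadcast_buffers

    def forward(self, *inputs, **kwargs):
        # Mirror main DDP's behavior: sync buffers at the start of each
        # forward pass when the flag is enabled.
        if self.broadcast_buffers:
            sync_buffers(self.module)
        return self.module(*inputs, **kwargs)
```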

Additional context

See documentation for broadcast_buffers in the main DDP module: https://pytorch.org/docs/master/generated/torch.nn.parallel.DistributedDataParallel.html
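If we mirror the main DDP signature, enabling it would be a one-keyword change for users (hypothetical call, assuming OssDdp's other constructor arguments stay as-is and are omitted here):

```python
# Hypothetical usage, mirroring the broadcast_buffers keyword documented
# for torch.nn.parallel.DistributedDataParallel linked above.
model = OssDdp(my_module, broadcast_buffers=True)
```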
