This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Parameter fusion support in Gluon #18077

@leezu

Description

The parameters declared by a Block in Gluon often don't exactly match the format used by operators in the backend. As a result, there are Blocks in which some parameters are concatenated on every forward pass.

A naive approach is to refactor the respective Gluon Blocks so that they declare the concatenated version of the parameter directly. This does not work in all cases, because different parameters may need to be initialized differently. For example, RNN biases should be initialized differently from RNN weights.

The status quo, in which concatenation / fusion has to happen on every forward pass in such cases, is not acceptable either.

Proposed solution: Introduce Block.fuse() and Block.unfuse() APIs. By default, both are no-ops. Users can override fuse and unfuse to declare how the Block's parameters are fused into a new set of parameters (or a single parameter). fuse is called after infer_shape and before the first forward pass.
export will require fused parameters. Before save_parameters or load_parameters, the Block is unfused.
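
The proposed life cycle could look roughly like the sketch below. It is a pure-Python toy, not MXNet code: ToyBlock, ToyRNNCell, the _is_fused flag, the fuse_calls counter, and the use of plain lists as parameters are all assumptions made for illustration. Only the names fuse, unfuse, and save_parameters come from the proposal.

```python
# Toy stand-in for a Gluon Block. Everything here is hypothetical and only
# illustrates the proposed fuse()/unfuse() life cycle; it is not the MXNet API.
class ToyBlock:
    def __init__(self):
        self.params = {}        # parameters, stored as plain lists of floats
        self._is_fused = False  # hypothetical bookkeeping flag

    def fuse(self):
        """No-op by default, as in the proposal."""

    def unfuse(self):
        """No-op by default, as in the proposal."""

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        # Fuse once, before the first forward pass (shape inference is
        # assumed to have happened already in this toy).
        if not self._is_fused:
            self.fuse()
            self._is_fused = True
        return self.forward(x)

    def save_parameters(self):
        # Saving always works on the unfused layout.
        if self._is_fused:
            self.unfuse()
            self._is_fused = False
        return {name: list(value) for name, value in self.params.items()}


class ToyRNNCell(ToyBlock):
    def __init__(self):
        super().__init__()
        # Declared separately so each can get its own initialization.
        self.params["weight"] = [0.5, -0.5]  # stands in for a random init
        self.params["bias"] = [0.0]          # stands in for a zero init
        self.fuse_calls = 0                  # counts how often fusion runs

    def fuse(self):
        self.fuse_calls += 1
        fused = self.params["weight"] + self.params["bias"]  # concatenate once
        self.params = {"fused": fused}

    def unfuse(self):
        flat = self.params["fused"]
        self.params = {"weight": flat[:2], "bias": flat[2:]}

    def forward(self, x):
        # The "backend operator" consumes the fused layout directly.
        w0, w1, b = self.params["fused"]
        return w0 * x[0] + w1 * x[1] + b


cell = ToyRNNCell()
cell([2.0, 1.0])
cell([2.0, 1.0])
print(cell.fuse_calls)         # fusion ran once for two forward passes
print(cell.save_parameters())  # parameters come back in the unfused layout
```

In this sketch, saving unfuses the Block, so the next forward pass fuses again. Whether the real implementation should re-fuse lazily like this or eagerly after saving is a design choice the proposal leaves open.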
