This repository was archived by the owner on Nov 17, 2023. It is now read-only.
It's common that the parameters declared by a Block in Gluon don't exactly match the format used by operators in the backend. As a result, there are cases where some parameters are concatenated on every forward pass.
A naive approach is to refactor the respective Gluon Blocks to declare the concatenated version of the parameter. This does not work in all cases, because we want to initialize different parameters differently. For example, RNN biases should be initialized differently from RNN weights.
The status quo, where concatenation / fusion has to happen on every forward pass in such cases, is not acceptable either.
Proposed solution: Introduce Block.fuse() and Block.unfuse() APIs. By default, both are no-ops. Users can override fuse and unfuse to declare how the Block's parameters are fused into a new set of parameters (or a single parameter). fuse is called prior to the first forward pass, after infer_shape. export will require fused parameters. Prior to save_parameters or load_parameters, the Block is unfused.
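To make the proposed lifecycle concrete, here is a minimal pure-Python sketch. It does not use MXNet: the Block class below is a toy stand-in, and ToyRNN, the parameter names, the fuse_calls counter and the list-based parameters are all assumptions made for illustration. infer_shape, load_parameters and export are omitted.

```python
class Block:
    """Toy stand-in for a Gluon Block (hypothetical, not the MXNet class)."""

    def __init__(self):
        self.params = {}
        self._fused = False

    def fuse(self):
        """No-op by default, as in the proposal."""

    def unfuse(self):
        """No-op by default, as in the proposal."""

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        # Fuse once before the first forward pass instead of on every call.
        if not self._fused:
            self.fuse()
            self._fused = True
        return self.forward(x)

    def save_parameters(self):
        # The Block is unfused before its parameters are saved.
        if self._fused:
            self.unfuse()
            self._fused = False
        return {name: list(values) for name, values in self.params.items()}


class ToyRNN(Block):
    """Declares weight and bias separately so each can be initialized differently."""

    def __init__(self):
        super().__init__()
        self.params = {"i2h_weight": [1.0, 2.0], "i2h_bias": [0.5]}
        self.fuse_calls = 0  # illustration only: counts how often fusion runs

    def fuse(self):
        self.fuse_calls += 1
        # Concatenate into the single flat layout the toy "backend" expects.
        flat = self.params["i2h_weight"] + self.params["i2h_bias"]
        self.params = {"fused": flat}

    def unfuse(self):
        flat = self.params["fused"]
        self.params = {"i2h_weight": flat[:2], "i2h_bias": flat[2:]}

    def forward(self, x):
        w0, w1, b = self.params["fused"]
        return w0 * x[0] + w1 * x[1] + b


rnn = ToyRNN()
rnn([1.0, 2.0])
rnn([3.0, 4.0])
print(rnn.fuse_calls)                  # fusion ran once for two forward passes
print(sorted(rnn.save_parameters()))   # saved under the original, unfused names
```

In this sketch a forward pass after save_parameters fuses again, because saving leaves the Block unfused. Whether a real implementation should re-fuse lazily like this or eagerly is a design choice the proposal leaves open.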