Background

Some existing API docs in the framework use the @templatedoc decorator to auto-generate docstrings by extracting parameter descriptions from the OP definition, for example row_conv:
Paddle/python/paddle/static/nn/common.py
Lines 3337 to 3383 in 08c0424
```python
@templatedoc()
def row_conv(input, future_context_size, param_attr=None, act=None):
    """
    :api_attr: Static Graph

    ${comment}

    Args:
        input (${x_type}): ${x_comment}.
        future_context_size (int): Future context size. Please note, the shape
            of convolution kernel is [future_context_size + 1, D].
        param_attr (ParamAttr): Attributes of parameters, including
            name, initializer etc.
        act (str): Non-linear activation to be applied to output variable.

    Returns:
        ${out_comment}.

    Examples:
        .. code-block:: python

            >>> # for LodTensor inputs
            >>> import paddle
            >>> paddle.enable_static()
            >>> x = paddle.static.data(name='x', shape=[9, 16],
            ...                        dtype='float32', lod_level=1)
            >>> out_x = paddle.static.nn.row_conv(input=x, future_context_size=2)
            >>> # for Tensor inputs
            >>> y = paddle.static.data(name='y', shape=[9, 4, 16], dtype='float32')
            >>> out_y = paddle.static.nn.row_conv(input=y, future_context_size=2)
    """
    helper = LayerHelper('row_conv', **locals())
    check_variable_and_dtype(input, 'input', ['float32'], 'row_conv')
    dtype = helper.input_dtype()
    filter_shape = [future_context_size + 1, input.shape[-1]]
    filter_param = helper.create_parameter(
        attr=helper.param_attr, shape=filter_shape, dtype=dtype
    )
    out = helper.create_variable_for_type_inference(dtype)
    helper.append_op(
        type='row_conv',
        inputs={'X': [input], 'Filter': [filter_param]},
        outputs={'Out': [out]},
    )
    return helper.append_activation(out)
```
Clearly, its template placeholders (such as ${comment}, ${x_type}, and ${x_comment}) are filled in from the parameter descriptions in the OpMaker:
Paddle/paddle/fluid/operators/row_conv_op.cc
Lines 77 to 135 in 08c0424
```cpp
class RowConvOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  void Make() override {
    AddInput("X",
             "the input(X) is a LodTensor or tensor, LodTensor(X) supports "
             "variable time-length input sequences. The underlying tensor "
             "in this phi::DenseTensor is a matrix with shape (T x N), where T "
             "is the total time steps in this mini-batch and N is the input "
             "data dimension. the shape of Tensor input(X) has shape "
             "(B x T x N), B is batch size;");
    AddInput("Filter",
             "the input(Filter) is a learnable parameter. It "
             "is a 2-D tensor with shape (future_context x N), where, "
             "future_context is the future context length and N is the data "
             "dimension.");
    AddOutput("Out",
              "the output(Out) is a LodTensor or Tensor, which has same type"
              " and same shape as X.");
    AddComment(R"DOC(
:strong:`Row-convolution operator`

The row convolution is called lookahead convolution. This operator was
introduced in the following paper for DeepSpeech2:
http://www.cs.cmu.edu/~dyogatam/papers/wang+etal.iclrworkshop2016.pdf

The main motivation is that a bidirectional RNN, useful in DeepSpeech
like speech models, learns representation for a sequence by performing a
forward and a backward pass through the entire sequence. However, unlike
unidirectional RNNs, bidirectional RNNs are challenging to deploy in an online
and low-latency setting. The lookahead convolution incorporates information
from future subsequences in a computationally efficient manner to improve
unidirectional recurrent neural networks. The row convolution operator is
different from the 1D sequence convolution, and is computed as follows:

Given an input sequence $X$ of length $t$ and input dimension $D$,
and a filter ($W$) of size $context \times D$,
the output sequence is convolved as:

$$
out_{i} = \\sum_{j=i}^{i + context - 1} X_{j} \\cdot W_{j-i}
$$

In the above equation:

* $Out_{i}$: The i-th row of output variable with shape [1, D].
* $context$: Future context size.
* $X_{j}$: The j-th row of input variable with shape [1, D].
* $W_{j-i}$: The (j-i)-th row of parameters with shape [1, D].

More details about row_conv please refer to
the design document
https://github.com/PaddlePaddle/Paddle/issues/2228#issuecomment-303903645 .
)DOC");
  }
};
```
Presumably, the original design aimed to avoid writing parameter descriptions twice. However, this makes documentation less intuitive to write, and IDEs cannot obtain the complete docstring through static analysis, which hurts the development experience. Since the OpMaker will be cleaned up in the future, we need to migrate this documentation into the docstrings.
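To make the mechanism concrete, here is a minimal, hypothetical sketch of how a templatedoc-style decorator substitutes placeholders. The OP_COMMENTS dictionary below is invented for illustration; the real decorator in layer_function_generator.py pulls these strings from the C++ OpProto registered by the OpMaker:

```python
# Hypothetical stand-in for the OpProto registry; the real templatedoc reads
# these strings from the C++ OpMaker definitions at import time.
OP_COMMENTS = {
    "row_conv": {
        "comment": "Row-convolution (lookahead convolution) operator.",
        "x_type": "Tensor",
        "x_comment": "the input(X) is a LodTensor or tensor",
        "out_comment": "the output(Out) has the same type and shape as X",
    },
}


def templatedoc(op_type=None):
    """Fill ${...} placeholders in a function's docstring from OP_COMMENTS."""

    def decorator(func):
        name = op_type or func.__name__
        doc = func.__doc__ or ""
        for key, value in OP_COMMENTS.get(name, {}).items():
            doc = doc.replace("${%s}" % key, value)
        func.__doc__ = doc
        return func

    return decorator


@templatedoc()
def row_conv(input, future_context_size):
    """${comment}

    Args:
        input (${x_type}): ${x_comment}.

    Returns:
        ${out_comment}.
    """
```

Because the substitution only happens when the module is executed, a static analyzer that merely reads the source sees the raw ${...} template, which is exactly the development-experience problem described above.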
In addition, for historical reasons (fluid, etc.), the framework contains two copies of layer_function_generator.py (the file that defines templatedoc). We therefore want to remove unused functions and files wherever possible to keep the framework codebase clean.
Goals
- Remove all remaining uses of the templatedoc decorator by copying the parameter descriptions from the OP definitions into the existing docstrings (no need to clean up the OpMaker side; just copy the text into the docstrings). After this step, the framework should contain no uses of templatedoc.
- Remove the templatedoc function definition and other unused functions from layer_function_generator.py (both python/paddle/base/layers/layer_function_generator.py and python/paddle/tensor/layer_function_generator.py).
- Clean up the remaining functions in layer_function_generator.py (such as add_sample_code) and see whether the files can be removed entirely (stretch goal).
Tip
Split PRs as finely as possible; ideally each PR does exactly one thing~
Benefits

- Keeps the framework codebase clean, with no redundant code
- Ensures that cleaning up the OpMaker will not break or lose API documentation
- Improves the development experience: IDEs can obtain the complete docstring via static analysis