block design #3708
Changes from 34 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
# Design Doc: Use Block in RNNOp, WhileOp, IfElseOp | ||
|
||
In programming languages such as C++ and Java, a block is a lexical structure of source code that groups several statements so that they are treated as a single statement. | ||
|
||
RNNOp looks like a loop structure in programming languages. | ||
Similarly, WhileOp and IfElseOp are like loops and conditionals, respectively. | ||
So we want to verify whether PaddlePaddle should have a class Block that works like the pair of curly braces in the loop and conditional structures of programming languages. | ||
|
||
Blocks do not only group source code, but also narrow the lexical scope of variables so that they do not conflict with variables having the same name used elsewhere in a program. | ||
|
||
In Paddle, we need a similar concept called Block to support the following scenarios: | ||
|
||
- define a PaddlePaddle program by writing blocks of code, which include the definitions of variables and operators. | ||
- `RNNOp`, `SwitchOp`, `WhileOp`, `IfElseOp`, etc. need Block to define their sub-blocks. | ||
- to execute multiple operators together, a block should group operators and run like a single operator. | ||
|
||
Review comment: This statement is not reasonable. We already differentiate the static phase and dynamic phase. In the words declared above,
||
## How to use Block | ||
In `RNNOp`, `SwitchOp`, `WhileOp` and `IfElseOp`, a with-statement should be used to define the sub-block. | ||
|
||
Let's start with how an RNNOp is described using Block: | ||
|
||
```python | ||
v = some_op() | ||
m_boot = some_op() | ||
|
||
W = pd.Variable(shape=[20, 20]) | ||
U = pd.Variable(shape=[20, 20]) | ||
|
||
rnn = create_rnn() | ||
|
||
with rnn.stepnet() as net: | ||
    # declare the input variables that need to be segmented into steps | ||
    x = net.set_inputs(v) | ||
    # declare rnn's memory (state) | ||
    h = net.add_memory(init=m_boot) | ||
|
||
    fc_out = pd.matmul(W, x) | ||
    hidden_out = pd.matmul(U, h.pre(n=1)) | ||
    sum = pd.add_two(fc_out, hidden_out) | ||
|
||
    act = pd.sigmoid(sum) | ||
    h.update(act)  # update memory | ||
|
||
    # declare outputs that need to be merged across all the steps | ||
    net.set_outputs(act, hidden_out) | ||
|
||
acts, hs = rnn() | ||
``` | ||
|
||
The with-statement above describes an `RNNOp`'s stepnet as a block. This description will be transformed into a protobuf message as follows | ||
|
||
``` | ||
BlockDesc RNNOp_stepnet { | ||
vars = { | ||
x {...} | ||
h {...} | ||
fc_out {...} | ||
hidden_out {...} | ||
sum {...} | ||
act {...} | ||
} | ||
|
||
ops = { | ||
matmul, | ||
add_two, | ||
sigmoid | ||
} | ||
}; | ||
|
||
RNNOpDesc rnn { | ||
inputs = {x}; | ||
outputs = {act, hidden_out}; | ||
attrs { memories={h} }; | ||
stepnet {RNNOp_stepnet}; | ||
}; | ||
``` | ||
|
||
and passed to a C++ Block, which will create the Variables and Operators. | ||
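The recording step implied by the with-statement can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the proposed implementation: `Program`, `sub_block`, `append_op`, and the list-based `BlockDesc` below are hypothetical stand-ins for the real protobuf messages and Python API.

```python
from contextlib import contextmanager


class BlockDesc:
    """Hypothetical stand-in for the protobuf BlockDesc: plain name lists."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.vars = []
        self.ops = []


class Program:
    """Hypothetical helper that tracks which block is currently being defined."""

    def __init__(self):
        self.root = BlockDesc("root")
        self.current = self.root

    @contextmanager
    def sub_block(self, name):
        # Entering the with-statement opens a child of the current block.
        block = BlockDesc(name, parent=self.current)
        self.current = block
        try:
            yield block
        finally:
            # Leaving the with-statement restores the enclosing block.
            self.current = block.parent

    def append_op(self, op_type, inputs, output):
        # Every op and its output variable are recorded in the current block.
        self.current.ops.append(op_type)
        self.current.vars.append(output)
        return output


program = Program()
v = program.append_op("some_op", [], "v")
with program.sub_block("RNNOp_stepnet") as net:
    fc_out = program.append_op("matmul", ["W", "x"], "fc_out")
    act = program.append_op("sigmoid", [fc_out], "act")
after = program.append_op("some_op", [act], "after")
```

Operators created inside the with-statement land in the sub-block's description, while operators created before and after it land in the enclosing block.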
|
||
|
||
## Block Implementation | ||
|
||
During the generation of the Protobuf message, the Block should store VarDesc (the Protobuf message which describes Variable) and OpDesc (the Protobuf message which describes Operator). | ||
|
||
Each VarDesc in a block should live in that block's own name scope, so that local variables do not affect the parent block's name scope. | ||
A child block's name scope should inherit from its parent's, so that an OpDesc in the child block can reference a VarDesc stored in the parent block. For example: | ||
|
||
```python | ||
a = pd.Variable(shape=[20, 20]) | ||
b = pd.fc(a, params=["fc.w", "fc.b"]) | ||
# Review comment: I do not think we should passing weight and bias to Maybe adding a
||
|
||
rnn = pd.create_rnn() | ||
with rnn.stepnet() as net: | ||
    x = net.set_inputs(a) | ||
    # reuse fc's parameter | ||
    fc_without_b = pd.get_variable("fc.w") | ||
    net.set_outputs(fc_without_b) | ||
|
||
out = rnn() | ||
``` | ||
The method `pd.get_variable` retrieves a Variable by name. A Variable may be stored in a parent block but retrieved in a child block, so a block should have a variable scope that supports inheritance. | ||
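Scope inheritance of this kind can be illustrated with Python's standard `collections.ChainMap`. This is only a sketch of the lookup rule: the string values stand in for VarDesc messages, and the `get_variable` helper here is a hypothetical illustration, not the proposed `pd.get_variable`.

```python
from collections import ChainMap

# Parent block scope; the strings are placeholders for VarDesc messages.
parent_vars = ChainMap({
    "a": "VarDesc(a)",
    "fc.w": "VarDesc(fc.w)",
    "fc.b": "VarDesc(fc.b)",
})

# The child scope inherits from the parent; writes go to the child only.
child_vars = parent_vars.new_child()
child_vars["x"] = "VarDesc(x)"
child_vars["a"] = "VarDesc(local a)"  # shadows the parent's "a"


def get_variable(scope, name):
    """Look up a name in a scope, falling back to its parents."""
    try:
        return scope[name]
    except KeyError:
        raise KeyError(
            f"variable {name!r} is not declared in this block or its parents"
        ) from None


found = get_variable(child_vars, "fc.w")
```

The child block sees the parent's `fc.w`, while `x` and the local `a` stay invisible to the parent.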
|
||
In compiler design, the symbol table is a data structure created and maintained by compilers in order to store information about the occurrence of various entities such as variable names, function names, classes, etc. | ||
|
||
To store the definitions of Variables and Operators, a C++ class `SymbolTable` is introduced, analogous to a compiler's symbol table. | ||
|
||
`SymbolTable` will have the following functions: | ||
|
||
- store the definitions (names and attributes) of variables and operators, | ||
- verify whether a variable name has been declared, | ||
- make it possible to implement type checking (by offering Protobuf message pointers to `InferShape` handlers). | ||
|
||
|
||
```c++ | ||
// The information in SymbolTable is enough to trace the dependency graph, so it | ||
// may be enough for the Eval() interface to take a SymbolTable. | ||
class SymbolTable { | ||
public: | ||
SymbolTable(SymbolTable* parent) : parent_(parent) {} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Scope has parent, SymbolTable stores VarDesc, and VarDesc may exists in SymbolTable's parent. |
||
|
||
OpDesc* NewOp(const string& name=""); | ||
|
||
// TODO determine whether the name is generated by Python or C++. | ||
// Currently assume that C++ generates a unique name if the | ||
// name argument is left at its default. | ||
VarDesc* NewVar(const string& name=""); | ||
|
||
// Find a VarDesc by name. If recursive is true, search the parent | ||
// SymbolTables recursively. | ||
// This interface is introduced to support InferShape: it finds the protobuf | ||
// messages of variables and operators and passes pointers to them into | ||
// InferShape. | ||
// | ||
// NOTE maybe some C++ classes such as VarDescBuilder and OpDescBuilder should | ||
// be proposed and embedded into pybind to enable Python to operate on C++ pointers. | ||
VarDesc* FindVar(const string& name, bool recursive=true); | ||
|
||
OpDesc* FindOp(const string& name); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Variables in parent block may be referenced by child block, but op won't. So variables need a recursive |
||
|
||
BlockDesc Compile() const; | ||
|
||
private: | ||
SymbolTable* parent_; | ||
|
||
map<string, OpDesc> ops_; | ||
map<string, VarDesc> vars_; | ||
}; | ||
``` | ||
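The lookup behavior of the interface above can be sketched in Python. This is a rough illustration under assumptions, not the C++ implementation: descriptions are plain dicts instead of Protobuf messages, and the `var_N` / `op_N` naming scheme for default names is invented here.

```python
import itertools


class SymbolTable:
    """Python sketch of the proposed C++ SymbolTable (dicts replace protobufs)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}
        self.ops = {}
        self._counter = itertools.count()

    def _unique(self, prefix, taken):
        # Assumed naming scheme: generate prefix_N until the name is unused.
        while True:
            name = f"{prefix}_{next(self._counter)}"
            if name not in taken:
                return name

    def new_var(self, name=""):
        name = name or self._unique("var", self.vars)
        if name in self.vars:
            raise ValueError(f"variable {name!r} is already declared")
        desc = {"name": name}
        self.vars[name] = desc
        return desc

    def new_op(self, name=""):
        name = name or self._unique("op", self.ops)
        if name in self.ops:
            raise ValueError(f"operator {name!r} is already declared")
        desc = {"name": name}
        self.ops[name] = desc
        return desc

    def find_var(self, name, recursive=True):
        # Walk up the parent chain only when recursive lookup is requested.
        table = self
        while table is not None:
            if name in table.vars:
                return table.vars[name]
            table = table.parent if recursive else None
        return None

    def find_op(self, name):
        # Operators are never looked up in parent tables.
        return self.ops.get(name)


root = SymbolTable()
w = root.new_var("fc.w")
child = SymbolTable(parent=root)
x = child.new_var("x")
tmp1 = child.new_var()
tmp2 = child.new_var()
```

A child table finds `fc.w` through its parent by default, and `find_var("fc.w", recursive=False)` restricts the search to the child itself.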
|
||
After all the descriptions of variables and operators are added into SymbolTable, | ||
the block has enough information to run. | ||
|
||
The `Block` class takes a `BlockDesc` as input, and provides `Run` and `InferShape` functions. | ||
|
||
|
||
```c++ | ||
namespace { | ||
|
||
class Block : public OperatorBase { | ||
public: | ||
Block(const BlockDesc& desc) : desc_(desc) {} | ||
|
||
void InferShape(const framework::Scope& scope) const override { | ||
// Review comment: What does
||
if (!symbols_ready_) { | ||
CreateVariables(scope); | ||
CreateOperators(); | ||
} | ||
// should run InferShape first. | ||
for (auto& op : runtime_table_.ops()) { | ||
op->InferShape(scope); | ||
} | ||
} | ||
|
||
void Run(const framework::Scope& scope, | ||
// Review comment: To harness multiple CPU cores, we need a scheduler to run the ops (ops that do not depend on each other should be schedulable concurrently on different threads in a thread pool). However, making
// Reply: We may add a
||
const platform::DeviceContext& dev_ctx) const override { | ||
// Review comment: What happens if ops from a block need to run on different device contexts (e.g., one op can run on CPU only, while other ops must run on GPU)?
// Reply: This is a historical issue, Block inherits from
// Reply: I think multi-device is a very important feature and we should take it into consideration in the current design. Also, what is the relationship between the two concepts, Block and Graph? If Block is a Graph, then the private members of Block should be Nodes and Edges. The Graph mainly describes data dependency and control dependency, and it will be run by an Executor, which will support multi-device execution.
// Reply: Better to discuss this in another issue.
||
PADDLE_ENFORCE(symbols_ready_, "operators and variables should be created first."); | ||
for (auto& op : runtime_table_.ops()) { | ||
op->Run(scope, dev_ctx); | ||
} | ||
} | ||
|
||
void CreateVariables(const framework::Scope& scope); | ||
void CreateOperators(); | ||
|
||
// some other necessary interfaces of NetOp are listed below | ||
// ... | ||
|
||
private: | ||
BlockDesc desc_; | ||
bool symbols_ready_{false}; | ||
}; | ||
}  // namespace | ||
``` | ||
|
||
## Run and Eval targets | ||
Block inherits from OperatorBase, which has a Run method. | ||
Block's Run method will run its operators sequentially. | ||
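Sequential execution, with a block behaving like a single operator, can be sketched as follows. This is a minimal sketch: `Op`, the dict-based scope, and the lambdas are hypothetical stand-ins for real operators and `framework::Scope`.

```python
class Op:
    """Hypothetical operator: reads named inputs from a scope, writes one output."""

    def __init__(self, fn, inputs, output):
        self.fn = fn
        self.inputs = inputs
        self.output = output

    def run(self, scope):
        scope[self.output] = self.fn(*(scope[name] for name in self.inputs))


class Block:
    """Groups operators and exposes the same run() interface as a single Op."""

    def __init__(self, ops):
        self.ops = list(ops)

    def run(self, scope):
        # Run the contained operators one after another, in declaration order.
        for op in self.ops:
            op.run(scope)


# Because Block has the same interface as Op, blocks can be nested.
inner = Block([Op(lambda a, b: a + b, ["x", "y"], "s")])
outer = Block([inner, Op(lambda s: s * 2, ["s"], "out")])

scope = {"x": 1, "y": 2}
outer.run(scope)
```

After `outer.run(scope)`, the scope holds both the intermediate `s` written by the inner block and the final `out`.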
|
||
There is another important interface called `Eval`. It takes some arguments called targets, generates a minimal graph that has the targets as its end points, and creates a new Block from that graph. After `Run`, `Eval` gets the latest values and returns the targets. | ||
Review comment: Can we unify
Reply: Eval will create a new block and run it, so splitting Eval and Run seems better.
||
|
||
The definition of Eval is as follows: | ||
|
||
```c++ | ||
// Clean a block description by targets using the corresponding dependency graph. | ||
// Return a new BlockDesc with a minimal number of operators. | ||
// NOTE this returns the block's description rather than a Block, so that it can be distributed | ||
// to a cluster. | ||
BlockDesc Prune(const BlockDesc& desc, vector<string> targets); | ||
|
||
void Block::Eval(const vector<string>& targets, | ||
const framework::Scope& scope, | ||
const platform::DeviceContext& dev_ctx) { | ||
BlockDesc min_desc = Prune(desc_, targets); | ||
Block min_block(min_desc); | ||
min_block.Run(scope, dev_ctx); | ||
} | ||
``` |
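One possible way to realize `Prune` is a single backward sweep over the operator list. The Python sketch below assumes that operators are stored in execution order and that each one is a hypothetical `(name, inputs, outputs)` tuple; the actual BlockDesc layout is not specified in this document.

```python
def prune(ops, targets):
    """Keep only the ops needed to compute the targets.

    ops is a list of (name, inputs, outputs) tuples in execution order.
    """
    needed = set(targets)
    kept = []
    # Walk backwards: an op is kept if it produces something still needed,
    # and its inputs then become needed as well.
    for name, inputs, outputs in reversed(ops):
        if needed.intersection(outputs):
            kept.append((name, inputs, outputs))
            needed.update(inputs)
    kept.reverse()
    return kept


# The stepnet from the RNNOp example, written as dependency tuples.
stepnet_ops = [
    ("matmul_fc", ["W", "x"], ["fc_out"]),
    ("matmul_hidden", ["U", "h_pre"], ["hidden_out"]),
    ("add_two", ["fc_out", "hidden_out"], ["sum"]),
    ("sigmoid", ["sum"], ["act"]),
]

only_hidden = prune(stepnet_ops, ["hidden_out"])
full = prune(stepnet_ops, ["act"])
```

Asking for `hidden_out` keeps only the second matmul, while asking for `act` keeps all four operators in their original order.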
Review comment: Let's start from how RNN is described using PaddlePaddle: