Merged

71 commits
b1b4364
Rename PlainNet --> NetOp
reyoung Jul 26, 2017
ecf23ce
Update Backward
reyoung Jul 26, 2017
b1b13f8
Update Interface
reyoung Jul 26, 2017
00615eb
Refine OpRegistry::AddInput/AddOutput
reyoung Jul 26, 2017
a2dc961
Add fill_zeros_like op
JiayiFeng Jul 26, 2017
e32e306
Develop backward building precess of single op
JiayiFeng Jul 26, 2017
831d4e1
Refining Unittest
reyoung Jul 26, 2017
f77c63b
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 26, 2017
fa7cbfd
"backward is NetOp"
dzhwinter Jul 26, 2017
0ac79a3
Merge remote-tracking branch 'reyoung/feature/backward' into feature/…
dzhwinter Jul 26, 2017
292f2ab
"split to generic add PR"
dzhwinter Jul 26, 2017
05d9aff
Stash
reyoung Jul 27, 2017
fa6a46a
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 27, 2017
03f418c
Fix compile error
JiayiFeng Jul 27, 2017
5297bcb
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 27, 2017
9475972
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 27, 2017
f9fab14
Fix compile error
reyoung Jul 27, 2017
3d18737
Add unittest for part_of_output_are_not_need
reyoung Jul 27, 2017
70bd07a
Fix compile errors of FillZerosLikeOp
JiayiFeng Jul 27, 2017
63636d6
Stash for canpio
reyoung Jul 27, 2017
04db418
Add unitest of Backward.part_of_input_are_not_need
JiayiFeng Jul 27, 2017
28c0281
Stash
reyoung Jul 27, 2017
099bb53
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 27, 2017
3dd5fd0
Add unitest of Backward.intermediate_variable_not_need_in_linear_net
JiayiFeng Jul 27, 2017
84198f7
Add unittest
reyoung Jul 27, 2017
4461f3c
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 27, 2017
b1d8419
rename test
JiayiFeng Jul 27, 2017
d2583bd
InsertOp for NetOp
reyoung Jul 27, 2017
b9f2bb3
"wait add generic"
dzhwinter Jul 27, 2017
5713266
Merge remote-tracking branch 'reyoung/feature/backward' into feature/…
dzhwinter Jul 27, 2017
d4ab70a
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 27, 2017
a0669ea
Merge remote-tracking branch 'reyoung/feature/backward' into feature/…
dzhwinter Jul 27, 2017
7088654
"add duplicate"
dzhwinter Jul 27, 2017
404cc05
"reverse travesal"
dzhwinter Jul 27, 2017
65d2678
"add simple net test"
dzhwinter Jul 28, 2017
46d766e
Merge branch 'feature/unittest_for_inputs' into feature/backward
reyoung Jul 28, 2017
e1d1067
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 28, 2017
8bf0ca0
Fix unittest error
reyoung Jul 28, 2017
d0b25ac
Fix some unittest error
reyoung Jul 28, 2017
72839a7
fix conflict6
dzhwinter Jul 28, 2017
29d50ad
Refine unit-test
reyoung Jul 28, 2017
74cd9a7
"fix unittest"
dzhwinter Jul 28, 2017
7087a04
"add unittest"
dzhwinter Jul 28, 2017
b2e1c48
Merge remote-tracking branch 'reyoung/feature/backward' into feature/…
dzhwinter Jul 28, 2017
658588a
"format test case"
dzhwinter Jul 28, 2017
d6e0368
Add comment in backward.cc
reyoung Jul 28, 2017
e1cd719
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
dzhwinter Jul 28, 2017
71bd439
Addjust Backward.linear_net_intermediate_variable_has_no_grad
JiayiFeng Jul 28, 2017
0da5cce
"fix test case"
dzhwinter Jul 28, 2017
52054af
"fix typo"
dzhwinter Jul 28, 2017
0e337be
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 28, 2017
1197420
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 28, 2017
302046a
"fix return net error"
dzhwinter Jul 28, 2017
1de465b
Change some `ASSERT_EQ` to `EXPECT_EQ`
JiayiFeng Jul 28, 2017
dc06eaa
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 28, 2017
39cd39e
Update test
JiayiFeng Jul 28, 2017
be52868
Fix net_input_of_network_not_need_grad
reyoung Jul 28, 2017
a2e2cd7
Fix bug of TEST Backwar.linear_net_intermediate_variable_has_no_grad
JiayiFeng Jul 28, 2017
2198963
Merge branch 'feature/backward' of https://github.com/reyoung/Paddle …
JiayiFeng Jul 28, 2017
42e2fa5
Fix unittest
reyoung Jul 28, 2017
48812cd
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Jul 28, 2017
213fdad
adjust format
JiayiFeng Jul 28, 2017
f5636da
design doc
dzhwinter Jul 30, 2017
bd14660
"add part of design doc"
dzhwinter Jul 31, 2017
ca16c0d
Merge remote-tracking branch 'remotes/reyoung/feature/backward' into …
dzhwinter Jul 31, 2017
bc146e8
Merge branch 'develop' of github.com:baidu/Paddle into feature/backward
reyoung Aug 1, 2017
80baf86
Merge branch 'feature/backward' of github.com:reyoung/Paddle into fea…
reyoung Aug 1, 2017
e2fd2bd
Follow comments and merge develop
reyoung Aug 1, 2017
737ea05
Use static_cast, Fix unittest
reyoung Aug 1, 2017
9cc9907
Merge branch 'develop' of github.com:baidu/Paddle into feature/backward
reyoung Aug 1, 2017
051d6c8
Merge develop
reyoung Aug 1, 2017
5 changes: 4 additions & 1 deletion paddle/framework/CMakeLists.txt
@@ -32,4 +32,7 @@ add_custom_target(framework_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch
add_dependencies(framework_py_proto framework_py_proto_init)

cc_library(net SRCS net.cc DEPS op_registry)
cc_test(net_op_test SRCS net_op_test.cc DEPS net add_op mul_op sigmoid_op softmax_op fc_op)
cc_test(net_op_test SRCS net_op_test.cc DEPS net)

cc_library(backward SRCS backward.cc DEPS net)
cc_test(backward_test SRCS backward_test.cc DEPS backward)
178 changes: 178 additions & 0 deletions paddle/framework/backward.cc
@@ -0,0 +1,178 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/framework/backward.h"
#include <list>
#include "paddle/framework/net.h"
#include "paddle/framework/op_registry.h"

namespace paddle {
namespace framework {

static bool AllInSet(const std::vector<std::string>& names,
Member: Why do we need to use static?

Collaborator (author): Because we do not export them to global symbols.

const std::string& suffix,
const std::unordered_set<std::string>& set) {
for (auto& name : names) {
if (set.find(name + suffix) == set.end()) {
return false;
}
}
return true;
}

static std::shared_ptr<OperatorBase> NOP() {
auto net_op = std::make_shared<NetOp>();
net_op->type_ = "@NOP@";
net_op->CompleteAddOp();
return net_op;
}

// Get the backward operator from a forward operator (recursive implementation).
//
// no_grad_names is the set of gradient variable names that do not need to be
// calculated.
//
// uniq_id is a unique index used inside the recursive calls of
// BackwardRecursive. Use `uid = uniq_id++;` to get a unique index, and pass
// `uniq_id` through the recursive calls.
//
// Returns the backward operator. For a simple situation it is a single
// operator; for a complex situation it is a NetOp.
//
// See Backward.h for details.
static std::shared_ptr<OperatorBase> BackwardRecursive(
const OperatorBase& forwardOp,
std::unordered_set<std::string>& no_grad_names, size_t& uniq_id);
std::shared_ptr<OperatorBase> BackwardRecursive(
const OperatorBase& forwardOp,
std::unordered_set<std::string>& no_grad_names, size_t& uniq_id) {
// If none of the input gradients of the forward operator need to be
// calculated, just return a NOP. We do not return a null pointer because a
// NOP costs almost nothing to run, yet it greatly simplifies the logic.
if (AllInSet(forwardOp.inputs_, OperatorBase::GRAD_VAR_SUFFIX(),
no_grad_names)) {
return NOP();
}

// If none of the output gradients of the forward operator need to be
// calculated, then none of its input gradients can be computed either; put
// them all into the `no_grad_names` set and return a NOP.
if (AllInSet(forwardOp.outputs_, OperatorBase::GRAD_VAR_SUFFIX(),
no_grad_names)) {
for (auto& name : forwardOp.inputs_) {
// Mark all inputs as not needed
no_grad_names.insert(name + OperatorBase::GRAD_VAR_SUFFIX());
}
return NOP();
}

// Returned gradient network
auto net = std::make_shared<NetOp>();

if (forwardOp.IsNetOp()) {
// Because forwardOp is a NetOp, the static_cast is safe.
auto& forwardNet = static_cast<const NetOp&>(forwardOp);

// Map from an output gradient variable name to the indices (in the backward
// net) of the operators that generate that variable.
std::unordered_map<std::string, std::vector<size_t>> dup_output_ops;

size_t local_op_id = 0;
// traverse forwardNet in reverse order
for (auto it = forwardNet.ops_.rbegin(); it != forwardNet.ops_.rend();
++it, ++local_op_id) {
auto fwd = *it;
auto bwd = BackwardRecursive(*fwd, no_grad_names, uniq_id);
net->AddOp(bwd);
for (auto& out : bwd->outputs_) {
dup_output_ops[out].emplace_back(local_op_id);
}
}
// Get unique ID for this method.
auto uid = uniq_id++;
// TODO(dzh): more comment
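// When an output gradient variable is produced by more than one backward op,
// the producers must not overwrite each other. Each producer is renamed to
// write a unique "@RENAME@" variable, and an `add` op is inserted right after
// the last producer to accumulate those variables back into the original
// gradient variable.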
using Pos = std::pair<size_t, std::shared_ptr<OperatorBase>>;
std::list<Pos> insert_position;
for (auto& dup_output_op : dup_output_ops) {
const std::string& name = dup_output_op.first;
auto& dup_op = dup_output_op.second;
if (dup_op.size() == 1) continue;
std::vector<std::string> dup_outputs;

for (size_t i = 0; i < dup_op.size(); ++i) {
auto op_offset = dup_op[i];
dup_outputs.push_back(name + "@RENAME@" + std::to_string(uid) + "@" +
std::to_string(i));
net->ops_[op_offset]->Rename(name, dup_outputs.back());
}
insert_position.push_back(
{dup_op.back(),
OpRegistry::CreateOp(
"add", {dup_outputs}, {name},
Member: This `add` op has not been implemented yet, has it?

Collaborator (author): Correct. Implementing that op does not affect the implementation or the unit tests of the Backward algorithm.

{{"input_format",
std::vector<int>{0, static_cast<int>(dup_outputs.size())}}})});
}

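// The insertion positions are sorted in descending order, so inserting the
// accumulation `add` ops from back to front keeps earlier positions valid.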
insert_position.sort(
[](const Pos& l, const Pos& r) { return l.first > r.first; });

for (auto& pos : insert_position) {
net->InsertOp(pos.first + 1, pos.second);
}

} else {
std::shared_ptr<OperatorBase> grad_op = OpRegistry::CreateGradOp(forwardOp);
for (std::string& grad_input : grad_op->inputs_) {
if (no_grad_names.count(grad_input)) {
std::string prefix = grad_input.substr(
0, grad_input.size() - OperatorBase::GRAD_VAR_SUFFIX().size());
grad_input = prefix + OperatorBase::ZERO_VAR_SUFFIX();

// If this input gradient of the gradient operator is not computed anywhere,
// feed a zero-filled variable (shaped like `prefix`) in its place.
net->AddOp(OpRegistry::CreateOp("fill_zeros_like", {prefix},
{grad_input}, {}));
}
}

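// Gradient outputs listed in `no_grad_names` are not needed; mark them with
// the empty variable name so they are effectively ignored.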
for (std::string& grad_output : grad_op->outputs_) {
if (no_grad_names.count(grad_output)) {
grad_output = OperatorBase::EMPTY_VAR_NAME();
}
}

if (net->ops_.empty()) {  // No auxiliary op was added to the network
return grad_op;
}
net->AddOp(grad_op);
}
net->type_ = "@GENERATED_BACKWARD@";
net->CompleteAddOp();
return net;
}

// See header for comments
std::shared_ptr<OperatorBase> Backward(
const OperatorBase& forwardOp,
const std::unordered_set<std::string>& no_grad_vars) {
std::unordered_set<std::string> no_grad_names;
no_grad_names.reserve(no_grad_vars.size());

for (auto& name : no_grad_vars) {
no_grad_names.insert(name + OperatorBase::GRAD_VAR_SUFFIX());
}
size_t uid = 0;
return BackwardRecursive(forwardOp, no_grad_names, uid);
}
} // namespace framework
} // namespace paddle
27 changes: 27 additions & 0 deletions paddle/framework/backward.h
@@ -0,0 +1,27 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once
#include <unordered_set>
#include "operator.h"
namespace paddle {
namespace framework {

// Create the backward operator from a forward operator.
// TODO(yuyang18): Add more API reference comment.
extern std::shared_ptr<OperatorBase> Backward(
const OperatorBase& forwardOp,
const std::unordered_set<std::string>& no_grad_vars);
} // namespace framework
} // namespace paddle
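For orientation, here is a minimal sketch of how this interface could be exercised, assuming a `mul` operator is registered and using the `OpRegistry::CreateOp` call style seen in `backward.cc`; the operator type and variable names are illustrative only, not part of this PR:

```cpp
#include <memory>
#include <string>
#include <unordered_set>

#include "paddle/framework/backward.h"
#include "paddle/framework/op_registry.h"

namespace f = paddle::framework;

void BuildBackwardExample() {
  // Hypothetical forward op: Out = X * W ("mul" must already be registered).
  auto fwd = f::OpRegistry::CreateOp("mul", {"X", "W"}, {"Out"}, {});
  // Do not compute the gradient of W.
  std::unordered_set<std::string> no_grad_vars{"W"};
  // The result is either a single gradient operator or a NetOp of them.
  std::shared_ptr<f::OperatorBase> bwd = f::Backward(*fwd, no_grad_vars);
}
```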
38 changes: 38 additions & 0 deletions paddle/framework/backward.md
@@ -0,0 +1,38 @@
## Operator/expression's Backward

### Motivation

In neural networks, the backpropagation algorithm follows the chain rule, so we need to compose the fundamental gradient operators/expressions together according to the chain rule. Every forward network needs a backward network to construct the full computation lineage; the operator/expression's Backward feature generates the backward pass with respect to the forward pass.
Member: computation lineage ==> computation graph? I can't find any definition of "computation lineage".


### Implementation: gradient operator registry

| | forward operator | backward operator |
| ---------------------- | ---------------- | -------------------------------- |
| **Operator::inputs_** | Inputs | Inputs, Outputs, OutputGradients |
| **Operator::outputs_** | Outputs | InputGradients |

Inputs/Outputs are the inputs/outputs of the operator, and InputGradients/OutputGradients are the gradients with respect to the forward operator. The forward operator and the backward operator are isomorphic; each saves what it needs into its member attributes.

We use a global hash map to record the available gradient operators, following the minimum-core philosophy that keeps every operator a pluggable unit. Each gradient is an operator and needs to register itself.
Member: I don't think we need to emphasize that we use a hash map; a map is enough, and a hash map is just an optimization.

Member: regist ==> register


grad_op_builder(fengjiayi)
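
To make the mapping in the table above concrete, the following is a hypothetical sketch (not the actual `grad_op_builder`) of how a gradient operator's argument lists could be assembled from a forward operator:

```cpp
#include <string>
#include <vector>

#include "paddle/framework/operator.h"

using paddle::framework::OperatorBase;

// Illustration only: fill the gradient op's inputs/outputs following the
// table above. GRAD_VAR_SUFFIX() marks a gradient variable name.
void FillGradOpArgs(const OperatorBase& fwd,
                    std::vector<std::string>* grad_inputs,
                    std::vector<std::string>* grad_outputs) {
  const std::string G = OperatorBase::GRAD_VAR_SUFFIX();
  // Gradient op inputs: Inputs, Outputs, OutputGradients.
  for (const auto& in : fwd.inputs_) grad_inputs->push_back(in);
  for (const auto& out : fwd.outputs_) grad_inputs->push_back(out);
  for (const auto& out : fwd.outputs_) grad_inputs->push_back(out + G);
  // Gradient op outputs: InputGradients.
  for (const auto& in : fwd.inputs_) grad_outputs->push_back(in + G);
}
```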

### Implementation: Backward network

Given a forward network, it generates the backward network. We only care about the gradients: `OutputGradients` and `InputGradients`.

1. bla bla bla (yuyang)

2. NetOp

When the input forward network is a NetOp, it needs to call the backward functions of its sub NetOps/Operators recursively and make sure they all finish. During this process, we need to collect the `OutputGradients` names.

Variables are shared within the same scope; as a result, operators with duplicate `OutputGradients` would overwrite the same variable.
Member: overwirte => overwrite, then => the. I remember that we will add an add_op if some outputs are duplicated, and rename the duplicated ones.


![duplicate_op](./images/duplicate_op.png)
Member (@jacquesqiao, Aug 1, 2017): ![./images/duplicate_op.png]()


Sharing a variable between operators, or using the same input variable in multiple operators, leads to a duplicated gradient variable. As the demo above shows, we need to rename the gradient variables recursively and add a generic add operator to accumulate them.

![duplicate_op2](./images/duplicate_op2.png)

Then collect the subgraph's `OutputGradients`/`InputGradients` as the NetOp's own and return it.
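
To make the renaming scheme concrete, here is a hypothetical sketch of what `backward.cc` constructs when two backward operators in the generated net both produce `W@GRAD` (the variable name, the op positions, and the `@GRAD` suffix are illustrative assumptions):

```cpp
// Suppose the backward ops at positions 1 and 3 of `net` both write W@GRAD.
// Each producer is renamed to a unique variable...
net->ops_[1]->Rename("W@GRAD", "W@GRAD@RENAME@0@0");
net->ops_[3]->Rename("W@GRAD", "W@GRAD@RENAME@0@1");
// ...and an `add` op inserted right after the last producer accumulates the
// renamed variables back into W@GRAD.
net->InsertOp(3 + 1,
              OpRegistry::CreateOp(
                  "add", {"W@GRAD@RENAME@0@0", "W@GRAD@RENAME@0@1"},
                  {"W@GRAD"},
                  {{"input_format", std::vector<int>{0, 2}}}));
```

The `input_format` value `{0, 2}` mirrors the `{0, dup_outputs.size()}` attribute set in `backward.cc`.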