Skip to content

Conversation

@zhiqiu
Copy link
Contributor

@zhiqiu zhiqiu commented Nov 13, 2019

In Paddle, there are several operators that registered some variables as inputs while never use them during computation in kernel. For example, instance_norm_grad registered kernel in InstanceNormGradMaker, which is not used in InstanceNormGradKernel.
This could have two problems,

  1. Unnecessary memory consumption, especially GPU memory. Paddle uses Garbage Collection(GC) to save GPU memory, and GC relies on reference count calculation. When variables are registered as inputs, their reference count increased, thus these unused variables' memory can not be released.
  2. This makes the code confusing.

In order to reduce unused variables, this PR,

  1. Count variable usage in InputVar, MultiInput, MultiInputVar , compare them with op's inputs registered, and report unused variables during operator running.
  2. Add unused_var_check to CI job. (PR_CI_Coverage)
  3. Keep a white list which contains the existing operators that have unused variables. These operators should be fixed in the future.

See wiki for more details.

@zhiqiu zhiqiu force-pushed the dev/unused_var branch 4 times, most recently from 740b587 to e9b1db9 Compare November 14, 2019 07:48
@zhiqiu zhiqiu force-pushed the dev/unused_var branch 3 times, most recently from ebcc8a7 to 26d968e Compare November 20, 2019 14:57
@zhiqiu zhiqiu force-pushed the dev/unused_var branch 4 times, most recently from 936ea7f to 6552c8f Compare November 21, 2019 15:43
fi

# TODO: jiabin need to refine this part when these tests fixed on mac
# NOTE(cql): enable unused_var_check for MAC CI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

438行可以去掉了吧。439行请写具体一点,cql用完整的github名字,不然无法对应。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的

@zhiqiu zhiqiu force-pushed the dev/unused_var branch 3 times, most recently from 50719e6 to 36487c8 Compare November 22, 2019 15:15
}

const Variable* ExecutionContext::InputVar(const std::string& name) const {
LogVarUsageIfUnusedVarCheckEnabled(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call this function with if (FLAGS_enable_unused_var_check)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. To reduce code duplication, I will put the if statement inside LogVarUsageIfUnusedVarCheckEnabled.

template <>
const std::vector<const Tensor*> ExecutionContext::MultiInput<Tensor>(
const std::string& name) const {
LogVarUsageIfUnusedVarCheckEnabled(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call this function with if (FLAGS_enable_unused_var_check)?

Copy link
Contributor Author

@zhiqiu zhiqiu Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. To reduce code duplication, I will put the if statement inside LogVarUsageIfUnusedVarCheckEnabled.


const std::vector<const Variable*> MultiInputVar(
const std::string& name) const {
LogVarUsageIfUnusedVarCheckEnabled(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.


template <typename T>
const std::vector<const T*> MultiInput(const std::string& name) const {
LogVarUsageIfUnusedVarCheckEnabled(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.


//! Get actual name vector for this input.
const std::vector<std::string>& Inputs(const std::string& name) const {
LogVarUsageIfUnusedVarCheckEnabled(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

}
err_msg +=
"please make sure it(them) is(are) needed. If not, remove it(them) "
"from inputs; if yes, register NoNeedBufferVars or add the operator to "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

register NoNeedBufferVars or remove them in the GradOpMaker or add the operator to white list in unused_var_check.cc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. I use if to discussed failure of unused_var_check on two case,

  1. If the input var is not need in computation, the only fix suggestion is "remove them in the inputs."
  2. If the input var is really need in computation but failed on check (due to some reasons, like different kernels, conditional usage, or only needing dims), the fix suggestion is "register NoNeedBufferVars or add the operator to white list in unused_var_check.cc".

"please make sure it(them) is(are) needed. If not, remove it(them) "
"from inputs; if yes, register NoNeedBufferVars or add the operator to "
"white list in unused_var_check.cc.";
// VLOG(1) << err_msg;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

#include "paddle/fluid/framework/operator.h"

DECLARE_bool(enable_unused_var_check);
DECLARE_bool(use_mkldnn);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not important: DECLARE_xxx here is not necessary in header file. You can move them to .cc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove DECLARE_bool(use_mkldnn) since it is not used, and keep DECLARE_bool(enable_unused_var_check) because some .cc file will include this header file and use the FLAGS.

DECLARE_INPLACE_OP_INFERER(FlattenGradInplaceinToOut,
{framework::GradVarName("Out"),
framework::GradVarName("X")});
DECLARE_NO_NEED_BUFFER_VARS_INFERENCE(FlattenGradNoNeedBufferVarsInference,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no need Out? Maybe it is X that is not needed?

{framework::GradVarName("Out"),
framework::GradVarName("X")});

DECLARE_NO_NEED_BUFFER_VARS_INFERENCE(SqueezeGradNoNeedBufferVarsInference,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

echo_line="You must have one RD (gongweibao or seiriosPlus) approval for the paddle/fluid/operators/distributed/send_recv.proto.in, which manages the environment variables.\n"
check_approval 1 10721757 5442383
elif [ "${API_FILE}" == "paddle/fluid/framework/unused_var_check.cc" ];then
echo_line="You must have one RD (zhiqiu (Recommend) , sneaxiy or luotao1) approval for the paddle/fluid/framework/unused_var_check.cc, which manages the white list of operators that have unused input variables.\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should have a wiki to tell developers what's the problem, and how to fix this problem, otherwise, they don't understand it, and will directly add to the write list.

luotao1
luotao1 previously approved these changes Nov 28, 2019
Copy link
Contributor

@luotao1 luotao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sneaxiy
sneaxiy previously approved these changes Nov 29, 2019
Copy link
Collaborator

@sneaxiy sneaxiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with 2 things to do in the next PR:

  1. Remove InitThreadLocalUsedVarNameSet in unused_var_check.h since it is not used and not defined.

  2. Add custom grad op maker for squeeze and flatten to remove Out dependency.

Copy link
Collaborator

@sneaxiy sneaxiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

Copy link
Contributor

@luotao1 luotao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhiqiu zhiqiu merged commit e0c9d85 into PaddlePaddle:develop Nov 29, 2019
seiriosPlus pushed a commit to seiriosPlus/Paddle that referenced this pull request Dec 9, 2019
…dle#21169)

* add unused input vars check for OpWithKernel, test=develop

* remove unused vars in some ops, test=develop

* fix batch_norm, test=develop

* add white list, test=develop

* add CI check for white list, test=develop

* :ove white list to c++, test=develop

* solve failure of CI, test=develop

* add unittest for unused_var_check, test=develop

* refine code, enable check in operator_test, test=develop

* skip mkldnn, test=develop

* extend white list, test=develop

* refine condition of mkldnn, test=develop

* fix paddle_build, test=develop

* follow comments, test=develop

* fix GetExpectedKernelType

* add wiki ref to err_msg, test=develop

* follow comment, test=develop
seiriosPlus pushed a commit to seiriosPlus/Paddle that referenced this pull request Dec 9, 2019
…dle#21169)

* add unused input vars check for OpWithKernel, test=develop

* remove unused vars in some ops, test=develop

* fix batch_norm, test=develop

* add white list, test=develop

* add CI check for white list, test=develop

* :ove white list to c++, test=develop

* solve failure of CI, test=develop

* add unittest for unused_var_check, test=develop

* refine code, enable check in operator_test, test=develop

* skip mkldnn, test=develop

* extend white list, test=develop

* refine condition of mkldnn, test=develop

* fix paddle_build, test=develop

* follow comments, test=develop

* fix GetExpectedKernelType

* add wiki ref to err_msg, test=develop

* follow comment, test=develop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants