Conversation

@dzhwinter (Contributor)

No description provided.



- cc_library(init SRCS init.cc DEPS gflags device_context place stringpiece)
+ cc_library(init SRCS init.cc DEPS gflags device_context place stringpiece operator)
Member

init does rely on operator

void DummyTrans(const platform::DeviceContext* ctx,
const KernelTypePair& kernel_pair, const Variable& in,
Variable* out) {
PADDLE_ENFORCE(in.IsType<Tensor>(), "Only Support Tensor transform!.");
Member

Since all Tensors are actually LoDTensors, this should be

in.IsType<LoDTensor>
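A minimal sketch of the suggested change, keeping the DummyTrans signature quoted above (the error message wording here is illustrative, not from the PR):

```cpp
void DummyTrans(const platform::DeviceContext* ctx,
                const KernelTypePair& kernel_pair, const Variable& in,
                Variable* out) {
  // Every Tensor held in a Variable is in practice a LoDTensor,
  // so the check should be against LoDTensor.
  PADDLE_ENFORCE(in.IsType<LoDTensor>(),
                 "Only LoDTensor transform is supported.");
  // ...
}
```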

Contributor Author

OK. Will fix it in the next PR.

#endif
}

void UseALL() {
@QiJune (Member), Jan 5, 2018

Since UseCUDNN calls UseCUDA, UseCUDA calls UseMKLDNN, and so on,
UseALL is not needed. We can call UseCUDNN directly.

Contributor Author

Actually, each UseXXX recursively calls the previous UseXXX.
But calling UseCUDNN in place of UseALL looks odd; UseALL just makes the intent clearer. And this interface should be removed in the future; we should ONLY allow the user to configure the op through its attributes.
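For context, a minimal self-contained sketch of how such a UseXXX chain can be laid out; the priority list and names below are illustrative, not the actual framework code:

```cpp
#include <vector>

// Illustrative only: each UseXXX first delegates to the "weaker" variant,
// then appends its own kernel preference to a global priority list.
enum class LibraryType { kPlain, kMKLDNN, kCUDNN };

std::vector<LibraryType> g_kernel_priority;

void UseCPU() { g_kernel_priority.push_back(LibraryType::kPlain); }
void UseMKLDNN() {
  UseCPU();
  g_kernel_priority.push_back(LibraryType::kMKLDNN);
}
void UseCUDA() {
  UseMKLDNN();
  // CUDA kernel preferences would be appended here.
}
void UseCUDNN() {
  UseCUDA();
  g_kernel_priority.push_back(LibraryType::kCUDNN);
}
// Equivalent to UseCUDNN(); kept only to make "use everything" explicit.
void UseALL() { UseCUDNN(); }
```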

if ((actual_kernel_key == candidate_key) ||
(kernels.count(candidate_key) &&
trans_map.GetNullable(candidate_pair))) {
expected_kernel_key = candidate_key;
Member

The default Priority will overwrite the user's configuration.
We should strictly obey the user's configuration first. Only if the user does not provide a preference should we find a kernel key under the guidance of the default Priority.
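A self-contained sketch of the selection order being asked for; the types and names below are stand-ins for illustration, not the real framework API:

```cpp
#include <set>
#include <string>
#include <vector>

using OpKernelType = std::string;  // stand-in for the real kernel key type

// Honor the user's explicit preference first; only fall back to the
// default priority list when the user did not express one.
OpKernelType ChooseKernel(const OpKernelType& actual_key,
                          const std::set<OpKernelType>& registered_kernels,
                          const std::vector<OpKernelType>& user_preference,
                          const std::vector<OpKernelType>& default_priority) {
  for (const OpKernelType& key : user_preference) {
    if (registered_kernels.count(key)) return key;  // user configuration wins
  }
  for (const OpKernelType& key : default_priority) {
    if (registered_kernels.count(key)) return key;  // then default priority
  }
  return actual_key;  // finally, keep the actual kernel key
}
```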

Contributor Author

Yes, this does not obey the user-configuration-first rule. Will fix it in the next PR.

Contributor

The costs of different DataTrans steps differ; from smallest to largest they are DataType, Layout, and CPU<->GPU. When choosing candidate_key, these costs should be taken into account.
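As a rough illustration of this suggestion only; the weights below are made up, not measured costs:

```cpp
#include <numeric>
#include <vector>

// Hypothetical relative costs, ordered as suggested:
// data type cast < layout change < CPU<->GPU copy.
enum TransformCost { kDataTypeCast = 1, kLayoutChange = 2, kDeviceCopy = 4 };

// Score a candidate by summing the costs of the transforms it would need;
// the candidate with the smallest score would be preferred.
int CandidateScore(const std::vector<TransformCost>& required_transforms) {
  return std::accumulate(required_transforms.begin(),
                         required_transforms.end(), 0);
}
```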

Contributor Author

I think we are overcomplicating this. So far the only needs are CPU <-> GPU and MKLDNNLayout <-> kPlain. In the next PR we can simply let the user choose via an op attribute whether to use it; once we start taking cost into account, it becomes the same as TensorFlow's cost model.

Contributor Author

premature optimization is the root of all evil.

@QiJune (Member) left a comment

Since this PR is blocking another one, some of the fixes will be done together in another PR.

@dzhwinter (Contributor Author)

Thanks! These fixes will be done in #6660.

@dzhwinter merged commit 5593858 into PaddlePaddle:develop on Jan 5, 2018