Why do you zero bottom_diff before each iteration? The original fully connected layer does not have this step:
caffe_set(M_*N_, (Dtype)0., bottom_diff);