|
94 | 94 | func : c_embedding_grad |
95 | 95 | no_need_buffer : weight |
96 | 96 |
|
| 97 | +- backward_op : c_softmax_with_cross_entropy_grad |
| 98 | + forward: c_softmax_with_cross_entropy (Tensor logits, Tensor label, int64_t ignore_index=-100, int ring_id=0, int rank=0, int nranks=0) -> Tensor(softmax), Tensor(loss) |
| 99 | + args: (Tensor softmax, Tensor label, Tensor loss_grad, int64_t ignore_index=-100, int ring_id=0, int rank=0, int nranks=0) |
| 100 | + output: Tensor(logits_grad) |
| 101 | + infer_meta : |
| 102 | + func: CSoftmaxWithCrossEntropyGradInferMeta |
| 103 | + kernel: |
| 104 | + func: c_softmax_with_cross_entropy_grad |
| 105 | + data_type: loss_grad |
| 106 | + |
97 | 107 | - backward_op : divide_double_grad |
98 | 108 | forward : divide_grad (Tensor x, Tensor y, Tensor out, Tensor grad_out, int axis = -1) -> Tensor(grad_x), Tensor(grad_y) |
99 | 109 | args : (Tensor y, Tensor out, Tensor grad_out, Tensor grad_x, Tensor grad_x_grad, Tensor grad_y_grad, int axis = -1) |
|
277 | 287 | func: set_value_with_scalar_grad |
278 | 288 | param: [out_grad, starts, ends, steps, axes, decrease_axes, none_axes] |
279 | 289 |
|
280 | | -- backward_op : c_softmax_with_cross_entropy_grad |
281 | | - forward: c_softmax_with_cross_entropy (Tensor logits, Tensor label, int64_t ignore_index=-100, int ring_id=0, int rank=0, int nranks=0) -> Tensor(softmax), Tensor(loss) |
282 | | - args: (Tensor softmax, Tensor label, Tensor loss_grad,int64_t ignore_index=-100, int ring_id=0, int rank=0, int nranks=0) |
283 | | - output: Tensor(logits_grad) |
284 | | - infer_meta : |
285 | | - func: CSoftmaxWithCrossEntropyGradInferMeta |
286 | | - kernel: |
287 | | - func: c_softmax_with_cross_entropy_grad |
288 | | - data_type: loss_grad |
289 | | - |
290 | 290 | - backward_op : softmax_grad |
291 | 291 | forward : softmax (Tensor x, int axis) -> Tensor(out) |
292 | 292 | args : (Tensor out, Tensor out_grad, int axis) |
|
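For context on the relocated entry: it registers the gradient kernel for the model-parallel softmax-with-cross-entropy op, which takes the saved forward softmax, the labels, and the upstream loss_grad and produces logits_grad. The sketch below is a hypothetical single-process NumPy illustration of that math only; the function name and tensor shapes are assumptions, and the distributed parameters (ring_id, rank, nranks), which presumably control how the class dimension is split across ranks in the real kernel, are ignored here.

import numpy as np

def softmax_with_cross_entropy_grad(softmax, label, loss_grad, ignore_index=-100):
    # Hypothetical single-process illustration, not the Paddle kernel itself.
    # softmax:   (N, C) probabilities saved by the forward pass
    # label:     (N,)   integer class ids
    # loss_grad: (N,)   upstream gradient of the per-sample loss
    n = softmax.shape[0]
    logits_grad = softmax.copy()
    valid = label != ignore_index
    # d(loss)/d(logits) = softmax - one_hot(label) on rows with a valid label
    logits_grad[np.arange(n)[valid], label[valid]] -= 1.0
    # rows whose label equals ignore_index contribute no gradient
    logits_grad[~valid] = 0.0
    # scale by the incoming per-sample loss gradient
    return logits_grad * loss_grad.reshape(n, 1)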