Use GPU for SGD weight update calculations #88

Currently weight updates are [calculated on the `Native` backend](https://github.com/autumnai/leaf/blob/master/src/solvers/sgd/momentum.rs#L52). Profiling shows that about 40% of CPU time is spent in the corresponding BLAS operations. Another 40% falls in a region without debug info, most likely the NVIDIA driver doing I/O. At the same time, according to `nvidia-smi`, GPU load is only about 20%, even on my relatively slow GTX 960.

I think a 3x-5x speedup is possible if weight updates are implemented on the GPU. It should be quite easy, since the update is a simple BLAS operation `y = a * x + b * y`, where `a` and `b` are scalars and `x` and `y` are tensors of equal dimensions.
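
To make the operation concrete, here is a minimal CPU reference sketch in plain Rust. It is not Leaf's or the backend's actual API: the function names `scal`, `axpy` and `axpby` are chosen here for illustration and only mirror the usual BLAS naming. Reference BLAS Level 1 has no single `axpby` routine, so the sketch composes it from a scale (`y = b * y`) followed by an `axpy` (`y = a * x + y`). A GPU version could presumably follow the same two-step shape, assuming the backend exposes equivalents of those two operations.

```rust
/// Scale `y` in place: y = b * y (the BLAS `scal` operation).
fn scal(b: f32, y: &mut [f32]) {
    for yi in y.iter_mut() {
        *yi *= b;
    }
}

/// Accumulate in place: y = a * x + y (the BLAS `axpy` operation).
fn axpy(a: f32, x: &[f32], y: &mut [f32]) {
    assert_eq!(x.len(), y.len(), "x and y must have equal length");
    for (yi, xi) in y.iter_mut().zip(x.iter()) {
        *yi += a * xi;
    }
}

/// Hypothetical helper for the update in this issue: y = a * x + b * y,
/// built from the two primitives above.
fn axpby(a: f32, x: &[f32], b: f32, y: &mut [f32]) {
    scal(b, y);
    axpy(a, x, y);
}
```

For example, with `x = [1.0, 2.0, 3.0]`, `y = [10.0, 20.0, 30.0]`, `a = 0.5` and `b = 2.0`, calling `axpby(a, &x, b, &mut y)` leaves `y` equal to `[20.5, 41.0, 61.5]`.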