
Add support for batch training #12

@mthorrell

Description


True batch training will use new batches for each training round. This is the usual way NNs are trained. Unfortunately, one of the things that makes GB packages fast is prediction caching (i.e., to fit the next round, you only need the predictions from the previous round). This makes naive batch training with GB packages slow (but not impossible).

I see a couple of methods that might make this possible:

  1. For data that fits in memory, use eval datasets to gain prediction caching while still operating on batches.
  2. Re-enable batch predictions and just accept O(number of trees) training rounds (see the sketch after this list). Perhaps this ultimately changes the update cadence between the trees and the neural network parameters so that not too much time is lost.
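For concreteness, here is a minimal sketch of what option 2 could look like with XGBoost. The toy data, batch size, and loop structure are illustrative assumptions, not anything in this repo: each round continues the existing booster on a freshly built batch DMatrix, so there is no prediction cache to reuse and the batch must be re-predicted through every existing tree.

```python
import numpy as np
import xgboost as xgb

# Toy regression data standing in for whatever the model actually trains on.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 10))
y = X @ rng.normal(size=10) + rng.normal(size=10_000)

params = {"max_depth": 3, "eta": 0.1, "objective": "reg:squarederror"}
batch_size, n_rounds = 1_000, 50

booster = None
for _ in range(n_rounds):
    # Draw a new batch each round, as in true batch training.
    idx = rng.choice(X.shape[0], size=batch_size, replace=False)
    dbatch = xgb.DMatrix(X[idx], label=y[idx])
    # Continuing from the previous booster (xgb_model=...) forces XGBoost to
    # predict the new batch through all existing trees before fitting the next
    # one: the O(number of trees) cost per round mentioned above.
    booster = xgb.train(params, dbatch, num_boost_round=1, xgb_model=booster)
```

Option 1, by contrast, would presumably keep the full in-memory data in a single DMatrix (or eval set) across rounds so its per-DMatrix prediction cache stays valid, with the batching happening on the gradient side rather than by rebuilding the DMatrix each round.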

Related issue #9

Labels: enhancement (New feature or request)