Having recently worked on the idea of caching in mlr, I think this is an important topic for mlr3.
It would be a core base feature and should be integrated right from the start.
While in mlr I only implemented it for caching filter values so far, we should consider implementing it as a package option and making it available for all calls (resample, train, tuning, filtering, etc.).
Most calls are unique in their inputs (dataset, learner, hyperparameters), so caching won't have as much of an effect as it does for filtering, where the call that generates the filter values is always the same and the subsetting happens afterwards.
However, it can also have a positive effect on "normal" train/test calls:
- If a run (resample, tuneParams, benchmark) errors and a seed is set, the user can simply rerun it and benefit from the cached calls.
- For tuning methods like grid search, identical settings may be evaluated redundantly more often, so the user can benefit from caching.
- Most often it will apply to simple train/test calls without tuning.
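To make the filtering case concrete, here is a minimal sketch of how filter-value caching could work using the memoise package with a disk-backed cache. `compute_filter_values()` is a hypothetical stand-in for the real filter-value generator, and the cache directory is an assumption; this is not the actual mlr implementation.

```r
library(memoise)

# Hypothetical expensive computation: a score per feature, e.g. the absolute
# correlation of each feature with the target (assumed to be column 1).
compute_filter_values = function(task_data) {
  sapply(task_data[-1], function(x) abs(cor(x, task_data[[1]])))
}

# A filesystem-backed cache survives across sessions, so rerunning after an
# error (with the same seed and hence the same inputs) reuses stored values
# instead of recomputing them.
cache = cachem::cache_disk(dir = file.path(tempdir(), "mlr3-cache-sketch"))
cached_filter_values = memoise(compute_filter_values, cache = cache)
```

The subsetting to the top-k features would then happen on the cached values, so changing k never triggers a recomputation.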
I've added the functions delete_cache() and get_cache_dir() in my mlr PR to make cache handling more convenient. We could think about a dedicated cache class for such things.
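As a sketch of what such package-level helpers could look like, assuming rappdirs for a platform-appropriate cache location: the names mirror the helpers mentioned above, but the bodies are illustrative, not the actual mlr PR code.

```r
# Hypothetical helper: return a platform-appropriate cache directory.
get_cache_dir = function() {
  rappdirs::user_cache_dir(appname = "mlr3")
}

# Hypothetical helper: wipe the cache directory, returning its path invisibly.
delete_cache = function() {
  dir = get_cache_dir()
  if (dir.exists(dir)) unlink(dir, recursive = TRUE)
  invisible(dir)
}
```

A small R6 `Cache` class could wrap these two operations plus get/set methods, which would also make it easy to swap the storage backend later.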
Please share your opinions.