
Commit c0dbcdc

mlr3 benchmark

1 parent: e2ea3df

1 file changed (+12, -10 lines)

mlr-org/benchmarks/benchmarks_mlr3.qmd

Lines changed: 12 additions & 10 deletions
@@ -50,12 +50,12 @@ For example, if the training step of a random forest model `ranger::ranger()` ta
 When the same model takes 1 second to train, the overhead introduced by mlr3 is only 1%.
 Instead of using real models, we simulate the training and prediction time for models by sleeping for 1, 10, 100, and 1000 ms.
 
-We start by measuring the runtime of the `$train()` methods of the learner.
+We start by measuring the runtime of the `$train()` method of the learner.
 For models with a training time of 1000 and 100 ms, the overhead introduced by mlr3 is minimal.
 Models with a training time of 10 ms take 2 times longer to train in mlr3.
 For models with a training time of 1 ms, the overhead is approximately 10 times larger than the actual model training time.
 The overhead of `$predict()` is similar to `$train()` and the size of the dataset being predicted plays only a minor role.
-The `$predict_newdata()` methods converts the data to a task and then predicts on it which doubles the overhead of the `$predict()` method.
+The `$predict_newdata()` method converts the data to a task and then predicts on it which doubles the overhead of the `$predict()` method.
 The recently introduced `$predict_newdata_fast()` method is much faster than `$predict_newdata()`.
 For models with a prediction time of 10 ms, the overhead is around 10%.
 For models with a prediction time of 1 ms, the overhead is around 50%.
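For context, this kind of overhead measurement can be reproduced with a learner that only sleeps. A minimal sketch, assuming mlr3's built-in `classif.debug` learner and its `sleep_train`/`sleep_predict` parameters; the benchmark's actual simulation code is not part of this hunk:

```r
# Minimal sketch: simulate a model whose training takes ~10 ms by sleeping.
# Assumes mlr3's built-in "classif.debug" learner; not the benchmark's own code.
library(mlr3)

task = tsk("spam")
learner = lrn("classif.debug", sleep_train = 0.01, sleep_predict = 0.01)

# Wall-clock time of $train(); everything beyond the 10 ms sleep is mlr3 overhead.
elapsed = system.time(learner$train(task))["elapsed"]
overhead = elapsed - 0.01

# $predict_newdata() accepts a data.frame and converts it to a task internally,
# which is why it roughly doubles the overhead of $predict().
newdata = task$data(rows = 1:100, cols = task$feature_names)
pred = learner$predict_newdata(newdata)
```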
@@ -115,7 +115,8 @@ create_table = function(data) {
       task = "Task Size",
       median_runtime = "Median Runtime [ms]",
       k = "K") %>%
-    fmt_number(columns = c("k", "median_runtime"), n_sigfig = 2, sep_mark = "") %>%
+    fmt_number(columns = c("median_runtime"), decimals = 0, sep_mark = "") %>%
+    fmt_number(columns = c("k"), n_sigfig = 2, sep_mark = "") %>%
     tab_style(
       style = list(
         cell_fill(color = "crimson"),
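The change splits one `fmt_number()` call so the two columns get different formats: runtimes rendered as whole milliseconds, `k` with two significant digits. A standalone illustration on made-up data (not from the benchmark):

```r
# Illustration of the new formatting on dummy data.
library(gt)

data.frame(median_runtime = 1234.567, k = 0.0123) %>%
  gt() %>%
  fmt_number(columns = c("median_runtime"), decimals = 0, sep_mark = "") %>%  # renders "1235"
  fmt_number(columns = c("k"), n_sigfig = 2, sep_mark = "")                   # renders "0.012"
```

Dropping the thousands separator (`sep_mark = ""`) keeps a runtime like 1235 ms from rendering as "1,235".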
@@ -156,7 +157,7 @@ data_runtime = data_runtime[, -c("renv_project")]
 ```
 
 The runtime and memory usage of the `$train()` method is measured for different mlr3 versions.
-The train step is performed for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 instances.
+The train step is performed for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 observations.
 
 ```{r}
 #| eval: false
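The measurement chunk itself is cut off by the diff context. One way a single measurement cell could look, assuming `bench::mark()` for the timing (an assumption, not the chunk's actual content):

```r
# Hypothetical sketch of one measurement: median $train() runtime for a
# simulated 1 ms model. Assumes bench::mark(); the real chunk is not shown.
library(mlr3)
library(bench)

task = tsk("spam")
learner = lrn("classif.debug", sleep_train = 0.001)

res = bench::mark(learner$train(task), iterations = 100, check = FALSE)
median_runtime_ms = as.numeric(res$median) * 1000
```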
@@ -208,7 +209,7 @@ plot_runtime = function(data) {
     geom_col(group = 1, fill = "#008080") +
     geom_errorbar(aes(ymin = pmax(median_runtime - mad_runtime, 0), ymax = median_runtime + mad_runtime), width = 0.5, position = position_dodge(0.9)) +
     geom_hline(aes(yintercept = model_time), linetype = "dashed") +
-    facet_wrap(~task, scales = "free_y", labeller = labeller(task = function(value) sprintf("%s Instances", value))) +
+    facet_wrap(~task, scales = "free_y", labeller = labeller(task = function(value) sprintf("%s Observations", value))) +
     labs(x = "mlr3Version", y = "Runtime [ms]") +
     theme_minimal() +
     theme(axis.text.x = element_text(angle = 45, hjust = 1))
@@ -228,7 +229,8 @@ create_table = function(data) {
       task = "Task Size",
       median_runtime = "Median Runtime [ms]",
       k = "K") %>%
-    fmt_number(columns = c("k", "median_runtime"), n_sigfig = 2) %>%
+    fmt_number(columns = c("median_runtime"), decimals = 0, sep_mark = "") %>%
+    fmt_number(columns = c("k"), n_sigfig = 2, sep_mark = "") %>%
     tab_style(
       style = list(
         cell_fill(color = "crimson"),
@@ -529,7 +531,7 @@ data_runtime = merge(data_runtime, data_memory, by = c("task", "evals", "mlr3",
 ```
 
 The runtime and memory usage of the `resample()` function is measured for different mlr3 versions.
-The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 instances.
+The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 observations.
 The resampling iterations (`evals`) are set to 1000, 100, and 10.
 
 ```{r}
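For reference, the shape of the measured call; the learner and resampling used here are assumptions, not the benchmark's own setup:

```r
# Sketch of the measured resample() call with a simulated 10 ms learner.
library(mlr3)

task = tsk("spam")
learner = lrn("classif.debug", sleep_train = 0.01, sleep_predict = 0.01)
resampling = rsmp("subsampling", repeats = 100)  # `evals` = number of resampling iterations

rr = resample(task, learner, resampling)
```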
@@ -607,7 +609,7 @@ data_runtime = merge(data_runtime, data_memory, by = c("task", "evals", "mlr3",
 ```
 
 The runtime and memory usage of the `benchmark()` function is measured for different mlr3 versions.
-The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 instances.
+The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 observations.
 The resampling iterations (`evals`) are set to 1000, 100, and 10.
 
 ```{r}
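`benchmark()` runs a full design of task, learner, and resampling combinations. A sketch of a comparable call, where the grid design is an assumption:

```r
# Sketch of the measured benchmark() call; the grid is an assumption.
library(mlr3)

design = benchmark_grid(
  tasks = tsk("spam"),
  learners = lrn("classif.debug", sleep_train = 0.01, sleep_predict = 0.01),
  resamplings = rsmp("subsampling", repeats = 100)
)
bmr = benchmark(design)
```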
@@ -762,7 +764,7 @@ data_runtime = merge(data_runtime, data_runtime_2, by = c("task", "evals", "mlr3
 
 The runtime and memory usage of the `resample()` function with `future::multisession` parallelization is measured for different mlr3 versions.
 The parallelization is conducted on 10 cores.
-The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 instances.
+The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 observations.
 The resampling iterations (`evals`) are set to 1000, 100, and 10.
 
 ```{r}
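The `future::multisession` backend is selected via `future::plan()`; a sketch of the parallel setup on 10 workers, with the learner and resampling again assumptions:

```r
# Sketch of the parallelized run; only the plan() call differs from the
# sequential case.
library(mlr3)
library(future)

plan(multisession, workers = 10)  # 10 cores, as stated above
rr = resample(tsk("spam"),
  lrn("classif.debug", sleep_train = 0.01),
  rsmp("subsampling", repeats = 100))
plan(sequential)  # restore sequential execution
```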
@@ -858,7 +860,7 @@ data_runtime = merge(data_runtime, data_runtime_2, by = c("task", "evals", "mlr3
 
 The runtime and memory usage of the `benchmark()` function with `future::multisession` parallelization is measured for different mlr3 versions.
 The parallelization is conducted on 10 cores.
-The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 instances.
+The models are trained for different amounts of time (1 ms, 10 ms, 100 ms, and 1000 ms) on the spam dataset with 1000 and 10,000 observations.
 The resampling iterations (`evals`) are set to 1000, 100, and 10.
 
 ```{r}
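The parallel `benchmark()` case follows the same pattern, with a grid like the earlier sketch run under the multisession plan:

```r
# Sketch only: benchmark() under future::multisession with 10 workers.
library(mlr3)
library(future)

plan(multisession, workers = 10)
bmr = benchmark(benchmark_grid(
  tasks = tsk("spam"),
  learners = lrn("classif.debug", sleep_train = 0.01),
  resamplings = rsmp("subsampling", repeats = 100)
))
plan(sequential)
```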
