[help] Chained targets / branching a branch #1511
Help
Description
I have a dataframe that I need to tidy in many different configurations (too many to do manually) and then run an analysis on each of them, so I'm trying to use branches to make my life easier. My expected workflow is something like:

dataframe |> functionA(param1) |> functionB(param2) |> functionC(param3)

The parameters come from a table like the one below (the same values as tab in the reprex):

a, b, c
1, 1, 1
1, 1, 2
2, 2, 3
2, 2, 4
2, 3, 5

As the functions are very expensive, I don't want to run unnecessary targets. For instance, functionA needs to run only 2 times, once for a = 1 and once for a = 2 (distinct values of a). functionB has to run 3 times: a = 1 and b = 1, a = 2 and b = 2, a = 2 and b = 3 (distinct combinations of a and b). functionC has to run 5 times, once for each row.

An important aspect of this is that the input of functionB for the combination b = 2 and a = 1 must be the return of functionA when a = 1. Also, I am using a simple example with only 5 rows and 3 functions, but my actual code has more than 60 rows and 7 functions.

Is this possible?

Reprex
A reprex that creates many unintended targets (all combinations of a, b, and c):

library(targets)
library(tarchetypes)
library(tidyverse)

# Each step adds one column holding the parameter it was called with.
fA <- function(dado, number) {
  dado |> mutate(a = number)
}
fB <- function(dado, number) {
  dado |> mutate(b = number)
}
fC <- function(dado, number) {
  dado |> mutate(c = number)
}

# Parameter table: one row per final analysis.
tab <- tibble(
  a = c(1, 1, 2, 2, 2),
  b = c(1, 1, 2, 2, 3),
  c = 1:5
)

# Nested tar_map() calls: this is the attempt that produces the unintended extra targets.
list(
  tar_map(
    tab |> distinct(a),
    tar_target(A, fA(tibble(x = 0), a)),
    tar_map(
      tab |> distinct(a, b),
      tar_target(B, fB(A, b)),
      tar_map(
        tab |> distinct(a, b, c),
        tar_target(C, fC(B, c))
      )
    )
  )
)
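As a quick check of the target counts described in the question, the number of runs each function needs can be read off the parameter table with distinct(). This is only an illustrative sketch; the variable names n_A, n_B, and n_C are made up for the example.

```r
library(dplyr)
library(tibble)

# Same parameter table as in the reprex.
tab <- tibble(
  a = c(1, 1, 2, 2, 2),
  b = c(1, 1, 2, 2, 3),
  c = 1:5
)

# One run per distinct parameter combination seen so far in the chain.
n_A <- nrow(distinct(tab, a))        # runs of functionA
n_B <- nrow(distinct(tab, a, b))     # runs of functionB
n_C <- nrow(distinct(tab, a, b, c))  # runs of functionC

c(A = n_A, B = n_B, C = n_C, total = n_A + n_B + n_C)
```

For this table that gives 2, 3, and 5 runs, so 10 targets in total is the goal.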
Replies: 1 comment 1 reply
Since the datasets depend on one another, and since you have different functions and different numbers of analyses to process each dataset, I'm not sure the available branching tools would be a good fit for this case. You might have to write the pipeline out manually.
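One possible way to write the pipeline out without typing every target by hand is to use three separate, non-nested tar_map() calls, one per function, and to pass the upstream target into each row of values as a symbol. The sketch below is an assumption-laden illustration, not a confirmed recommendation from this thread: the names values_A, values_B, values_C, targets_A, targets_B, and targets_C are hypothetical, and it assumes tar_map() names targets by joining the selected columns with underscores (A_1, B_2_3, and so on). Checking the result with tar_manifest() before running anything expensive would be wise.

```r
# Sketch of a _targets.R file (hypothetical layout).
library(targets)
library(tarchetypes)
library(tibble)
library(dplyr)
library(rlang)

fA <- function(dado, number) mutate(dado, a = number)
fB <- function(dado, number) mutate(dado, b = number)
fC <- function(dado, number) mutate(dado, c = number)

tab <- tibble(a = c(1, 1, 2, 2, 2), b = c(1, 1, 2, 2, 3), c = 1:5)

# One row per target that actually needs to exist at each level.
values_A <- distinct(tab, a)
values_B <- distinct(tab, a, b)
values_C <- distinct(tab, a, b, c)

# Assumed upstream target names, stored as symbols so that tar_map()
# substitutes them into the commands (e.g. fB(A_2, 3)).
values_B$A <- syms(paste0("A_", values_B$a))
values_C$B <- syms(paste0("B_", values_C$a, "_", values_C$b))

targets_A <- tar_map(
  values = values_A, names = "a", unlist = TRUE,
  tar_target(A, fA(tibble(x = 0), a))
)
targets_B <- tar_map(
  values = values_B, names = c("a", "b"), unlist = TRUE,
  tar_target(B, fB(A, b))
)
targets_C <- tar_map(
  values = values_C, names = c("a", "b", "c"), unlist = TRUE,
  tar_target(C, fC(B, c))
)

# The pipeline: 2 + 3 + 5 targets instead of every combination.
list(targets_A, targets_B, targets_C)
```

The names argument restricts the suffix to the parameter columns, so the symbol columns A and B do not end up in the target names. The same pattern would extend to more functions by adding one values table and one tar_map() call per step.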
I ended up using memoise() on my functions and ran dynamic branches as normal. It worked. The only issue is that it either consumes a lot of RAM or I have to memoise to my hard drive, effectively saving a "target" outside targets. Maybe this could be implemented in targets? With an option to memoise a function, targets would only have to check whether that function was ever run with those specific inputs and reuse the output of that target.
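For readers unfamiliar with the approach, the disk-backed variant might look roughly like the sketch below. It is not the code used in this thread: the cache directory, the call counter n_calls, and the name fA_memo are all hypothetical, and it assumes a version of memoise that accepts cachem caches (memoise 2.0 or later).

```r
library(memoise)
library(cachem)

n_calls <- 0  # counts how often the expensive function really runs

# Stand-in for an expensive step; adds the parameter as column a.
fA <- function(dado, number) {
  n_calls <<- n_calls + 1
  dado$a <- number
  dado
}

# Hypothetical cache location; a persistent path would survive across sessions.
cache_dir <- tempfile("fA_cache_")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Results are keyed by the function and its arguments and stored on disk, not in RAM.
fA_memo <- memoise(fA, cache = cache_disk(dir = cache_dir))

first  <- fA_memo(data.frame(x = 0), 1)  # computed
second <- fA_memo(data.frame(x = 0), 1)  # same inputs: read from the cache
third  <- fA_memo(data.frame(x = 0), 2)  # new input: computed

n_calls
```

Only two of the three calls evaluate fA. The trade-off is the one described above: the cached results live in a directory that targets does not track.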
Reprex