Description
workflow() takes no arguments, so the preprocessor (recipe, formula, etc.) and the model have to be added in piped steps afterward:
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mod)
This could be more compact by taking arguments for recipe and model (with a NULL default; see below).
wf <- workflow(rec, mod)
The first argument could accept either a recipe or a formula (add_variables() may have to remain separate).
One great advantage of this is that you can pipe a recipe straight into workflow() and build the whole workflow in one sequence. For example, this:
rec <- recipe(outcome ~ ., data = train) %>%
  step_log(all_numeric_predictors())
mod <- linear_reg() %>%
  set_engine("lm")
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mod)
Could become:
wf <- recipe(outcome ~ ., data = train) %>%
  step_log(all_numeric_predictors()) %>%
  workflow(linear_reg())
Besides being shorter, the latter reads more naturally to me because the steps happen in the same order as the code: "clean the data by logging all numeric predictors, then run a linear regression".
(PS: I'm also cheating here by assuming that tidymodels/parsnip#513 goes through and the set_engine() call can be dropped. Together I think they make a more compact "minimum viable workflow".)
This doesn't at all preclude keeping add_recipe() and add_model() with their current behavior, because the two workflow() parameters can default to NULL (in which case it returns an empty workflow). This keeps backward compatibility, and also allows someone to set just a recipe or just a model and fill in the rest later: workflow(rec) %>% add_model(some_model) %>% ...
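
To make the proposal concrete, here is a minimal sketch of the behavior I have in mind, written as a wrapper around the existing functions. The name workflow2() and the argument names preprocessor and spec are hypothetical and only for illustration; this is not how the package implements anything, and the mtcars columns stand in for real training data.

```r
library(workflows)
library(parsnip)
library(recipes)

# Hypothetical wrapper (not part of workflows): both arguments default to NULL,
# so calling it with no arguments still returns an empty workflow.
workflow2 <- function(preprocessor = NULL, spec = NULL) {
  wf <- workflows::workflow()

  # Dispatch on the preprocessor type: a recipe or a formula.
  if (inherits(preprocessor, "recipe")) {
    wf <- add_recipe(wf, preprocessor)
  } else if (inherits(preprocessor, "formula")) {
    wf <- add_formula(wf, preprocessor)
  } else if (!is.null(preprocessor)) {
    stop("`preprocessor` must be a recipe, a formula, or NULL.", call. = FALSE)
  }

  # Only add a model when one was supplied.
  if (!is.null(spec)) {
    wf <- add_model(wf, spec)
  }

  wf
}

mod <- linear_reg() %>%
  set_engine("lm")

# Pipe a recipe straight into the wrapper.
wf <- recipe(mpg ~ disp + hp, data = mtcars) %>%
  step_log(all_numeric_predictors()) %>%
  workflow2(mod)

# The old style still works: start empty (or partial) and add pieces later.
wf_partial <- workflow2(mpg ~ disp + hp) %>%
  add_model(mod)
```

Because both defaults are NULL, existing code that starts from an empty workflow and calls add_recipe() and add_model() would be unaffected.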