@kleinlennart
Created September 30, 2020 21:54
Use User Data Subsets for faster processing in Drake Workflow
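library(drake)
library(dplyr)

plan <- drake_plan(
  # One row per user, user-level columns only (drops "tweet level" vars for faster runtime).
  # Assumes user_related_cols is a character vector of column names that includes "user_id".
  user_data = tidy_data %>%
    distinct(user_id, .keep_all = TRUE) %>%
    select(all_of(user_related_cols)),
  # Placeholder per-user step; passing user_data lets drake track the dependency.
  geo_data = do_user_data_stuff(user_data),
  # left_join(a, b): keep every row of a and add the matching columns from b.
  joined_data = target(
    command = left_join(tidy_data, geo_data, by = "user_id"),
    format = "fst" # faster storage for large data frames; needs the fst package, returns a plain data.frame
  )
)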
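
The subset-then-join-back idea can be tried outside drake on a few rows. The following is a minimal sketch, not the original workflow: the toy data, the column names `user_location` and `location_upper`, and the body of `do_user_data_stuff()` are all made up for illustration, and it assumes the per-user result is meant to be joined back onto the tweet-level table.

```r
library(dplyr)

# Toy tweet-level data: several rows per user (all values are invented).
tidy_data <- tibble(
  tweet_id      = 1:5,
  user_id       = c("a", "a", "b", "c", "c"),
  user_location = c("Berlin", "Berlin", "Paris", "Rome", "Rome"),
  text          = c("t1", "t2", "t3", "t4", "t5")
)

# Hypothetical stand-in for an expensive per-user step (e.g. geocoding).
do_user_data_stuff <- function(users) {
  users %>% mutate(location_upper = toupper(user_location))
}

# 1. Collapse to one row per user, keeping only user-level columns.
user_data <- tidy_data %>%
  distinct(user_id, .keep_all = TRUE) %>%
  select(user_id, user_location)

# 2. Run the expensive step once per user rather than once per tweet.
geo_data <- do_user_data_stuff(user_data)

# 3. Join the per-user result back onto every tweet.
joined_data <- tidy_data %>%
  left_join(select(geo_data, user_id, location_upper), by = "user_id")
```

The saving comes from step 2: the costly function sees one row per distinct `user_id`, while the join in step 3 restores the original row count.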