Website: https://www.tidyverse.org/packages/
Comparison of dplyr and base R functions: https://cran.r-project.org/web/packages/dplyr/vignettes/base.html
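As a rough illustration of that comparison, the sketch below runs the same filter / select / sort steps once with dplyr verbs and once with base R subsetting. It uses the built-in mtcars dataset; the choice of columns and the variable names (dplyr_result, base_result) are only for illustration and are not taken from the vignette.

```r
library(dplyr)

# mtcars ships with R (datasets package), so no external data is assumed.

# dplyr: keep 4-cylinder cars, pick two columns, sort by mpg (descending).
dplyr_result <- mtcars %>%
  filter(cyl == 4) %>%
  select(mpg, wt) %>%
  arrange(desc(mpg))

# base R: the same three steps with bracket subsetting and order().
base_subset <- mtcars[mtcars$cyl == 4, c("mpg", "wt")]
base_result <- base_subset[order(-base_subset$mpg), ]

# Compare the sorted mpg values from both approaches.
all.equal(dplyr_result$mpg, base_result$mpg)
```

The dplyr version reads top to bottom as a sequence of verbs, while the base version relies on indexing; which is clearer is largely a matter of taste.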
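Piping: %>% passes the value on its left as the first argument of the call on its right; use . to place it elsewhere.
library(dplyr)
author <- " Person 1, Person 2, ..."
# Piped: read top to bottom
author %>%
  as.character() %>%
  stringr::str_trim() %>%
  gsub("\\.\\.\\.", "et al", .)
vs the equivalent nested call, read inside out:
gsub("\\.\\.\\.", "et al", stringr::str_trim(as.character(author)))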
"Tidy datasets are all alike, but every messy dataset is messy in its own way." (Hadley Wickham)
The R for Data Science book describes "tidy data" (https://r4ds.had.co.nz/tidy-data.html) with three rules:
- Each variable must have its own column.
- Each observation must have its own row.
- Each value must have its own cell.
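A minimal sketch of what these rules mean in practice, assuming the tidyr package is installed. The table below is made up for illustration (the country names, years and counts are hypothetical): the year variable is spread across column names, so it is reshaped with tidyr::pivot_longer() until each variable has its own column.

```r
library(tidyr)

# Hypothetical "messy" table: the values of the variable `year`
# are stored in the column names instead of in a column.
messy <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 21),
  check.names = FALSE  # keep "2019"/"2020" as literal column names
)

# Reshape so that country, year and cases each get their own column
# and each country-year observation gets its own row.
tidy <- pivot_longer(
  messy,
  cols = c("2019", "2020"),
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

After reshaping, the table has one row per country-year pair, which is the form most dplyr and ggplot2 functions expect.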
A more in-depth discussion of tidy data is in this paper: https://www.jstatsoft.org/article/view/v059i10
There are many tutorials on YouTube, e.g. https://www.youtube.com/watch?v=ZM04jn95YP0, which includes this gist of examples: https://gist.github.com/larsentom/727da01476ad1fe5c066a53cc784417b