| Base R | Tidyverse | What it does and why tidyverse | Comment |
|---|---|---|---|
| read.csv() | read_csv() | reads in a CSV file, but it's much faster, shows a progress bar for large files, and can automatically parse data types | see also read_delim(), read_tsv() and readxl::read_xlsx() |
| sort(), order() | arrange() | sort the rows of a data frame by one or more columns | see also order_by() |
| mtcars$mpg = … | mutate() | modify a column | see also transmute() which drops existing variables |
| mtcars[,c("mpg", "am")], subset() | select(), rename() | select or rename columns | see also pull() |
| mtcars[mtcars$am == 1,], subset() | filter() | select rows based on a criterion | |
| aggregate() | summarise(), summarize(), do() | reduce grouped values to a single value | see also variants like summarize_if() |
| ifelse() | if_else(), case_when() | standard vectorized if-else, but stricter about types than the base version | see also near() |
| unique() | distinct() | finds unique rows in a data frame, but it's much faster | |
| length(unique()) | n_distinct() | count the number of distinct values in a vector, faster | |
| sample(), sample.int() | sample_n(), sample_frac() | sample n rows or a fraction of rows from a dataframe | |
| all.equal() | all_equal() | checks whether two data frames are equal | |
| merge() | inner_join(), left_join() | perform joins; much faster, verbose, and row order is maintained | see also right_join(), full_join(), semi_join(), anti_join() |
| rbind(), cbind() | bind_rows(), bind_cols() | concatenate two dataframes along rows or columns, much faster | |
| x >= left & x <= right | between() | easier to read and a faster implementation for large datasets | see also near() |
| nrow(), sum() | tally(), count(), add_tally(), add_count() | count or sum up rows | |
| c() | combine() | combine into a vector | |
| extends base R | cumall(), cumany(), cummean() | extends base R collection of cumsum(), cumprod() etc | |
| mtcars$mpg[1] etc | first(), last(), n(), top_n() | works within groups, allows you to order by other columns and provide defaults for missing values | |
| split(), aggregate() | group_by() | create a grouped data frame (tibble) to perform operations on groups | see also ungroup() |
| intersect(), union() | intersect(), union() | set operations, but dplyr works on data frames as well | |
| mtcars$mpg2 = c(NA, mtcars$mpg[1:(nrow(mtcars)-1)]) | lead(), lag() | no direct equivalent in base R; easier to read | |
| ifelse(…, NA) | na_if() | convert a value to NA | |
| switch() | recode() | change certain values in your vector | see also forcats package when dealing with factors |
| mtcars[3:5,] | slice() | select rows based on row numbers | |
| seq_along(), quantile() | row_number(), ntile(), min_rank() etc | add rankings in various ways; a much richer set of rankings is supported than in base R | |
| no easy way | complete(), expand() | expands the dataframe so that supplied columns are completely filled out | often used with nesting(), see also full_seq() |
| expand.grid() | crossing() | create a data frame of all possible combinations of supplied vectors | |
| ifelse(is.na(…), …) | drop_na(), replace_na() | drop rows with missing values or convert NAs to supplied values | see also fill(), coalesce() |
| some mix of paste/strsplit | separate(), unite() | split one column into several based on a regex, or combine several columns into one | |
| reshape2::dcast() | spread() | convert long (tidy) data into wide (untidy) format | see also pivot_wider(), which supersedes spread() |
| reshape2::melt() | gather() | convert wide (untidy) data into long (tidy) format | see also pivot_longer(), which supersedes gather() |
| replicate() | rerun() | run an expression n number of times | |
| unlist(lapply(x, "[[", n)) | pluck() | extract elements out of a list | |
| lapply(), sapply() | map(), map2() | apply a function to a set of values, working with lists | see also map_chr(), map_lgl(), map_int(), map_dbl(), map_df() |
| paste0() | glue() | combine two strings together, but much more powerful because it allows for expressions | |
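
To make a few of the rows above concrete, here is a minimal sketch that runs the same small analysis twice on the built-in `mtcars` data: once in base R and once with dplyr. The `kpl` column (kilometres per litre) and the conversion factor are illustrative assumptions, not part of the original table, and the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# --- Base R: filter rows, add a column, aggregate by group, sort ---
manual <- mtcars[mtcars$am == 1, ]                 # keep manual-transmission cars
manual$kpl <- manual$mpg * 0.425144                # illustrative: US mpg -> km per litre
base_result <- aggregate(kpl ~ cyl, data = manual, FUN = mean)
base_result <- base_result[order(-base_result$kpl), ]

# --- dplyr: the same steps written as one pipeline ---
dplyr_result <- mtcars %>%
  filter(am == 1) %>%                              # replaces mtcars[mtcars$am == 1, ]
  mutate(kpl = mpg * 0.425144) %>%                 # replaces manual$kpl <- ...
  group_by(cyl) %>%                                # replaces the formula in aggregate()
  summarise(kpl = mean(kpl)) %>%
  arrange(desc(kpl))                               # replaces order()

# --- lag(): shifting a vector by one position ---
mpg_prev_base  <- c(NA, head(mtcars$mpg, -1))      # manual shift in base R
mpg_prev_dplyr <- dplyr::lag(mtcars$mpg)           # dplyr helper

# Both approaches should agree on the values.
print(all.equal(base_result$kpl, dplyr_result$kpl))
print(all.equal(mpg_prev_base, mpg_prev_dplyr))
```

The two results differ in class (a plain data frame versus a tibble), so the comparison is made on the column values, not on the whole objects.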
Last active
December 12, 2018 11:34
Base R to Tidyverse
Motivation
I have Base R code that measures ad response.
I want to move this Base R code to the Tidyverse, which is easier to write, read, and maintain, and is often faster.