@mryap
Last active December 12, 2018 11:34
Base R to Tidyverse
| Base R | Tidyverse | What it does and why tidyverse | Comment |
| --- | --- | --- | --- |
| read.csv() | read_csv() | reads in a CSV file, but it's much faster, shows a progress bar for large files, and can automatically parse data types | see also read_delim(), read_tsv() and readxl::read_xlsx() |
| sort(), order() | arrange() | sort by column(s) within a data frame | see also order_by() |
| mtcars$mpg = … | mutate() | modify a column | see also transmute(), which drops existing variables |
| mtcars[,c("mpg", "am")], subset() | select(), rename() | select or rename columns | see also pull() |
| mtcars[mtcars$am == 1,], subset() | filter() | select rows based on a criterion | |
| aggregate() | summarise(), summarize(), do() | reduce grouped values to a single value | see also variants like summarize_if() |
| ifelse() | if_else(), case_when() | standard vectorized if-else, but stricter than the base version | see also near() |
| unique() | distinct() | finds unique rows in a data frame, but much faster | |
| length(unique()) | n_distinct() | count the number of distinct values in a vector, faster | |
| sample(), sample.int() | sample_n(), sample_frac() | sample n rows or a fraction of rows from a data frame | |
| all.equal() | all_equal() | checks whether two data frames are the same | |
| merge() | inner_join(), left_join() | perform joins; much faster, more explicit, and row order is maintained | see also right_join(), full_join(), semi_join(), anti_join() |
| rbind(), cbind() | bind_rows(), bind_cols() | concatenate two data frames along rows or columns, much faster | |
| x >= left & x <= right | between() | easier to read and a faster implementation for large datasets | see also near() |
| nrow(), sum() | tally(), count(), add_tally(), add_count() | count or sum up rows | |
| c() | combine() | combine into a vector | |
| extends base R | cumall(), cumany(), cummean() | extends base R's collection of cumsum(), cumprod() etc. | |
| mtcars$mpg[1] etc. | first(), last(), n(), top_n() | works within groups, allows you to order by other column(s) and provide defaults for missing values | |
| split(), aggregate() | group_by() | create a grouped data frame (tibble) to perform operations on groups | see also ungroup() |
| intersect(), union() | intersect(), union() | set operations, but the dplyr versions work on data frames as well | |
| mtcars$mpg2 = c(NA, mtcars$mpg[1:(nrow(mtcars) - 1)]) | lead(), lag() | no equivalent command in base R, easier to read | |
| ifelse(…, NA) | na_if() | convert a value to NA | |
| switch() | recode() | change certain values in your vector | see also the forcats package when dealing with factors |
| mtcars[3:5,] | slice() | select rows based on row numbers | |
| seq_along(), quantile() | row_number(), ntile(), min_rank() etc. | add rankings in various ways; a much richer set of rankings than base R supports | |
| no easy way | complete(), expand() | expands the data frame so that supplied columns are completely filled out | often used with nesting(), see also full_seq() |
| expand.grid() | crossing() | create a data frame of all possible combinations of supplied vectors | |
| ifelse(is.na(…), …) | drop_na(), replace_na() | drop rows with missing values or convert NAs to supplied values | see also fill(), coalesce() |
| some mix of paste/strsplit | separate(), unite() | split one column into several based on a regex, or combine several columns into one | |
| reshape2::dcast() | spread() | convert long (tidy) data into wide (untidy) format | |
| reshape2::melt() | gather() | convert wide (untidy) data into long (tidy) format | |
| replicate() | rerun() | run an expression n times | |
| unlist(lapply(x, "[[", n)) | pluck() | extract elements out of a list | |
| lapply(), sapply() | map(), map2() | apply a function to a set of values, working with lists | see also map_chr(), map_lgl(), map_int(), map_dbl(), map_df() |
| paste0() | glue() | combine strings, but much more powerful because it allows for embedded expressions | |
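
To make the data-frame rows of the table more concrete, here is a minimal sketch that runs the same task both ways on the built-in mtcars data. The derived column name kpl and the 0.425 conversion factor are illustrative assumptions, not part of the original table.

```r
library(dplyr)

# Base R: keep manual cars (am == 1), pick two columns,
# add a derived column, then sort by mpg in descending order.
base_res <- mtcars[mtcars$am == 1, c("mpg", "wt")]
base_res$kpl <- base_res$mpg * 0.425  # rough mpg -> km/l factor (illustrative)
base_res <- base_res[order(-base_res$mpg), ]

# dplyr: the same steps as a pipeline of verbs from the table.
tidy_res <- mtcars %>%
  filter(am == 1) %>%
  select(mpg, wt) %>%
  mutate(kpl = mpg * 0.425) %>%
  arrange(desc(mpg))

# Grouped summary: aggregate() versus group_by() + summarise().
base_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
tidy_means <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))
```

Both versions should yield the same values; the dplyr version names each step with a verb, which is the readability argument the table makes.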
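
The vector helpers in the table (lag(), na_if(), coalesce(), between(), case_when(), if_else()) can be sketched on a small vector. The vector x and the use of -99 as a missing-value sentinel are assumptions made up for this example.

```r
library(dplyr)

x <- c(3, 8, -99, 12)  # hypothetical data; -99 stands in for "missing"

# Lag by one position: base R needs manual shifting and NA padding.
base_lag <- c(NA, head(x, -1))
tidy_lag <- lag(x)

# Turn the sentinel into NA, then replace NA with a default value.
base_clean <- ifelse(x == -99, 0, x)
cleaned <- coalesce(na_if(x, -99), 0)

# Multi-way branching with case_when() and between().
size <- case_when(
  x < 0            ~ "invalid",
  between(x, 0, 5) ~ "small",
  TRUE             ~ "large"
)

# if_else() is stricter than ifelse(): mixing a string and a number
# silently coerces in base R but raises an error in dplyr.
loose <- ifelse(x > 5, "big", 0)
strict_fails <- inherits(try(if_else(x > 5, "big", 0), silent = TRUE), "try-error")
```

The strictness of if_else() is a trade-off: it needs more explicit types, but it surfaces accidental type mixing early instead of coercing quietly.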
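
For the reshaping and list rows near the end of the table, the following is a small sketch using tidyr and purrr. The data frame wide and the lists scores and nested are hypothetical inputs invented for illustration.

```r
library(tidyr)
library(purrr)

# Wide -> long with gather(), then long -> wide again with spread().
wide <- data.frame(id = 1:2, a = c(10, 20), b = c(30, 40))
long <- gather(wide, key = "variable", value = "value", a, b)
back <- spread(long, key = variable, value = value)

# sapply() versus the type-stable map_dbl().
scores <- list(first = c(1, 2, 3), second = c(4, 5, 6))
base_means <- sapply(scores, mean)
tidy_means <- map_dbl(scores, mean)

# pluck() reaches into a nested list by name or position.
nested <- list(a = list(b = 42))
deep <- pluck(nested, "a", "b")

# replace_na() swaps NAs for a supplied value.
filled <- replace_na(c(1, NA, 3), 0)
```

map_dbl() always returns a double vector or fails, whereas sapply() can change its return type depending on the input, which is one reason to prefer the purrr variants in code that has to be maintained.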
mryap commented Dec 12, 2018

Motivation

I have base R code that measures Ads Response.
I want to move this code to the tidyverse, which is easier to write, read, and maintain, and is often faster.
