sample_frac(iris, 0.5, replace = TRUE) Randomly select fraction of rows.
sample_n(iris, 10, replace = TRUE) Randomly select n rows.
replace = TRUE Sample with replacement: a row that has already been drawn stays in the pool and can be drawn again.
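A minimal sketch of the difference between sampling with and without replacement. The `id` column and the `flowers`, `half`, and `boot` names are added here purely for illustration so that repeated rows can be detected. Recent dplyr releases mark `sample_n()` and `sample_frac()` as superseded by `slice_sample()`, but they still work as shown.

```r
library(dplyr)

set.seed(42)  # fix the random seed so the draws are repeatable

# Illustrative helper column: tag every row so duplicates can be spotted.
flowers <- mutate(iris, id = row_number())

# Without replacement (the default): each row is drawn at most once.
half <- sample_frac(flowers, 0.5)
nrow(half)                # 75, half of the 150 rows in iris
anyDuplicated(half$id)    # 0, no row appears twice

# With replacement: the sample may be larger than the data itself,
# so some rows are necessarily drawn more than once.
boot <- sample_n(flowers, 200, replace = TRUE)
nrow(boot)                    # 200
anyDuplicated(boot$id) > 0    # TRUE, 200 draws from 150 rows must repeat
```

Asking for more rows than the data frame holds without `replace = TRUE` raises an error, which is why the second call needs the argument.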
Reshaping Data
Gather columns into rows & spread rows into columns
gather (tidyr) Gather columns into rows.
convert If TRUE, type.convert is run automatically on the key column. This is useful when the column names are really numeric, integer, or logical values.
factor_key If FALSE (the default), the key values are stored as a character vector. If TRUE, they are stored as a factor, which preserves the original ordering of the columns.
spread (tidyr) Spread rows into columns.
> test <- data.frame(Name=c("A","B","C"),M1=c(2.5,3,6),M2=c(5,6,7))
> test
  Name  M1 M2
1    A 2.5  5
2    B 3.0  6
3    C 6.0  7
> gather(test, Param, val, M1, M2)
  Name Param val
1    A    M1 2.5
2    B    M1 3.0
3    C    M1 6.0
4    A    M2 5.0
5    B    M2 6.0
6    C    M2 7.0
> spread(gather(test,Param, val, M1,M2), Param,val)
  Name  M1 M2
1    A 2.5  5
2    B 3.0  6
3    C 6.0  7
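The `factor_key` and `convert` arguments of gather can be tried out on the same kind of data. This is a small sketch, not part of the original example: the `years` data frame and its year-named columns are made up to show a case where the column names are really numbers.

```r
library(tidyr)

test <- data.frame(Name = c("A", "B", "C"),
                   M1 = c(2.5, 3, 6),
                   M2 = c(5, 6, 7))

# Default: the key column is a plain character vector.
long_chr <- gather(test, Param, val, M1, M2)
class(long_chr$Param)     # "character"

# factor_key = TRUE: the key becomes a factor whose levels follow
# the original column order.
long_fct <- gather(test, Param, val, M1, M2, factor_key = TRUE)
levels(long_fct$Param)    # "M1" "M2"

# convert = TRUE: useful when the gathered column names are numbers.
# (Hypothetical data; check.names = FALSE keeps the names as given.)
years <- data.frame(Name = c("A", "B"),
                    `2019` = c(1, 2),
                    `2020` = c(3, 4),
                    check.names = FALSE)
long_year <- gather(years, Year, val, `2019`, `2020`, convert = TRUE)
is.numeric(long_year$Year)  # TRUE, the key is no longer text
```

Without `convert = TRUE` the `Year` column would hold the strings "2019" and "2020", which sort and plot as text rather than as numbers.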
Split & unite columns
separate (tidyr) Separate one column into several.
unite (tidyr) Paste the values of several columns together into one column, separated by sep.
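A short sketch of separate and unite used as a round trip. The `df` data frame and its `date` column are invented for the illustration.

```r
library(tidyr)

# Hypothetical data: one "date" column holding year and month together.
df <- data.frame(id = 1:2,
                 date = c("2020-01", "2021-12"),
                 stringsAsFactors = FALSE)

# separate() splits "date" at the separator into two new columns.
parts <- separate(df, date, into = c("year", "month"), sep = "-")
names(parts)   # "id" "year" "month"
parts$year     # "2020" "2021" (still character)

# unite() pastes the pieces back into a single column.
back <- unite(parts, date, year, month, sep = "-")
identical(back$date, df$date)   # TRUE
```

By default both functions drop their input columns; passing `remove = FALSE` keeps them next to the new ones.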
merge(a, b, by=, all=) Merge two data frames by common columns or row names. If all = TRUE, rows of a with no match in b, and rows of b with no match in a, are kept in the output and filled with NA.
> authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney")),
    deceased = c(TRUE, rep(FALSE, 2)))
> books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney",
               "Ripley", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics",
              "An Introduction to R"))
> merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
   surname deceased                         title
1   R Core       NA          An Introduction to R
2   Ripley       NA            Spatial Statistics
3  Tierney    FALSE                     LISP-STAT
4    Tukey     TRUE     Exploratory Data Analysis
5 Venables    FALSE Modern Applied Statistics ...
> merge(authors, books, by.x = "surname", by.y = "name", all = FALSE)
   surname deceased                         title
1  Tierney    FALSE                     LISP-STAT
2    Tukey     TRUE     Exploratory Data Analysis
3 Venables    FALSE Modern Applied Statistics ...
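Besides `all`, base merge also accepts `all.x` and `all.y`, which keep the unmatched rows of only one side. The sketch below rebuilds the two data frames from the example with plain character columns; the names `left` and `right` are chosen here for illustration.

```r
authors <- data.frame(
  surname  = c("Tukey", "Venables", "Tierney"),
  deceased = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE)

books <- data.frame(
  name  = c("Tukey", "Venables", "Tierney", "Ripley", "R Core"),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...",
            "LISP-STAT",
            "Spatial Statistics",
            "An Introduction to R"),
  stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of the first data frame (a left join).
# Every author has a book here, so no NA values appear.
left <- merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)
nrow(left)                    # 3

# all.y = TRUE keeps every row of the second data frame (a right join).
# "Ripley" and "R Core" have no author record, so deceased is NA for them.
right <- merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)
nrow(right)                   # 5
sum(is.na(right$deceased))    # 2
```

Setting `all = TRUE` is shorthand for `all.x = TRUE` together with `all.y = TRUE`.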
Piping
> x %>% f(y)        # same as f(x, y)
> y %>% f(x, ., z)  # same as f(x, y, z)
> iris %>%
    group_by(Species) %>%
    summarise(avg = mean(Sepal.Width)) %>%
    arrange(avg)
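To make the rewriting rule concrete, the pipeline above can be compared with the equivalent nested call. This is only a sketch; the variable names `piped`, `nested`, and `steps` are introduced for the comparison.

```r
library(dplyr)

# Piped form: reads top to bottom, one step per line.
piped <- iris %>%
  group_by(Species) %>%
  summarise(avg = mean(Sepal.Width)) %>%
  arrange(avg)

# Nested form: the same calls, read from the inside out.
nested <- arrange(
  summarise(group_by(iris, Species), avg = mean(Sepal.Width)),
  avg)

isTRUE(all.equal(piped, nested))   # TRUE, both forms give the same table
as.character(piped$Species)        # "versicolor" "virginica" "setosa"

# The dot places the left-hand side somewhere other than the first argument.
steps <- 10 %>% seq(1, ., by = 3)  # same as seq(1, 10, by = 3)
steps                              # 1 4 7 10
```

The piped version keeps the order of operations the same as the order of reading, which is the main reason to prefer it for longer chains.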