@stuntgoat
Last active December 26, 2015 11:09
Notes on R
#### write a dataframe as a csv
# quote=FALSE leaves fields unquoted, so values that contain commas will break the file
write.table(df, file="<filename>", row.names=FALSE, sep=",", quote=FALSE)
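# A minimal round-trip sketch of the call above. The data frame, its column
# names and the temp-file path are made up for illustration.

```r
# Hypothetical example data
df <- data.frame(state = c("CA", "NY"), chucks = c(0.12, 0.03))

# Write to a temporary file so nothing in the working directory is touched
path <- tempfile(fileext = ".csv")
write.table(df, file = path, row.names = FALSE, sep = ",", quote = FALSE)

# Read the file back to check that header and values survived
df_back <- read.csv(path, stringsAsFactors = FALSE)
```

# write.csv(df, file=path, row.names=FALSE) is a close alternative that quotes
# character fields by default.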
#### select dataframe rows by column value
df[df$state=="<value>",]
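# A sketch of row selection when the column contains NA, assuming made-up
# example data. A bare logical index keeps an all-NA row for every NA in the
# comparison; wrapping the condition in which() drops those rows.

```r
# Hypothetical example data with one missing state
df <- data.frame(state = c("CA", NA, "NY", "CA"),
                 chucks = c(0.12, 0.50, 0.03, 0.20))

# Rows where state is "CA"; which() discards the NA comparison
ca_rows <- df[which(df$state == "CA"), ]

# Combine conditions with & (and) or | (or)
ca_big <- df[which(df$state == "CA" & df$chucks > 0.15), ]
```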
#### select columns
# by column index
want_cols <- c(7, 8, 12)
trimmed_df <- df[,want_cols]
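# Columns can also be selected by name, which keeps working if the column
# order changes. A small sketch with made-up column names:

```r
# Hypothetical example data
df <- data.frame(state = c("CA", "NY"),
                 chucks = c(0.12, 0.03),
                 year = c(2012, 2013))

# Select several columns by name
by_name <- df[, c("state", "chucks")]

# Selecting a single column returns a plain vector unless drop = FALSE is given
one_col <- df[, "chucks", drop = FALSE]
```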
#### order/sort a dataframe by one or more columns
df[order(df$colname),]
# reversed (negation only works on numeric columns; otherwise use order(df$colname, decreasing=TRUE))
df[order(-df$colname),]
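# order() accepts several keys; later keys break ties in earlier ones.
# A sketch with made-up data:

```r
# Hypothetical example data
df <- data.frame(state = c("NY", "CA", "CA"),
                 chucks = c(0.03, 0.12, 0.20))

# state ascending, then chucks descending within each state
sorted <- df[order(df$state, -df$chucks), ]

# Descending sort on a character column
by_state_desc <- df[order(df$state, decreasing = TRUE), ]
```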
#### Histograms
# breaks
hist(df$chucks, breaks=seq(0, 1, by=.001))
# zoom in with xlim/ylim (outliers are still binned, just not drawn; breaks must span the full range of the data)
hist(df$chucks, breaks=seq(0, 1, by=.001), xlim=c(0, .25))
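# hist() can also return the bin counts without drawing anything, which is
# handy for checking how breaks split the data. A sketch with made-up values
# and coarser breaks than the ones above:

```r
# Hypothetical example values in [0, 1]
chucks <- c(0.001, 0.002, 0.002, 0.1, 0.9)

# plot = FALSE returns the histogram object instead of plotting it
h <- hist(chucks, breaks = seq(0, 1, by = 0.25), plot = FALSE)

h$breaks   # bin edges
h$counts   # number of values falling in each bin
```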
# count how often each unique value occurs in a column
df.chuck_counts = as.data.frame(table(df$chucks))
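# A sketch of what the resulting count table looks like, using made-up data.
# With table(df$chucks) the value column is named Var1 and the counts Freq.

```r
# Hypothetical example data
df <- data.frame(chucks = c(0.1, 0.2, 0.1, 0.1))

counts <- as.data.frame(table(df$chucks))

# Most frequent values first
counts <- counts[order(-counts$Freq), ]
```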
# find duplicates: http://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows
# flag every copy of a duplicated row, including the first occurrence
dups = duplicated(df) | duplicated(df, fromLast=TRUE)
df[dups,]
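# A worked sketch of the duplicate check on made-up data, plus the related
# idiom for keeping only the first copy of each row.

```r
# Hypothetical example data; rows 1 and 3 are identical
df <- data.frame(state = c("CA", "NY", "CA", "TX"),
                 chucks = c(0.1, 0.2, 0.1, 0.3))

# TRUE for every row that has at least one identical twin
dups <- duplicated(df) | duplicated(df, fromLast = TRUE)
dup_idx <- which(dups)

# Keep the first occurrence of each row and drop later copies
deduped <- df[!duplicated(df), ]
```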
# summary statistics of a single column (min, quartiles, median, mean, max)
chuck_summary = as.data.frame(as.table(summary(df$chucks)))
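# Individual statistics can be pulled out of summary() by name. A sketch with
# made-up values; the stat/value column names are chosen here for illustration.

```r
# Hypothetical example values
chucks <- c(1, 2, 3, 4, 10)

s <- summary(chucks)
med <- s[["Median"]]

# Alternative layout: one row per statistic with explicit column names
stats <- data.frame(stat = names(s), value = as.numeric(s))
```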