@primaryobjects
Created July 8, 2014 01:52
Calculating the mean of a data table column in R (American Community Survey) and timing the result.
# Call `method` repeatedly so the total run time is large enough to measure.
doLoop <- function(method, iterations = 1000) {
  for (i in seq_len(iterations)) {
    method()
  }
}
# Load the csv file (fread comes from the data.table package).
library(data.table)
DT <- fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv')
# Time option 2: tapply over the two columns.
print(system.time(doLoop(function() tapply(DT$pwgtp15, DT$SEX, mean))))
# Time option 3: data.table grouping with by.
print(system.time(doLoop(function() DT[, mean(pwgtp15), by = SEX])))
# Time option 6: split the column by SEX, then sapply mean over the pieces.
print(system.time(doLoop(function() sapply(split(DT$pwgtp15, DT$SEX), mean))))
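Before comparing timings, it can help to confirm that the three approaches return the same group means. The following is a minimal sketch that does this offline on synthetic data. The data frame `df` and its randomly generated values are assumptions made for illustration and are not the real survey file. The data.table check runs only if that package is installed.

```r
# Synthetic stand-in for the survey data (hypothetical values, not the real file).
set.seed(42)
n <- 10000
df <- data.frame(
  SEX = sample(1:2, n, replace = TRUE),
  pwgtp15 = sample(1:500, n, replace = TRUE)
)

# The two base R approaches timed above.
by_tapply <- tapply(df$pwgtp15, df$SEX, mean)
by_split <- sapply(split(df$pwgtp15, df$SEX), mean)

# Both are keyed by SEX in sorted order, so the values can be compared directly.
stopifnot(isTRUE(all.equal(as.numeric(by_tapply), as.numeric(by_split))))

# The data.table approach, checked only when the package is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  DT_check <- data.table::as.data.table(df)
  by_dt <- DT_check[, mean(pwgtp15), by = SEX]
  # by = SEX keeps groups in order of first appearance, so sort before comparing.
  by_dt <- by_dt[order(SEX)]
  stopifnot(isTRUE(all.equal(as.numeric(by_tapply), by_dt$V1)))
}
```

If every `stopifnot` passes silently, the methods agree on this sample, and any difference seen in the timings comes from how each one computes the result rather than from what it computes.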