Created
September 28, 2012 20:46
R Trick - reading data faster
# Making R read data faster by precomputing the column
# data types
sample <- read.table("data.txt", nrows = 100)
types <- sapply(sample, classes)
allData <- read.table("data.txt", colClasses = classes)
Hi Irene
Does it work better with a couple of fixes?
sample <- read.table(fname, nrows = 100, header=TRUE)
types <- sapply(sample, class)
read.table(fname, colClasses = types, header=TRUE)
Or maybe I'm missing something about the classes function and where types is used.
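For anyone who wants to try the corrected version without a data file at hand, here is a minimal self-contained sketch. The temporary file, the demo columns (id, value, label) and the name head_rows are made up for illustration and are not part of the original gist.

```r
# Hypothetical demo data written to a temporary file (not from the gist)
fname <- tempfile(fileext = ".txt")
demo <- data.frame(id = 1:1000,
                   value = rnorm(1000),
                   label = sample(letters, 1000, replace = TRUE))
write.table(demo, fname, row.names = FALSE, quote = FALSE)

# Read only the first 100 rows and record the class of each column.
# The result is called head_rows rather than sample, to avoid
# confusion with base::sample().
head_rows <- read.table(fname, nrows = 100, header = TRUE,
                        stringsAsFactors = FALSE)
types <- sapply(head_rows, class)

# Reuse the inferred classes for the full read, so read.table
# does not have to guess the column types again
allData <- read.table(fname, colClasses = types, header = TRUE)
str(allData)
```

One caveat with this approach: the classes are guessed from the first 100 rows only, so if a column looks like integers there but holds decimals or text further down, the full read will stop with an error instead of silently converting.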
Then I benchmarked the classic method against this one on ~10M lines with 3 columns (numeric and text), and I did not find a major time improvement (barely a few %).
This method might still be good, but in other situations (or with older R versions).
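A comparison of this kind could be set up roughly as follows. This is only a sketch under assumptions: the file is generated on the fly, n is arbitrary and far smaller than the ~10M lines mentioned above, and the column names x, y, z are invented. Timings will vary with the machine and the R version, so no expected numbers are given.

```r
# Hypothetical test file: two numeric columns and one text column
n <- 200000
fname <- tempfile(fileext = ".txt")
write.table(data.frame(x = runif(n),
                       y = sample.int(1000, n, replace = TRUE),
                       z = sample(letters, n, replace = TRUE)),
            fname, row.names = FALSE, quote = FALSE)

# Classic read: read.table guesses the column types itself
t_classic <- system.time(
  a <- read.table(fname, header = TRUE, stringsAsFactors = FALSE)
)

# Precomputed types: infer classes from 100 rows, then read everything
t_typed <- system.time({
  types <- sapply(read.table(fname, nrows = 100, header = TRUE,
                             stringsAsFactors = FALSE), class)
  b <- read.table(fname, header = TRUE, colClasses = types)
})

# Elapsed seconds for each approach
print(c(classic = t_classic[["elapsed"]], typed = t_typed[["elapsed"]]))
```

Running each variant several times and comparing the medians would give a fairer picture than a single call to system.time.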
(Anyway, I love your tweets, I'm a big fan. I was stuck with a slow load this morning and remembered this one from 3 months ago.)
Alex