@arunsrinivasan
Created January 13, 2014 21:08
`:=` vs 'set' in data.table
library(data.table) # library() stops with an error if the package is missing; require() only warns
set.seed(1L)
DT1 <- data.table(x=sample(1e7), y=as.numeric(sample(1e7)), z=sample(letters, 1e7, TRUE))
DT2 <- copy(DT1)
val <- runif(1e7)
# 'set' seems faster when adding a single column
# ==============================================
system.time(set(DT1, i=NULL, j='bla', value=val)) # ----------- 0.195 secs
system.time(DT2[, bla := val]) # ------------------------------ 0.235 secs
# ':=' seems faster when adding 10 columns (R-level for-loop overhead? move the loop to C for 'set'?)
# ===================================================================================================
# 'set' ---------------------------------- 1.896 secs
cols <- paste0("bla", 1:10)
system.time({
  for (j in cols) {
    set(DT1, i=NULL, j=j, value=val)
  }
})
# ':=' ----------------------------------- 1.645 secs
system.time({
  DT2[, (cols) := val]
})
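# Sanity check (a minimal sketch, not part of the timings above): confirm on
# small data that the 'set' loop and the single ':=' call add the same columns,
# so the two benchmarks are comparing equivalent work.
# 'small1', 'small2', 'small_val' and 'small_cols' are hypothetical names used only here.
library(data.table)
small1 <- data.table(x = 1:5)
small2 <- copy(small1)
small_val <- c(0.1, 0.2, 0.3, 0.4, 0.5)
small_cols <- paste0("bla", 1:3)
for (j in small_cols) {
  set(small1, i = NULL, j = j, value = small_val) # one by-reference assignment per column
}
small2[, (small_cols) := small_val]                # one call; the vector is reused for every column
stopifnot(isTRUE(all.equal(small1, small2)))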