Under the section Vectorise (and also briefly mentioned under Do as little as possible), one point I think would be nice to have is being aware of the data structures that vectorised functions are implemented for. Using vectorised code without understanding that is a form of "premature optimisation" as well, IMHO.
For example, consider the case of rowSums on a data.frame. Some issues to consider here are:
- Memory - using rowSums on a data.frame will coerce it to a matrix first. Imagine a huge (> 1 GB) data.frame: this might turn out to be a bad idea if the conversion drains memory and starts swapping (a quick way to see this coercion is sketched right after the timing comparison below).
Note: I personally think any discussion about performance should weigh the trade-offs between "speed" and "memory".
- Data structure - We can do much more in terms of speed (and memory) by taking advantage of the data structure here. Here's an example:
set.seed(1L)
require(data.table)
## a data.frame with 100 numeric columns of 1e6 rows each
DF <- as.data.frame(setDT(lapply(1:1e2, function(x) as.numeric(sample(10, 1e6, TRUE)))))
## using vectorised rowSums
system.time(ans1 <- rowSums(DF))
# user system elapsed
# 2.029 1.154 3.660
## using simple for-loop
foo <- function(x) {
  ## skipping checks here just for illustration
  ans = x[[1L]]
  for (i in seq_len(ncol(x))[-1L]) {
    ans = ans + x[[i]]
  }
  ans
}
system.time(ans2 <- foo(DF))
# user system elapsed
# 0.565 0.570 1.172
identical(ans1, ans2) ## [1] TRUE
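To make the coercion from the "Memory" point visible, here's a quick, purely illustrative check on the same DF (exact figures depend on your system; the point is that rowSums on a data.frame materialises an extra matrix copy):
## base R's rowSums() converts a data.frame to a matrix before summing,
## so the data is duplicated once. Rough illustration (numbers approximate):
print(object.size(DF), units = "Mb")   # ~763 Mb for 1e6 rows x 100 double columns
m <- as.matrix(DF)                     # the coercion rowSums() performs internally
print(object.size(m), units = "Mb")    # roughly the same again, i.e. ~2x peak memory
rm(m); invisible(gc())                 # release the copy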
The for-loop does no coercion (so no doubling of memory usage) and is ~3x faster. We've gained in terms of both "speed" and "memory" by choosing not to use rowSums on a data.frame.
Even better would be to write this for-loop in C (or C++). But that shouldn't matter much as long as you're not dealing with a lot of columns (which is rarely the case).
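For completeness, here's a minimal sketch of that idea using Rcpp (assumes the Rcpp package is installed; row_sums_cols is just an illustrative name, not an existing function):
require(Rcpp)
cppFunction('
NumericVector row_sums_cols(List x) {
    // x is a data.frame, i.e. a list of equal-length numeric columns
    NumericVector first = x[0];
    NumericVector ans = clone(first);          // copy so DF itself is not modified
    for (int j = 1; j < x.size(); j++) {
        NumericVector col = x[j];
        for (int i = 0; i < ans.size(); i++) {
            ans[i] += col[i];
        }
    }
    return ans;
}')
system.time(ans3 <- row_sums_cols(DF))   ## timings will vary by machine
identical(ans1, ans3)                    ## should be TRUE
The accumulation loop is the same as foo above, just without R-level overhead per column; as said, for a handful of columns the plain R loop is already close enough.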