This is sort of a reply to this post:
http://www.johnmyleswhite.com/notebook/2013/12/22/the-relationship-between-vectorized-and-devectorized-code/
which arises from a discussion of this:
http://julialang.org/blog/2013/09/fast-numeric/
So....
## This is Julia code | |
# executable code here: https://gist.github.com/Malarkey73/8097545 | |
# vectorised function from Julia blog | |
function vectorised(x, y)
    r = exp(-abs(x-y))
    return r
end
# recommended devectorised function | |
function devectorised(x, y)
    r = similar(x)
    for i = 1:length(x)
        r[i] = exp(-abs(x[i]-y[i]))
    end
    return r
end
# timer func for vectorised (adapted from John Myles White's original code)
function timevec(N, x, y)
    timings = Array(Float64, N)
    # Force compilation
    vectorised(x, y)
    for itr in 1:N
        timings[itr] = @elapsed vectorised(x, y)
    end
    return timings
end
# timer func for devectorised (adapted from John Myles White's original code)
function timedevec(N, x, y)
    timings = Array(Float64, N)
    # Force compilation
    devectorised(x, y)
    for itr in 1:N
        timings[itr] = @elapsed devectorised(x, y)
    end
    return timings
end
# data | |
julia> x = [1:2e6]
julia> y = x * 2
# tldr RESULTS !!!!!!!!!!!!!!!!!! | |
julia> median(timevec(50,x,y)) | |
0.24664389599999997 | |
julia> median(timedevec(50,x,y)) | |
0.176243553 | |
------------------------------------------------------ | |
## This is now R
# executable code with added Rcpp here: https://gist.github.com/hadley/8097300 | |
# vectorised function | |
vectorised <- function(x, y)
{
  r = exp(-abs(x-y))
  return(r)
}
# devectorised function
devectorised <- function(x, y)
{
  r = rep(NA, length(x))
  for(i in seq_along(x))
  {
    r[i] = exp(-abs(x[i]-y[i]))
  }
  return(r)
}
# data | |
> x = 1:2e6
> y = x * 2
# tldr RESULTS !!!!!!!!!!!!!!!!!! | |
> microbenchmark( | |
vectorised(x,y), | |
devectorised(x,y), | |
unit = "s", times=5) | |
Unit: seconds | |
expr min lq median uq max neval | |
vectorised(x, y) 0.2058464 0.2165744 0.2610062 0.2612965 0.2805144 5 | |
devectorised(x, y) 9.7923054 9.8095265 9.8097871 9.8606076 10.0144012 5 | |
@HarlanH Sorry, there was actually another typo above. Devectorised Julia is fractionally faster both for the wrong way I did it and for the right way you have pointed out (for this use case).
NB both ways give me the same answer.
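(A minimal sanity check, not part of the original gist, assuming `x` and `y` are the vectors from the data section above and the functions from the Julia block are loaded; it uses the same Julia-0.2-era syntax as the gist.)
# sketch: the two implementations should agree numerically
isequal(vectorised(x, y), devectorised(x, y))   # expect true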
Interesting example. This is a very close analogue to the second set of snippets in my post: almost all of the time is spent in memory allocation for `r`, so there's not much to be gained from devectorization.
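(A minimal sketch, not part of the original discussion, in the same Julia-0.2-era style as the gist; the name `devectorised!` is hypothetical. It illustrates the allocation point: if the output vector is allocated once and reused, the per-call cost of allocating `r` disappears.)
# in-place variant: write into a pre-allocated output vector,
# so no new array is allocated on each call
function devectorised!(r, x, y)
    for i = 1:length(x)
        r[i] = exp(-abs(x[i]-y[i]))
    end
    return r
end
# usage: allocate r once, then reuse it across repeated calls
r = similar(x)
devectorised!(r, x, y)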
I chose this function because the original was a loop of 1e6 * 2-fold vector additions, so this is just 2e6 iterations. It's the sort of code you see in everyday data munging or matrix algebra. I want to learn more Julia, so I may try benchmarking DataFrames logical indexing and apply-type operations, because these are the true data analysis workhorses.
Just from eyeballing, I don't think this is doing what you think it's doing: `x = 1:2e6`. In Julia, that's a range, and it doesn't get converted into a vector automatically like it does in R. That's why Julia shows essentially no speedup by devectorizing. Try rewriting as `x = [1:2e6]` and see what you get?
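(A hedged sketch for reference, not part of the original comment, using the Julia-0.2-era syntax of the gist: checking what each expression actually constructs.)
# what each expression builds
typeof(1:2e6)     # a range: lazy, no 2e6-element array is materialised
typeof([1:2e6])   # brackets concatenate the range into a full Vector here
x = [1:2e6]       # materialise the vector before benchmarking
y = x * 2
# note: in later Julia versions the bracket syntax no longer concatenates
# a range; collect(1:2e6) is the equivalent there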