@r9y9
Last active December 19, 2015 20:52
Experiment: which is faster, BLAS.axpy! or a naive for loop?
# Compare BLAS.axpy! with a hand-written loop for Y = a*X + Y on N-by-N
# matrices, repeating each variant n times.
function perf(a=1.0, N=1000; n=100)
    srand(98765)
    X = rand(N, N)
    Y = rand(N, N)

    println("N=$N")
    print(" BLAS: ")
    gc_enable(false)  # keep the GC from running inside the timed sections
    gc()
    @time for i in 1:n
        BLAS.axpy!(a, X, Y)
    end

    print(" for: ")
    gc()
    @time begin
        for j in 1:n
            for i in eachindex(X)
                @inbounds Y[i] = a*X[i] + Y[i]
            end
        end
    end
    gc_enable(true)
end

versioninfo()
for N in [100, 1000, 3000, 5000, 10000]
    perf(5.0, N, n=1000)
end
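
The script above targets Julia 0.5-dev, where `srand`, `gc` and `gc_enable` were still available in Base. For readers on Julia 1.x, the following is a rough sketch of the same comparison using the renamed functions (`Random.seed!`, `GC.gc`, `GC.enable`). The names `loop_axpy!` and `perf_modern` are made up for this illustration, and the warm-up calls are an addition that the original script does not have, so timings from this sketch are not directly comparable with the numbers reported for the original.

```julia
using LinearAlgebra   # provides the BLAS submodule
using Random          # provides Random.seed!

# Naive loop (hypothetical helper): Y[i] = a*X[i] + Y[i], updated in place.
function loop_axpy!(a, X, Y)
    @inbounds for i in eachindex(X, Y)
        Y[i] = a * X[i] + Y[i]
    end
    return Y
end

# Time n repetitions of each variant on N-by-N matrices.
# Returns the elapsed seconds for both variants as a named tuple.
function perf_modern(a=1.0, N=1000; n=100)
    Random.seed!(98765)
    X = rand(N, N)
    Y = rand(N, N)

    # Warm up on copies so that compilation is not part of the timings.
    BLAS.axpy!(a, X, copy(Y))
    loop_axpy!(a, X, copy(Y))

    GC.gc()            # collect once before the timed sections
    GC.enable(false)   # then keep the GC from running while timing
    try
        t_blas = @elapsed for _ in 1:n
            BLAS.axpy!(a, X, Y)
        end
        t_loop = @elapsed for _ in 1:n
            loop_axpy!(a, X, Y)
        end
        return (blas = t_blas, loop = t_loop)
    finally
        GC.enable(true)
    end
end

r = perf_modern(5.0, 100; n = 10)
println("N=100  BLAS: ", r.blas, " s   for: ", r.loop, " s")
```

Putting the loop into its own function keeps the comparison symmetric: both variants are then a single call inside the timed loop.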

r9y9 commented Dec 19, 2015

Results:

% julia-master blas2.jl
Julia Version 0.5.0-dev+1256
Commit 5393150* (2015-11-12 04:42 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.4.0)
  CPU: Intel(R) Core(TM) i5-4258U CPU @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.8.0svn
N=100
  BLAS:   0.005899 seconds
  for:    0.004070 seconds
N=1000
  BLAS:   1.553393 seconds
  for:    1.450331 seconds
N=3000
  BLAS:  12.617937 seconds
  for:   13.150426 seconds
N=5000
  BLAS:  34.130196 seconds
  for:   36.054273 seconds
N=10000
  BLAS: 137.215003 seconds
  for:  169.501480 seconds
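
To make the comparison easier to read, the timings above can be turned into a for/BLAS ratio per matrix size. This is only a small sketch: the numbers are copied from the output above, the variable names are made up, and `round(r, digits = 3)` assumes Julia 1.x syntax.

```julia
# Timings in seconds, copied from the output above: (N, BLAS, for loop).
timings = [
    (100,     0.005899,   0.004070),
    (1000,    1.553393,   1.450331),
    (3000,   12.617937,  13.150426),
    (5000,   34.130196,  36.054273),
    (10000, 137.215003, 169.501480),
]

# A ratio above 1 means the for loop took longer than BLAS.axpy!.
ratios = [(N, t_for / t_blas) for (N, t_blas, t_for) in timings]

for (N, r) in ratios
    println("N=", N, "  for/BLAS = ", round(r, digits = 3))
end
```

In this run the for loop is faster at N=100 and N=1000, BLAS.axpy! is slightly faster at N=3000 and N=5000, and at N=10000 the loop takes roughly 1.2 times as long as BLAS.axpy!.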
