Last active
September 20, 2021 21:22
Debugging Clojure/Vagrant for 7 concurrency models in 7 weeks
I am trying to replicate an example from this book: https://uk.bookshop.org/books/seven-concurrency-models-in-seven-weeks-when-threads-unravel/9781937785659
I am running the commands below in a lein repl.
In the book, the second function, `parallel-sum`, executes 2.5x as fast as `sum`, but on my Vagrant box (which has 4 cores, and where other parallelised code does show a speed-up) the two functions take pretty much the same time every time. My Vagrantfile is here: https://github.com/annashipman/7weeks-concurrency/blob/main/Vagrantfile
This works in a non-Vagrant Clojure environment, i.e. `parallel-sum` runs much faster there.
So, what should I do differently in Vagrant to get it to work?
(ns sum.core
  (:require [clojure.core.reducers :as r]))

(defn sum [numbers]
  (reduce + numbers))

(defn parallel-sum [numbers]
  (r/fold + numbers))

(def numbers (into [] (range 0 10000000)))

(time (sum numbers))
(time (sum numbers))
(time (parallel-sum numbers))
(time (parallel-sum numbers))
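Before comparing timings, it may help to confirm what the JVM inside the Vagrant box actually reports, since `r/fold` runs its work on a fork/join pool and the vector of ten million boxed numbers needs a fair amount of heap. The following is a minimal sketch, not part of the book's example: `jvm-resources` is a hypothetical helper name, and it only calls the standard `java.lang.Runtime` methods `availableProcessors` and `maxMemory`.

```clojure
(defn jvm-resources
  "Returns the processor count and maximum heap size (in MB) that this JVM reports."
  []
  (let [rt (Runtime/getRuntime)]
    {:processors  (.availableProcessors rt)
     :max-heap-mb (quot (.maxMemory rt) (* 1024 1024))}))

;; Evaluate in the same lein repl that runs the timings.
(jvm-resources)
```

If `:processors` is lower than the core count configured for the box, or `:max-heap-mb` is small relative to the data, that would be one place to look before blaming `r/fold` itself.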
I ran the same test cases on a physical machine (here a MacBook Pro, Core i7 2.7 GHz, 16 GB RAM) with OpenJDK 17 instead of OpenJDK 1.8. On this version the default garbage collector is G1, and the results are:
$ lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 17+0
[...]
user=> (load-file "test-sum.clj")
"Elapsed time: 307.124373 msecs"
"Elapsed time: 359.415788 msecs"
49999995000000
$ lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 17+0
[...]
user=> (load-file "test-parallel-sum.clj")
"Elapsed time: 175.177959 msecs"
"Elapsed time: 137.79619 msecs"
49999995000000
And with -Xms3g -Xmx3g (these runs report OpenJDK 1.8):
$ LEIN_JVM_OPTS="-Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-sum.clj")
"Elapsed time: 144.63988 msecs"
"Elapsed time: 117.451834 msecs"
49999995000000
$ LEIN_JVM_OPTS="-Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-parallel-sum.clj")
"Elapsed time: 178.446205 msecs"
"Elapsed time: 74.989178 msecs"
49999995000000
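The timings above only suggest that garbage collection is involved. One way to check more directly is to read the JVM's own GC counters before and after a call. This is a rough sketch under stated assumptions: `gc-snapshot` and `gc-ms-during` are hypothetical helper names, the counters come from the standard `java.lang.management` MXBeans, and the reported time is an approximate cumulative figure per collector rather than an exact pause time.

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

(defn gc-snapshot
  "Map of collector name -> cumulative collection time in ms, as reported by the JVM.
  A collector that reports -1 (undefined) is counted as 0."
  []
  (into {}
        (for [^GarbageCollectorMXBean gc (ManagementFactory/getGarbageCollectorMXBeans)]
          [(.getName gc) (max 0 (.getCollectionTime gc))])))

(defn gc-ms-during
  "Calls f (a zero-argument function) and returns its result together with the
  GC time accumulated across all collectors while it ran."
  [f]
  (let [before (reduce + (vals (gc-snapshot)))
        result (f)
        after  (reduce + (vals (gc-snapshot)))]
    {:result result
     :gc-ms  (- after before)}))

;; Usage at the REPL, with `sum` and `numbers` defined as in the script above:
;; (gc-ms-during #(sum numbers))
```

The keys of `(gc-snapshot)` also show which collectors the JVM is running, which is a quick way to confirm whether a given JDK is using G1 or something else.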
After learning of `LEIN_JVM_OPTS`, I ran these tests on a Vagrant/VirtualBox VM with 4 GB of RAM (note that I intentionally kept the default garbage collector options and only set the heap size in the second example).

With Leiningen's default JVM heap size, the `sum` function is hammered by the garbage collector. On the other hand, the `parallel-sum` function runs smoothly.

Now if you let Clojure take advantage of 3 GB of memory, you can see that the "random" slowdowns are caused by GC work. The same holds for `parallel-sum`.

Notes: the `test-sum.clj` and `test-parallel-sum.clj` files are copies of the above script, each ending with only the two calls of the function named in the filename (I ran them in separate REPL sessions to avoid memory caching effects).
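Since the two test files are described but not shown, here is a sketch of what `test-sum.clj` might look like, assuming it is the original script with the `parallel-sum` calls removed. The exact layout is an assumption.

```clojure
;; test-sum.clj (assumed layout; the original file is not shown)
(ns sum.core
  (:require [clojure.core.reducers :as r]))

(defn sum [numbers]
  (reduce + numbers))

(defn parallel-sum [numbers]
  (r/fold + numbers))

(def numbers (into [] (range 0 10000000)))

;; Only the two calls for this file's function. test-parallel-sum.clj would
;; end with two (time (parallel-sum numbers)) calls instead.
(time (sum numbers))
(time (sum numbers))
```

Loading it with `(load-file "test-sum.clj")` prints one "Elapsed time" line per call and then returns the value of the last form, which matches the shape of the transcripts above.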