Last active
March 6, 2017 11:21
-
-
Save jstepien/70accee59eef8b899c72ee49b11ae95d to your computer and use it in GitHub Desktop.
reduce-and-gc.clj
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; Let's experiment with Clojure 1.8.0 running with a relatively small heap | |
;; of 10MB. To avoid any extra dependencies, we'll start it as follows: | |
;; | |
;; java -Xmx10m -jar clojure-1.8.0.jar | |
;; | |
;; Copy following expressions into your REPL one by one. | |
;; | |
;; Let's take a small number, for instance one million. | |
(def small-num (* 1000 1000)) | |
;; We can build a lazy sequence of the given length and get its last element | |
;; not exceeding our heap limits. | |
(last (range small-num)) | |
;; We can define the lazy sequence in a let binding. It might look as if we | |
;; were holding onto its head, but the compiler is smart enough. We calculate | |
;; the last element of the sequence in the last expression of the let body. | |
;; This allows the compiler to drop the reference to the head of the sequence, | |
;; thus avoiding materialising the entire sequence in memory. | |
(let [coll (range small-num)] | |
(last coll)) | |
;; Let's try to cause an out-of-memory error. We'll firstly calculate the last | |
;; element, ignore the result, and return the first element instead. The head of | |
;; the sequence (and the entire sequence itself) cannot be garbage collected. | |
;; We end up with an OutOfMemoryError. | |
(let [coll (range small-num)] | |
(last coll) | |
(first coll)) | |
;; We can cause the same error by attempting to fully evaluate the sequence | |
;; before returning its last element. We get an OutOfMemoryError again. | |
(first (doall (range small-num))) | |
;; Now we've learned that a range of one million elements won't fit into our | |
;; 10MB heap. So let's continue our experiments with a range which is 100 times | |
;; longer. | |
(def large-num (* 100 small-num)) | |
;; We can get the last element of a 100 times longer sequence not exceeding our | |
;; 10MB heap. | |
(last (range large-num)) | |
;; The same result can be achieved with reduce. | |
(reduce (fn [_ current] current) (range large-num)) | |
;; Similarly, we can ask for the last element in a subsequence of adjacent | |
;; elements satisfying a predicate. We can achieve it either with take-while | |
;; or with reduce. | |
(defn predicate [elem] | |
(< elem (/ large-num 2))) | |
(->> (range large-num) | |
(take-while predicate) | |
(last)) | |
(->> (range large-num) | |
(reduce (fn [last elem] (if (predicate elem) elem (reduced last))))) | |
;; Summing elements of the long sequence? Sure, we can fit it in a 10MB heap too. | |
;; Let's initialise our reduction with a BigDecimal to avoid integer overflows. | |
(->> (range large-num) | |
(reduce + 0M)) | |
;; Alternatively, | |
(->> (range) | |
(take large-num) | |
(reduce + 0M)) | |
;; I hope this helps. Questions, comments, critique? Let me know! Reach me at | |
;; [email protected] or at @janstepien on Twitter. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment