Last active
March 8, 2025 11:49
-
-
Save cgrand/6303edaa04951fe3c20d18fa60a7c4d9 to your computer and use it in GitHub Desktop.
Fast structural merge in ClojureDart
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; ClojureDart has structural merge for maps https://github.com/Tensegritics/ClojureDart/commit/ac6bd2a8dec29e31dd2f598a8009ebdffd036781#diff-7d37216e2a92d1019eb7fe5bf292430220b2fcec4cbc2cd3355c6a4226333680R4714 | |
;; * (merge big-map small-map) and (merge small-map big-map) are equally fast | |
;; * (merge original-map modified-map) takes time proportional to the number of modified keys | |
;; * even in worst case, it's faster than `into` | |
;; BEST CASE | |
user=> (let [m (zipmap (range 1e6) (range 1e6))] (time (count (merge {:a 1} m)))) | |
"Elapsed time: 287.794583 msecs" | |
1000001 | |
user=> (let [m (zipmap (range 1e6) (range 1e6)) m' (assoc m 42 :meaning)] (time (count (merge m m')))) | |
"Elapsed time: 281.375792 msecs" | |
1000000 | |
;;; CLJ ๐/ CLJD ๐ | |
cljd.user=> (let [m (zipmap (range 1e6) (range 1e6))] (time (count (merge {:a 1} m)))) | |
"Elapsed time: 0.355 msecs" | |
1000001 | |
cljd.user=> (let [m (zipmap (range 1e6) (range 1e6)) m' (assoc m 42 :meaning)] (time (count (merge m m')))) | |
"Elapsed time: 0.275 msecs" | |
1000000 | |
;; WORST CASE | |
;; The above is obviously an ideal best case where CLJD merge asymptotically is infintely faster than CLJ merge. | |
;; More interesting is a worst case: merging two big disjoint maps: | |
ser=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (merge m1 m2)))) | |
"Elapsed time: 187.950625 msecs" | |
1000000 | |
cljd.user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (merge m1 m2)))) | |
"Elapsed time: 51.925 msecs" | |
1000000 | |
;; CLJD is still 3.5x faster than CLJ | |
;; As a reference point, one can use `into` where CLJ is 2x faster than CLJD: | |
user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (into m1 m2)))) | |
"Elapsed time: 185.683625 msecs" | |
1000000 | |
cljd.user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (into m1 m2)))) | |
"Elapsed time: 352.216 msecs" | |
1000000 | |
;; Where does the 7x improvements comes from? | |
;; mostly | |
;; * fewer hops (tree traversal) | |
;; * fewer hashes computed |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment