Skip to content

Instantly share code, notes, and snippets.

@cgrand
Last active March 8, 2025 11:49
Show Gist options
  • Save cgrand/6303edaa04951fe3c20d18fa60a7c4d9 to your computer and use it in GitHub Desktop.
Save cgrand/6303edaa04951fe3c20d18fa60a7c4d9 to your computer and use it in GitHub Desktop.
Fast structural merge in ClojureDart
;; ClojureDart has structural merge for maps https://github.com/Tensegritics/ClojureDart/commit/ac6bd2a8dec29e31dd2f598a8009ebdffd036781#diff-7d37216e2a92d1019eb7fe5bf292430220b2fcec4cbc2cd3355c6a4226333680R4714
;; * (merge big-map small-map) and (merge small-map big-map) are equally fast
;; * (merge original-map modified-map) takes time proportional to the number of modified keys
;; * even in worst case, it's faster than `into`
;; BEST CASE
user=> (let [m (zipmap (range 1e6) (range 1e6))] (time (count (merge {:a 1} m))))
"Elapsed time: 287.794583 msecs"
1000001
user=> (let [m (zipmap (range 1e6) (range 1e6)) m' (assoc m 42 :meaning)] (time (count (merge m m'))))
"Elapsed time: 281.375792 msecs"
1000000
;;; CLJ ๐Ÿ‘†/ CLJD ๐Ÿ‘‡
cljd.user=> (let [m (zipmap (range 1e6) (range 1e6))] (time (count (merge {:a 1} m))))
"Elapsed time: 0.355 msecs"
1000001
cljd.user=> (let [m (zipmap (range 1e6) (range 1e6)) m' (assoc m 42 :meaning)] (time (count (merge m m'))))
"Elapsed time: 0.275 msecs"
1000000
;; WORST CASE
;; The above is obviously an ideal best case where CLJD merge asymptotically is infintely faster than CLJ merge.
;; More interesting is a worst case: merging two big disjoint maps:
ser=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (merge m1 m2))))
"Elapsed time: 187.950625 msecs"
1000000
cljd.user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (merge m1 m2))))
"Elapsed time: 51.925 msecs"
1000000
;; CLJD is still 3.5x faster than CLJ
;; As a reference point, one can use `into` where CLJ is 2x faster than CLJD:
user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (into m1 m2))))
"Elapsed time: 185.683625 msecs"
1000000
cljd.user=> (let [n 1e6 m1 (zipmap (range 0 n 2) (range)) m2 (zipmap (range 1 n 2) (range))] (time (count (into m1 m2))))
"Elapsed time: 352.216 msecs"
1000000
;; Where does the 7x improvements comes from?
;; mostly
;; * fewer hops (tree traversal)
;; * fewer hashes computed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment