@emlyn
Last active December 26, 2015 10:28
Clojure Hash Test
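; Sets holding one or two [x y] pairs (one when [a b] equals [c d]).
(def testset (for [a (range 20)
                   b (range 20)
                   c (range 20)
                   d (range 20)]
               (conj #{[a b]} [c d])))

; Vectors of two [x y] pairs.
(def testvec (for [a (range 20)
                   b (range 20)
                   c (range 20)
                   d (range 20)]
               [[a b] [c d]]))

; Plain two-element vectors.
(def testvec2 (for [a (range 1000)
                    b (range 1000)]
                [a b]))

; The same pairs as testvec2, stored in a record instead of a vector.
(defrecord Pair [a b])
(def testrec (for [a (range 1000)
                   b (range 1000)]
               (Pair. a b)))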
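
(defn hashtest
  "Prints collision statistics for the distinct values in items under hash-fn
  (defaults to clojure.core/hash)."
  [items & [hash-fn]]
  (let [hash-fn (or hash-fn hash)
        v (into #{} items)                          ; distinct items only
        h (map count (vals (group-by hash-fn v)))   ; items per distinct hash value
        n (count v)
        m (count h)]
    (prn n m (apply max h) (/ n m 1.0))))

; Output columns: count-items count-hashes max-items-per-hash mean-items-per-hash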
(hashtest testset)
; 80200 1615 200 49.6594427244582
(hashtest testvec)
; 160000 12560 20 12.73885350318471
(hashtest testvec2)
; 1000000 31969 33 31.28030279333104
(hashtest testrec)
; 1000000 2039 976 490.4364884747425
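
; A minimal sketch of measuring collisions under a different hash function,
; returning the four numbers as a map instead of printing them.
; Assumptions: collision-stats and pair-hash are hypothetical names that are
; not part of the original gist, and pair-hash is only collision-free for
; pairs of integers whose first element lies in 0..999.
(defn collision-stats
  [items hash-fn]
  (let [distinct-items (set items)
        ; number of items that share each distinct hash value
        bucket-sizes   (map count (vals (group-by hash-fn distinct-items)))
        n              (count distinct-items)
        m              (count bucket-sizes)]
    {:items         n
     :hashes        m
     :max-per-hash  (apply max bucket-sizes)
     :mean-per-hash (/ n m 1.0)}))

; Hypothetical hash for [a b] pairs: a + 1000*b is unique while 0 <= a < 1000.
(defn pair-hash
  [[a b]]
  (+ a (* 1000 b)))

; Usage: compare the built-in hash with pair-hash on the same pairs.
(def small-pairs (for [a (range 100)
                       b (range 100)]
                   [a b]))
(prn (collision-stats small-pairs hash))
(prn (collision-stats small-pairs pair-hash))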