Created January 7, 2011 17:14
davidminor/769758
Clojure - Split a collection into partitions on boundaries specified by supplied function.
(defn partition-between
  "Splits coll into a lazy sequence of lists, with partition
  boundaries between items where (f item1 item2) is true.
  (partition-between = '(1 2 2 3 4 4 4 5)) =>
  ((1 2) (2 3 4) (4) (4 5))"
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst (first s)]
        (if-let [rest-seq (next s)]
          (if (f fst (first rest-seq))
            (cons (list fst) (partition-between f rest-seq))
            (let [rest-part (partition-between f rest-seq)]
              (cons (cons fst (first rest-part)) (rest rest-part))))
          (list (list fst)))))))
; more idiomatic perhaps, but 3 times slower than the above | |
(defn partition-between
  "Splits coll into a lazy sequence of lists, with partition
  boundaries between items where (f item1 item2) is true.
  (partition-between = '(1 2 2 3 4 4 4 5)) =>
  ((1 2) (2 3 4) (4) (4 5))"
  [f coll]
  (lazy-seq
    (if-not (seq coll)
      '()
      (let [pairs (map list coll (rest coll))
            [h t] (map #(map last %) (split-with #(not (apply f %)) pairs))]
        (cons (cons (first coll) h) (partition-between f t))))))
It's the third time I've used it in 4clojure exercises. Can I use something similar from the stdlib?
OK, for that last one I could use (partition-by identity _).
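For reference, here is a small sketch of what `partition-by identity` returns on the docstring's input. It groups runs of equal items, so it is not a drop-in replacement for `(partition-between = ...)`, which splits between the equal items instead. The var name `runs` is only for illustration.

```clojure
;; partition-by starts a new partition whenever (identity item) changes,
;; so equal neighbours end up in the same partition.
(def runs (partition-by identity '(1 2 2 3 4 4 4 5)))
;; => ((1) (2 2) (3) (4 4 4) (5))

;; partition-between with = splits between equal neighbours instead:
;; ((1 2) (2 3 4) (4) (4 5))
```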
Shorter one from StackOverflow:
(defn partition-between [pred? coll]
  (let [switch (reductions not= true (map pred? coll (rest coll)))]
    (map (partial map first) (partition-by second (map list coll switch)))))
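To see why the shorter version works, it may help to look at the intermediate values: `switch` is a running flag that flips at every boundary, and `partition-by second` then groups items that share the same flag. The sketch below walks through the docstring's input. One caveat: `not=` compares values rather than truthiness, so the trick appears to rely on `pred?` returning real booleans. The name `partition-between-coerced` and the `boolean` coercion are additions for illustration, not part of the snippet above.

```clojure
(def coll '(1 2 2 3 4 4 4 5))

;; One boolean per adjacent pair: true where a boundary belongs.
(def boundaries (map = coll (rest coll)))
;; => (false true false false true true false)

;; Running flag, starting at true and flipping at every boundary.
(def switch (reductions not= true boundaries))
;; => (true true false false false true false false)

;; Hypothetical variant that coerces the predicate's result to a boolean,
;; so predicates returning nil or other truthy values still flip the flag.
(defn partition-between-coerced [pred? coll]
  (let [switch (reductions not= true
                           (map (comp boolean pred?) coll (rest coll)))]
    (map (partial map first)
         (partition-by second (map list coll switch)))))

(partition-between-coerced = coll)
;; => ((1 2) (2 3 4) (4) (4 5))
```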
The time difference evens out over large collections, so that's probably not a deciding factor. More importantly, the first one can blow the stack when a single partition is very long, because it recurses once per item until it reaches a boundary. So the second option wins on all fronts, not the least of which is the more idiomatic code.
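A minimal sketch of the stack issue, assuming a default JVM stack size. The two gist versions are copied under the hypothetical names `partition-between-recursive` and `partition-between-pairs` so both can be loaded at once, and `first-partition-size` is a helper made up for this demonstration. It only exercises the case of one long run with no boundaries; it does not show that the second version is safe for every input.

```clojure
(defn partition-between-recursive
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst (first s)]
        (if-let [rest-seq (next s)]
          (if (f fst (first rest-seq))
            (cons (list fst) (partition-between-recursive f rest-seq))
            ;; No boundary: (first rest-part) forces the recursive call
            ;; immediately, so a run of n items nests n calls deep.
            (let [rest-part (partition-between-recursive f rest-seq)]
              (cons (cons fst (first rest-part)) (rest rest-part))))
          (list (list fst)))))))

(defn partition-between-pairs
  [f coll]
  (lazy-seq
    (if-not (seq coll)
      '()
      (let [pairs (map list coll (rest coll))
            [h t] (map #(map last %) (split-with #(not (apply f %)) pairs))]
        (cons (cons (first coll) h) (partition-between-pairs f t))))))

(defn first-partition-size
  "Counts the first partition of (range n) when f never reports a
  boundary, or returns :stack-overflow if realizing it overflows."
  [partition-fn n]
  (try
    (count (first (partition-fn (constantly false) (range n))))
    (catch StackOverflowError _ :stack-overflow)))

(first-partition-size partition-between-pairs 100000)
;; => 100000

;; Expected to overflow on a default JVM stack:
(first-partition-size partition-between-recursive 100000)
```

With small inputs both versions return the same partitions, so the difference only shows up once a single run grows long.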