;; maybe cascalog.ops/max is good enough?
(<- [?a ?b ?c ?max-score]
    (tap ?a ?b ?c)
    (score-fn ?a ?b :> ?score)
    (cascalog.ops/max ?score :> ?max-score)
sthuebner / bucketize.clj
Created May 21, 2012 20:33 — forked from maxrzepka/bucketize.clj
Higher-order bucketize
(require '[cascalog.api :as ca]
         '[cascalog.ops :as co])
(def person
  ;; [person gender age]
  [
   ["alice" "f" 28]
   ["bob" "m" 33]
   ["chris" "m" 40]
   ["david" "m" 25]
sthuebner / top_2.clj
Created November 29, 2011 18:51
cascalog top-2
(use 'cascalog.api)
(def data [[1 1]
           [1 2]
           [1 3]
           [2 1]
           [2 2]
           [2 3]])
(defbufferop top-2 [tuples]
(defn hfs-textline
  "Creates a tap on HDFS using textline format. Different filesystems can
  be selected by using different prefixes for {path}.
  See http://www.cascading.org/javadoc/cascading/tap/Hfs.html and
  http://www.cascading.org/javadoc/cascading/scheme/TextLine.html"
  ([path]
     (hfs-textline path :keep))
  ([path sink-mode]
     (-> (w/text-line ["line"] Fields/ALL)