@t-ob
t-ob / gist:3793454
Created September 27, 2012 10:57
Ignore wonky SSL certs
;; Before:
;; (http/get "https://tv.eurosport.com/")
;; Error: AbstractVerifier.java:228 org.apache.http.conn.ssl.AbstractVerifier.verify
;;
;; Instead:
;; (with-connection-pool {:insecure? true} (http/get "https://tv.eurosport.com/"))
;; ignores bad certificate
(defn request*
"HTTP request with a constrained time and body size."
@t-ob
t-ob / gist:3786636
Created September 26, 2012 07:45
London Clojure Dojo September 2012 - Team 2
(ns ldnclj-sep-2012.core)
(defn split-seq [n s]
  (loop [ln n ls s r1 []]
    (cond
      (= ln 0) (vector r1 ls)
      :else (recur (dec ln) (rest ls) (conj r1 (first ls))))))
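A quick usage sketch for the function above, comparing it with `clojure.core/split-at`. The comparison is an illustration added here, not part of the dojo code; `split-seq` is repeated only so the snippet runs on its own.

```clojure
;; split-seq as defined in the gist, repeated so this sketch is self-contained.
(defn split-seq [n s]
  (loop [ln n ls s r1 []]
    (cond
      (= ln 0) (vector r1 ls)
      :else (recur (dec ln) (rest ls) (conj r1 (first ls))))))

;; Splitting inside the bounds of the sequence.
(split-seq 2 [1 2 3 4 5])   ;; => [[1 2] (3 4 5)]
(split-at 2 [1 2 3 4 5])    ;; => [(1 2) (3 4 5)]

;; The two results compare equal, since Clojure compares sequential
;; collections element by element.
(= (split-seq 2 [1 2 3 4 5]) (split-at 2 [1 2 3 4 5]))   ;; => true

;; When n exceeds the length of the input, split-seq pads with nil,
;; because (first ()) is nil, whereas split-at simply stops early.
(split-seq 3 [1])   ;; => [[1 nil nil] ()]
(split-at 3 [1])    ;; => [(1) ()]
```

If the nil padding is unwanted, one option would be to also stop the loop when `ls` is empty; that change is a suggestion, not something the original gist does.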
@t-ob
t-ob / gist:3450989
Created August 24, 2012 14:07
lein ring server error
Exception in thread "main" java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
at org.eclipse.jetty.util.log.JettyAwareLogger.log(JettyAwareLogger.java:601)
at org.eclipse.jetty.util.log.JettyAwareLogger.warn(JettyAwareLogger.java:425)
at org.eclipse.jetty.util.log.Slf4jLog.warn(Slf4jLog.java:74)
at org.eclipse.jetty.util.component.AbstractLifeCycle.setFailed(AbstractLifeCycle.java:199)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:69)
at ring.adapter.jetty$run_jetty.invoke(jetty.clj:85)
at ring.server.standalone$serve$fn__1551.invoke(standalone.clj:98)
at ring.server.standalone$try_port.invoke(standalone.clj:16)
at ring.server.standalone$serve.doInvoke(standalone.clj:95)
@t-ob
t-ob / hbasesink.clj
Created August 3, 2012 14:48
Sinking to HBase
(def sample-tweets
  [["1234567890" (str "[\"" "http://t.co/09hWanj8" "\",\"" "http://t.co/lozKUbnY" "\"]")]
   ["1029384756" (str "[\"" "http://t.co/1O8YAA21" "\",\"" "http://t.co/ONYmi6gK" "\"]")]])
;; tap/hbase
;; (defn hbase [table-name key-field column-family & value-fields]
;;   (let [scheme (HBaseScheme. (w/fields key-field) column-family (w/fields value-fields))]
;;     (HBaseTap. table-name scheme)))
;; This works
@t-ob
t-ob / bad.sql
Created August 1, 2012 17:04
bad tweets
select count(*), tweet_id, content, max(screen_name), max(created_at), min(screen_name), min(created_at)
from tweets_with_id
group by tweet_id
having count(*) = 2
@t-ob
t-ob / hbase.txt
Created July 31, 2012 17:15
hbase
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes
scan 'tweets', { COLUMNS => 'base:content', FILTER => SingleColumnValueFilter.new(Bytes.toBytes('base'),Bytes.toBytes('content'), CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new('burn')) }
@t-ob
t-ob / csv.sh
Created July 31, 2012 14:45
csv script
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%T")
USERS10="users-10.$NOW.csv"
EVENTS10="events-10.$NOW.csv"
USERS20="users-20.$NOW.csv"
EVENTS20="events-20.$NOW.csv"
echo "$USERS10"
@t-ob
t-ob / first-n.clj
Created July 30, 2012 15:24
first-n
(defn limited-output
[keywords-str limit]
(let [tweets-tap (tweets-new)
urls-tap (short-urls)
jdbc-sink db/jdbc-tap]
(?- (stdout)
(c/first-n
(<- [ ?args ]
; query
limit))))
@t-ob
t-ob / multiple-columns.clj
Created July 27, 2012 15:53
Multiple column families in a tap
(defn tweets
  []
  (let [scheme (HBaseScheme. (Fields. (into-array String ["id"]))
                             (into-array String ["base" "raw"])
                             (into-array [(Fields. (into-array String ["tweet_id"
                                                                       "screen_name"
                                                                       "content"
                                                                       "created_at"
                                                                       "urls"]))
                                          (Fields. (into-array String ["topsy_url"]))]))
@t-ob
t-ob / joins.clj
Created July 27, 2012 15:46
Cascalog joins
(def short-urls
  [["http://t.co/yERArQn0"]
   ["http://t.co/gI8TjreI"]
   ["http://t.co/CBsucpNm"]
   ["http://t.co/F74GG1oN"]
   ["http://t.co/hyoXObbU"]])
(def longified-urls
[["http://t.co/yERArQn0" "http://news.sky.com/story/964624/olympic-lanes-open-traffic-delays-in-london"]
["http://t.co/CBsucpNm" "http://media.tumblr.com/tumblr_m65uu5UsrQ1r7h1lt.gif"]