Skip to content

Instantly share code, notes, and snippets.

@goldsmith
goldsmith / numpy_os_x_10_9.sh
Last active January 6, 2024 07:25
How to install Numpy and Scipy on Mac OS X Mavericks (10.9) using Pip.
# set up flags for Numpy C extentions compiling
export CFLAGS="-arch i386 -arch x86_64"
export FFLAGS="-m32 -m64"
export LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64"
export CC=gcc-4.2
export CXX="g++ -arch i386 -arch x86_64"
pip install numpy
# success!
@jackschultz
jackschultz / ner.py
Created September 19, 2013 16:52
Named Entity Recognition with python
import nltk
import requests
FREEBASE_API_KEY = ''
class FindNames(object):
def __init__(self, text, freebase_api_key):
self.text = text
self.key = freebase_api_key
@joelittlejohn
joelittlejohn / trie.clj
Last active May 25, 2020 14:59
Trie for auto-complete (or: how to implement auto-complete in 4 lines of Clojure)
(def t
"trie containing the 100,000 most common english words"
(with-open [r (clojure.java.io/reader "/tmp/words-100000")]
(reduce #(assoc-in %1 %2 (sorted-map \0 nil)) (sorted-map) (line-seq r))))
(defn search [p m]
"return a sorted sequence of all words in the trie m that start with the given prefix p"
(let [n (get-in m p)
next (mapcat #(search (str p (key %)) m) (dissoc n \0))]
(if (contains? n \0) (cons p next) next)))

Setting up Flume NG, listening to syslog over UDP, with an S3 Sink

My goal was to set up Flume on my web instances, and write all events into s3, so I could easily use other tools like Amazon Elastic Map Reduce, and Amazon Red Shift.

I didn't want to have to deal with log rotation myself, so I setup Flume to read from a syslog UDP source. In this case, Flume NG acts as a syslog server, so as long as Flume is running, my web application can simply write to it in syslog format on the specified port. Most languages have plugins for this.

At the time of this writing, I've been able to get Flume NG up and running on 3 ec2 instances, and all writing to the same bucket.

Install Flume NG on instances

@samuelclay
samuelclay / radix_trie.clj
Created January 22, 2013 17:51
A Radix Trie (aka PATRICIA trie) implemented in Clojure. For my JavaScript implementation, see this interactive JS fiddle: http://jsfiddle.net/jCYAw/
(ns radix
(:require [clojure.string :as string]))
(use 'clojure.java.io)
(use 'clojure.pprint)
(println "Loading names... ")
(time (def names
(with-open
[rdr (reader
"/usr/share/dict/ProperNames")]
@brandonb927
brandonb927 / osx-for-hackers.sh
Last active April 19, 2025 05:22
OSX for Hackers: Yosemite/El Capitan Edition. This script tries not to be *too* opinionated and any major changes to your system require a prompt. You've been warned.
#!/bin/sh
###
# SOME COMMANDS WILL NOT WORK ON macOS (Sierra or newer)
# For Sierra or newer, see https://github.com/mathiasbynens/dotfiles/blob/master/.macos
###
# Alot of these configs have been taken from the various places
# on the web, most from here
# https://github.com/mathiasbynens/dotfiles/blob/5b3c8418ed42d93af2e647dc9d122f25cc034871/.osx
@smougenot
smougenot / A_Logstash.conf
Created July 26, 2012 13:59
Logstash Multiline Filter for Java Stacktrace (tested on field)
# stacktrace java as one message
multiline {
#type => "all" # no type means for all inputs
pattern => "(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
what => "previous"
}
@jgeurts
jgeurts / install-graphite-ubuntu-12.04.sh
Created July 14, 2012 16:36 — forked from tkoeppen/install-graphite-ubuntu-10.04.sh
Install Graphite 0.9.10 on Ubuntu 12.04
####################################
# BASIC REQUIREMENTS
# http://graphite.wikidot.com/installation
# http://geek.michaelgrace.org/2011/09/how-to-install-graphite-on-ubuntu/
# Last tested & updated 10/13/2011
####################################
cd
sudo apt-get update
sudo apt-get upgrade
@jboner
jboner / latency.txt
Last active April 19, 2025 21:29
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns 14x L1 cache
Mutex lock/unlock 25 ns
Main memory reference 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy 3,000 ns 3 us
Send 1K bytes over 1 Gbps network 10,000 ns 10 us
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD
@methane
methane / gist:2185380
Created March 24, 2012 17:28
Tornado Example: Delegating an blocking task to a worker thread pool from an asynchronous request handler
from time import sleep
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from tornado.web import Application, asynchronous, RequestHandler
from multiprocessing.pool import ThreadPool
_workers = ThreadPool(10)
def run_background(func, callback, args=(), kwds={}):
def _callback(result):