Skip to content

Instantly share code, notes, and snippets.

groups = [{
condition: "OR",
rules: []
}]
state = {}
def match_groups(groups)
"<!DOCTYPE html>\n<html>\n<head>\n\n \n\n <meta charset=\"utf-8\" />\n <meta name=\"land\" content=\"65640;2780\">\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, minimum-scale=1.0, maximum-scale=1.0, user-scalable=no\">\n <style type=\"text/css\">html{font-family:sans-serif;-ms-text-size-adjust:100%;-webkit-text-size-adjust:100%}body{margin:0}article,aside,details,figcaption,figure,footer,header,hgroup,main,menu,nav,section,summary{display:block}audio,canvas,progress,video{display:inline-block;vertical-align:baseline}audio:not([controls]){display:none;height:0}[hidden],template{display:none}a{background-color:transparent}a:active,a:hover{outline:0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:700}dfn{font-style:italic}h1{font-size:2em;margin:.67em 0}mark{background:#ff0;color:#000}small{font-size:80%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-al
@fatum
fatum / es.sh
Created September 16, 2013 06:55
cd ~
sudo apt-get update
sudo apt-get install openjdk-7-jre-headless -y
### Check http://www.elasticsearch.org/download/ for latest version of ElasticSearch and replace wget link below
# NEW WAY / EASY WAY
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.0.deb
sudo dpkg -i elasticsearch-0.90.0.deb
@fatum
fatum / timeout.rb
Created June 23, 2013 12:52 — forked from lpar/timeout.rb
# Runs a specified shell command in a separate thread.
# If it exceeds the given timeout in seconds, kills it.
# Returns any output produced by the command (stdout or stderr) as a String.
# Uses Kernel.select to wait up to the tick length (in seconds) between
# checks on the command's status
#
# If you've got a cleaner way of doing this, I'd be interested to see it.
# If you think you can do it with Ruby's Timeout module, think again.
def run_with_timeout(command, timeout, tick)
output = ''
#!/bin/sh
mkdir /tmp/ruby-build-patch
cd /tmp/ruby-build-patch
# download and patch the ruby sources
wget http://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.9.3-p125.tar.gz
tar xzf ruby-1.9.3-p125.tar.gz
cd ruby-1.9.3-p125
curl https://raw.github.com/wayneeseguin/rvm/master/patches/ruby/1.9.3/p125/gcdata.patch | patch -p1
Linux From Scratch:
http://www.linuxfromscratch.org/lfs/downloads/stable/LFS-BOOK-7.3-NOCHUNKS.html
@fatum
fatum / list.md
Created March 2, 2013 16:05 — forked from pbailis/list.md

A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. While it's biased and rather incomplete but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.

###Dataflow Engines:

Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf

Spark--in memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

@fatum
fatum / nlp-help
Last active October 18, 2018 09:43 — forked from mattb/gist:3888345
Some pointers for Natural Language Processing / Machine Learning
Here are the areas I've been researching, some things I've read and some open source packages...
Nearly all text processing starts by transforming text into vectors:
http://en.wikipedia.org/wiki/Vector_space_model
Often it uses transforms such as TFIDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms):
http://en.wikipedia.org/wiki/Tf%E2%80%93idf
Collocations is a technique to detect when two or more words occur more commonly together than separately (e.g. "wishy-washy" in English) - I use this to group words into n-gram tokens because many NLP techniques consider each word as if it's independent of all the others in a document, ignoring order:
http://matpalm.com/blog/2011/10/22/collocations_1/
@fatum
fatum / install-chef-server
Created September 6, 2012 18:29
Install chef-server (Ubuntu)
echo "deb http://apt.opscode.com/ `lsb_release -cs`-0.10 main" | sudo tee /etc/apt/sources.list.d/opscode.list
sudo mkdir -p /etc/apt/trusted.gpg.d
gpg --keyserver keys.gnupg.net --recv-keys 83EF826A
gpg --export [email protected] | sudo tee /etc/apt/trusted.gpg.d/opscode-keyring.gpg > /dev/null
sudo apt-get update
sudo apt-get install opscode-keyring # permanent upgradeable keyring
sudo apt-get upgrade