Micha Gorelick mynameisfiber
@mynameisfiber
mynameisfiber / schedule.json
Last active August 29, 2015 13:57
Parse the pycon2014 schedule and create a json blob for great victory
[{"start_time":1397201400,"end_time":1397206800,"title":"Breakfast"},{"start_time":1397206800,"end_time":1397208600,"title":"Opening Statements: Diana Clarke"},{"start_time":1397208600,"end_time":1397211000,"title":"Keynote: John Perry Barlow"},{"start_time":1397211000,"end_time":1397213400,"title":"Break"},{"description":"Why are Python programmers crazy about lists and dictionaries, when\r\nother languages tout bitmaps, linked lists, and B+\u00c2\u00a0trees? Are we missing\r\nout? Come learn how data structures are implemented on bare metal, how\r\nto select the right data structure, how the list and dictionary cover a\r\nwide swath of use cases, and when to dip into the Standard Library or a\r\nthird-party package for an alternative.","title":"All Your Ducks In A Row: Data Structures in the Standard Library and Beyond","track":1,"start_time":1397213400,"speaker":"Brandon Rhodes","end_time":1397215800,"link":"https:\/\/us.pycon.org\/\/2014\/schedule\/presentation\/211\/"},{"description":"Many developers, in
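A minimal sketch of how a consumer might read this blob, assuming only the keys visible in the entries above (`start_time`, `end_time`, `title`, and an optional `speaker`). The inline `blob` string is a two-entry excerpt copied from the file, and `duration_minutes` is a hypothetical helper, not part of the gist.

```python
import json

# Two-entry excerpt copied from the blob above; the real file is much longer.
blob = (
    '[{"start_time":1397201400,"end_time":1397206800,"title":"Breakfast"},'
    '{"start_time":1397206800,"end_time":1397208600,'
    '"title":"Opening Statements: Diana Clarke"}]'
)

events = json.loads(blob)


def duration_minutes(event):
    """Length of an event in whole minutes (timestamps are epoch seconds)."""
    return (event["end_time"] - event["start_time"]) // 60


# Entries without a "speaker" key are breaks, meals, and plenary slots.
talks = [e for e in events if "speaker" in e]
by_start = sorted(events, key=lambda e: e["start_time"])

for event in by_start:
    print("%s (%d min)" % (event["title"], duration_minutes(event)))
```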
@mynameisfiber
mynameisfiber / gist:7451236
Created November 13, 2013 15:46
simple unnormalized kde evaluation
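import numpy as np

def gaussian(x0, sigma):
    # Gaussian kernel centered on x0 with width sigma.
    return lambda x: np.exp(-0.5 * ((x - x0) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def kde(points, sigma=.5):
    # One kernel per data point.
    functions = [gaussian(x0, sigma) for x0 in points]

    def sampler(x):
        # Sum of the kernels; not divided by len(points), hence "unnormalized".
        return sum(f(x) for f in functions)
    return sampler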
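A short usage sketch for the estimator above, repeated here so the block runs on its own. The sample `points` and the evaluation `grid` are made up for illustration. Because each kernel integrates to one and the sum is not divided by the number of points, the area under the curve should come out close to `len(points)`; dividing by that count gives a proper density.

```python
import numpy as np


def gaussian(x0, sigma):
    return lambda x: np.exp(-0.5 * ((x - x0) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def kde(points, sigma=.5):
    functions = [gaussian(x0, sigma) for x0 in points]

    def sampler(x):
        return sum(f(x) for f in functions)
    return sampler


points = [0.0, 1.0, 4.0]                # hypothetical sample
density = kde(points, sigma=0.5)

grid = np.linspace(-10.0, 14.0, 2401)   # hypothetical evaluation grid
values = density(grid)                  # works on arrays via NumPy broadcasting

# Trapezoid rule, written out by hand to avoid depending on a NumPy version.
area = float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

# Dividing by the number of points turns the sum into a normalized density.
normalized = values / len(points)
```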
@mynameisfiber
mynameisfiber / command.js
Last active December 27, 2015 10:29
backtick (http://backtick.io/) command for creating bitly bitmarks
(function () {
    var s = document.createElement("script");
    s.setAttribute("id", "bitmark_js");
    s.setAttribute("type", "text/javascript");
    s.setAttribute("src", "//bitly.com/a/bitmarklet.js");
    (top.document.body || top.document.getElementsByTagName("head")[0]).appendChild(s);
})();
@mynameisfiber
mynameisfiber / gist:6782583
Last active December 24, 2015 10:09
Counting Bloom and Timing Bloom filters using Python arrays and tornado PeriodicCallbacks
#!/usr/bin/env python
import tornado.ioloop
import tornado.testing
import array
import struct
import math
import mmh3
import time
@mynameisfiber
mynameisfiber / gist:6772906
Created October 1, 2013 01:53
Counting Bloom Filter
#!/usr/bin/env python
import array
import struct
import math
import mmh3
class CountingBloomFilter(object):
    def __init__(self, capacity, error=0.005, dtype="B"):
        self.capacity = capacity
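The preview stops at the constructor, so the following is only a sketch of how a counting Bloom filter with that signature could work, not the gist's actual body. It assumes the textbook sizing formulas, an unsigned array typecode, and double hashing. It also swaps `hashlib` in for `mmh3` so the block needs nothing outside the standard library. The class name `CountingBloomSketch` and every method are hypothetical.

```python
import array
import hashlib
import math


class CountingBloomSketch(object):
    def __init__(self, capacity, error=0.005, dtype="B"):
        self.capacity = capacity
        self.error = error
        # Textbook sizing: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        self.num_counters = int(math.ceil(-capacity * math.log(error) / math.log(2) ** 2))
        self.num_hashes = int(math.ceil(math.log(2) * self.num_counters / capacity))
        self.counters = array.array(dtype, [0] * self.num_counters)
        # Largest value a counter can hold (assumes an unsigned typecode).
        self.max_count = 2 ** (8 * self.counters.itemsize) - 1

    def _indexes(self, key):
        # Double hashing: derive k indexes from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_counters

    def add(self, key):
        for idx in self._indexes(key):
            if self.counters[idx] < self.max_count:   # saturate, never overflow
                self.counters[idx] += 1

    def remove(self, key):
        if key not in self:
            return
        for idx in self._indexes(key):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def __contains__(self, key):
        return all(self.counters[idx] > 0 for idx in self._indexes(key))
```

Counters instead of bits are what make `remove` possible. The cost is memory: one byte per slot with `dtype="B"`, and a counter that has saturated can no longer be decremented reliably.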
@mynameisfiber
mynameisfiber / gist:6746047
Created September 28, 2013 20:09
simple twitter archiver
#!/usr/bin/env python2.7
"""
Get a user's timeline and save it into flat JSON files, one file per hour. A
good way of running this would be to set up a cron job that runs every 5
minutes, e.g.:
    ( twitter_archive.py >> timeline.log ) || (echo "twitter_archive failed" | mail -s "twitter archive" me@example.com)
"""
import TwitterAPI
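The "one file per hour" layout described in the docstring could be sketched as below. This is an illustration only: the file-name pattern, the `timestamp` key on each tweet, and both function names are assumptions, and the Twitter API call itself is left out.

```python
import json
import os
import time


def hourly_path(directory, timestamp):
    """Path of the flat file holding tweets for the UTC hour of `timestamp`."""
    stamp = time.strftime("%Y-%m-%d_%H", time.gmtime(timestamp))
    return os.path.join(directory, "timeline_%s.json" % stamp)   # hypothetical name


def append_tweets(directory, tweets):
    """Append each tweet as one JSON line to the file for its hour.

    `tweets` is an iterable of dicts with a numeric 'timestamp' key
    (a hypothetical, already-parsed form of the API response).
    Returns a dict mapping each touched path to the number of lines written.
    """
    written = {}
    for tweet in tweets:
        path = hourly_path(directory, tweet["timestamp"])
        with open(path, "a") as fd:
            fd.write(json.dumps(tweet) + "\n")
        written[path] = written.get(path, 0) + 1
    return written
```

Appending rather than rewriting keeps a run every 5 minutes cheap, though a real archiver would also need to skip tweets it has already stored.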
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Monkey Patch for tornado that profiles through async requests
"""
from tornado import web
from tornado.escape import utf8
import cProfile as profile
import settings
@mynameisfiber
mynameisfiber / quick_kmin_hash.py
Last active December 14, 2015 01:19
This is a quick kmaxhash implementation to test out the usefulness of this data structure for calculating jaccard metrics. It is quick because I wrote it quickly :)
#!/usr/bin/env python2.7
#
# This is a quick kmaxhash implementation to test out the usefulness of this
# data structure for calculating jaccard metrics. It is quick because I wrote
# it quickly :)
# micha gorelick, micha@bit.ly
#
import mmh3
import heapq
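Only the imports are visible here, so this is a sketch of the general idea rather than the gist's code. It uses the bottom-k variant (keep the k smallest hash values), which mirrors the file name; whether the original keeps minima or maxima is not visible. `hashlib` stands in for `mmh3` to stay within the standard library, and all function names are hypothetical.

```python
import hashlib
import heapq


def _hash(item):
    # 64-bit integer hash of a string (stand-in for mmh3).
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


def sketch(items, k):
    """Keep only the k smallest distinct hash values of a collection."""
    return set(heapq.nsmallest(k, {_hash(i) for i in items}))


def jaccard(sketch_a, sketch_b, k):
    """Estimate |A & B| / |A | B| from two bottom-k sketches."""
    # The k smallest values of the merged sketches are the k smallest of the true union.
    union_k = heapq.nsmallest(k, sketch_a | sketch_b)
    if not union_k:
        return 0.0
    shared = sum(1 for h in union_k if h in sketch_a and h in sketch_b)
    return shared / len(union_k)


a = sketch(["apple", "banana", "cherry"], k=10)
b = sketch(["banana", "cherry", "date"], k=10)
estimate = jaccard(a, b, k=10)
```

When k is at least the size of both sets, as in this toy example, the estimate equals the exact Jaccard index; for larger sets it is an approximation whose error shrinks as k grows.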
@mynameisfiber
mynameisfiber / limited_cache.py
Created February 4, 2013 21:11
Simple dictionary caching object that limits the number of cached entries being stored.
"""
Simple dictionary caching object that limits the number of cached entries being
stored.
Micha Gorelick - http://micha.gd/
"""
from collections import OrderedDict
class LimitedCache(OrderedDict):
    def __init__(self, maxsize=250):
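The preview ends at the constructor signature. One way such a size-limited cache could behave is sketched below, assuming that the oldest entry is evicted first and that rewriting a key refreshes its position; the gist's actual eviction policy is not visible. `LimitedCacheSketch` is a hypothetical name.

```python
from collections import OrderedDict


class LimitedCacheSketch(OrderedDict):
    def __init__(self, maxsize=250):
        super().__init__()
        self.maxsize = maxsize

    def __setitem__(self, key, value):
        if key in self:
            # Assumption: rewriting a key counts as fresh use.
            self.move_to_end(key)
        super().__setitem__(key, value)
        # Evict from the front (oldest entries) until within the limit.
        while len(self) > self.maxsize:
            self.popitem(last=False)


cache = LimitedCacheSketch(maxsize=2)
cache["a"] = 1
cache["b"] = 2
cache["c"] = 3   # pushes out "a", the oldest entry
```

Subclassing `OrderedDict` keeps lookups as plain dictionary access, with insertion order doing the bookkeeping that eviction needs.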
@mynameisfiber
mynameisfiber / split_gzip
Last active December 10, 2015 23:48
Split a gzipped, newline-separated file into multiple files by line count.
#!/bin/bash
file="$1";
numlines="$2"
basefile=${file%.gz}
isMore=1
function write_n_lines {
    local c=0;
    while [[ "$c" -lt "$1" ]]; do
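The shell script is cut off inside `write_n_lines`, so its exact behavior is not visible. The same task can be sketched in Python as below, under the assumption that each output file is itself gzipped and named after the input with a numeric suffix; that naming scheme and the `split_gzip` function are illustrative, not taken from the script.

```python
import gzip
import itertools


def split_gzip(path, numlines):
    """Split a gzipped, newline-separated file into gzipped parts of `numlines` lines.

    Returns the list of part file names, in order.
    """
    # Same idea as ${file%.gz} in the script: drop a trailing ".gz".
    base = path[:-3] if path.endswith(".gz") else path
    outputs = []
    # Binary mode keeps the bytes of every line untouched.
    with gzip.open(path, "rb") as src:
        for part in itertools.count():
            chunk = list(itertools.islice(src, numlines))
            if not chunk:
                break
            out = "%s.%04d.gz" % (base, part)   # hypothetical naming scheme
            with gzip.open(out, "wb") as dst:
                dst.writelines(chunk)
            outputs.append(out)
    return outputs
```

Reading `numlines` lines at a time keeps memory bounded by the chunk size rather than the full file, which matters for the large logs this kind of tool is usually pointed at.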