I hereby claim:
- I am ryanwitt on github.
- I am onecreativenerd (https://keybase.io/onecreativenerd) on keybase.
- I have a public key ASB_TkeaXoaqMw5ii1FGzwCwYblooenmt-s59k24W87OZAo
To claim this, I am signing this object:
import redis  # assumed dependency; the gist is truncated before any imports

class RedisTools:
    '''
    A set of utility tools for interacting with a redis cache
    '''
    def __init__(self):
        self._queues = ["default", "high", "low", "failed"]
        self.get_redis_connection()

    def get_redis_connection(self):
        # The gist cuts off here; a minimal sketch of a typical
        # connection setup, not the author's original code:
        self._connection = redis.StrictRedis(host='localhost', port=6379, db=0)
        return self._connection
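A quick usage sketch, assuming the connection setup above and that the queues live as redis lists under their plain names (both assumptions, not confirmed by the gist):

import redis

tools = RedisTools()
conn = tools.get_redis_connection()
for q in tools._queues:
    print(q, conn.llen(q))  # llen returns 0 when a key does not exist yet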
#!/bin/sh
VERSION=0.12.2
PLATFORM=linux
ARCH=x64
PREFIX=/usr/local

mkdir -p "$PREFIX" && \
curl -fSL "https://nodejs.org/dist/v$VERSION/node-v$VERSION-$PLATFORM-$ARCH.tar.gz" \
  | tar xzvf - --strip-components=1 -C "$PREFIX"
def collect_ranges(s):
    """
    Returns a generator of tuples of consecutive numbers found in the input.

    >>> list(collect_ranges([]))
    []
    >>> list(collect_ranges([1]))
    [(1, 1)]
    >>> list(collect_ranges([1,2,3]))
    [(1, 3)]
    """
    # The gist is truncated after the doctests; an assumed completion
    # that satisfies the examples above:
    start = prev = None
    for n in s:
        if start is None or n != prev + 1:
            if start is not None:
                yield (start, prev)
            start = n
        prev = n
    if start is not None:
        yield (start, prev)
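The doctests double as a small test suite; with the completion sketched above, they can be run directly:

if __name__ == '__main__':
    import doctest
    doctest.testmod()  # exercises the examples embedded in the docstring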
//
// cpuse.js - simple continuous cpu monitor for node
//
// Intended for programs wanting to monitor and take action on overall CPU load.
//
// The monitor starts as soon as you require the module, then you can query it at
// any later time for the average cpu:
//
// > var cpuse = require('cpuse');
// > cpuse.averages();
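cpuse itself is a node module, but the sample-and-average idea is easy to mirror elsewhere; here is a minimal Python sketch of the same technique, assuming the psutil package (not part of cpuse):

import threading
import time

import psutil

class CpuMonitor:
    def __init__(self, interval=1.0):
        self.samples = []
        self.interval = interval
        # start sampling as soon as the monitor is created, like cpuse
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        psutil.cpu_percent(None)  # prime the counter; the first call is meaningless
        while True:
            time.sleep(self.interval)
            self.samples.append(psutil.cpu_percent(None))

    def average(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0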
// Check mongodb working set size (Mongo 2.4+).
// Paste this into mongo console, get back size in GB
db.runCommand({
    serverStatus: 1, workingSet: 1, metrics: 0, locks: 0
}).workingSet.pagesInMemory * 4096 / (Math.pow(2,30));
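The same check can be scripted outside the mongo shell; a sketch using pymongo (assumed dependency), mirroring the console one-liner above on a Mongo 2.4+ server:

from pymongo import MongoClient
from bson.son import SON

client = MongoClient()  # assumes a local mongod
ws = client.admin.command(SON([("serverStatus", 1), ("workingSet", 1)]))["workingSet"]
print(ws["pagesInMemory"] * 4096 / 2.0 ** 30, "GB")  # pages are 4 KiB each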
You need 7zip installed to grab the NPI database (brew install p7zip on OS X).

To create the index, run the init_* scripts. You'll need the doctor referral graph data to use *_refer.*, but the NPI database will be downloaded for you automatically. Indexing happens on all cores and takes less than 10 minutes on my 8-core machine.

To grab lines matching a search term, use python search_npi.py term.

Note: index performance is good if you have a lot of memory, since index file blocks stay hot in the page cache. However, the blocks are loaded each time the program is run, which is super inefficient. It should use an on-disk hashtable where the offsets can be calculated instead, as sketched below.
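To make the note concrete, here is a sketch of that on-disk hashtable idea: with fixed-size records, a key's offset can be computed directly, so nothing needs to be loaded at startup. The record layout, file name, and bucket count are hypothetical, and collision handling is omitted:

import mmap
import zlib

RECORD_SIZE = 64      # hypothetical fixed record width
NBUCKETS = 1 << 20    # hypothetical bucket count

def lookup(path, key):
    with open(path, 'rb') as f:
        data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        # offset is computed from the key, so no in-memory index is needed
        offset = (zlib.crc32(key.encode()) % NBUCKETS) * RECORD_SIZE
        return data[offset:offset + RECORD_SIZE].rstrip(b'\x00')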
froms = {}
tos = {}
for i, line in enumerate(open('refer.2011.csv')):
    try:
        fr, to, count = line.strip().split(',')
        froms[fr] = froms.get(fr, 0) + 1  # out-degree per referrer (count column unused)
        tos[to] = tos.get(to, 0) + 1      # in-degree per referee
    except Exception:
        import traceback; traceback.print_exc()
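For a quick look at the heaviest referrers, the dictionaries above can be fed to collections.Counter; this snippet is an illustration, not part of the original gist:

from collections import Counter

for npi, out_degree in Counter(froms).most_common(10):
    print(npi, out_degree)  # NPIs that appear most often as referrers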
import random
import matplotlib.pyplot as plt

k = 1000
array = []
for n, x in enumerate([range(k)[random.randrange(k)] for x in range(100000)]):
    if n < k:
        # fill the reservoir with the first k items
        array.append(x)
    else:
        if random.random() < k/float(n):
            # the gist is truncated here; the assumed completion replaces a
            # random reservoir slot, per standard reservoir sampling
            array[random.randrange(k)] = x

plt.hist(array, bins=50)
plt.show()
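If the replacement step is unbiased, the histogram should come out roughly flat across the k values, which makes the plot a quick visual check that the reservoir sample is uniform.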