Skip to content

Instantly share code, notes, and snippets.

@dsparks
dsparks / distance_matrix.R
Created September 18, 2012 23:15
Calculating distances (including between matrices)
# Cross-matrix distances and different measurement options, with "proxy"
doInstall <- TRUE # Change to FALSE if you don't want packages installed.
toInstall <- c("proxy", "MASS", "Zelig")
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
lapply(toInstall, library, character.only = TRUE)
# Invent two data frames
voterIdealPoints <- data.frame(matrix(rnorm(26*2), ncol = 2))
rownames(voterIdealPoints) <- letters
@bwhite
bwhite / rank_metrics.py
Created September 15, 2012 03:23
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
@jeroenr
jeroenr / WeirdIntegerParser.scala
Last active November 15, 2017 09:26
Scala pattern matching with regex: parsing weird integer strings
val MaxInt = """(inf)""".r
val NormalInt = """(\d*)""".r
def parseInt(integerString:String): Int = {
integerString match {
case MaxInt(_) => Integer.MAX_VALUE
case NormalInt(_) => Integer.valueOf(integerString)
case _ => throw new NumberFormatException(String.format("Cannot parse %s as an integer", integerString))
}
}
@NorthIsUp
NorthIsUp / spawn.py
Created July 5, 2012 12:46
gevent spawn helpers
"""
realertime.lib.spawn
~~~~~~~~~~~~~~~~~~~~
:author: Adam Hitchcock
:copyright: (c) 2012 DISQUS.
:license: Apache License 2.0, see LICENSE for more details.
"""
from __future__ import absolute_import
@darkseed
darkseed / apache-logs-hive.sql
Created June 1, 2012 21:00 — forked from emk/apache-logs-hive.sql
Apache log analysis with Hadoop, Hive and HBase
-- This is a Hive program. Hive is an SQL-like language that compiles
-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
-- Facebook, because it allows them to query enormous Hadoop data
-- stores using a language much like SQL.
-- Our logs are stored on the Hadoop Distributed File System, in the
-- directory /logs/randomhacks.net/access. They're ordinary Apache
-- logs in *.gz format.
--
-- We want to pretend that these gzipped log files are a database table,
@oodavid
oodavid / README.md
Created March 26, 2012 17:05
Backup MySQL to Amazon S3

Backup MySQL to Amazon S3

This is a simple way to backup your MySQL tables to Amazon S3 for a nightly backup - this is all to be done on your server :-)

Sister Document - Restore MySQL from Amazon S3 - read that next

1 - Install s3cmd

this is for Centos 5.6, see http://s3tools.org/repositories for other systems like ubuntu etc

@darkseed
darkseed / apache-logs-hive.sql
Created March 12, 2012 00:04 — forked from emk/apache-logs-hive.sql
Apache log analysis with Hadoop, Hive and HBase
-- This is a Hive program. Hive is an SQL-like language that compiles
-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
-- Facebook, because it allows them to query enormous Hadoop data
-- stores using a language much like SQL.
-- Our logs are stored on the Hadoop Distributed File System, in the
-- directory /logs/randomhacks.net/access. They're ordinary Apache
-- logs in *.gz format.
--
-- We want to pretend that these gzipped log files are a database table,
@samvermette
samvermette / gist:1691280
Created January 27, 2012 22:27
MapKit callout bubble pop animation
self.view.layer.anchorPoint = CGPointMake(0.50, 1.0);
CAKeyframeAnimation *bounceAnimation = [CAKeyframeAnimation animationWithKeyPath:@"transform.scale"];
bounceAnimation.values = [NSArray arrayWithObjects:
[NSNumber numberWithFloat:0.05],
[NSNumber numberWithFloat:1.08],
[NSNumber numberWithFloat:0.92],
[NSNumber numberWithFloat:1.0],
nil];
@chrishamant
chrishamant / s3_multipart_upload.py
Created January 3, 2012 19:29
Example of Parallelized Multipart upload using boto
#!/usr/bin/env python
"""Split large file into multiple pieces for upload to S3.
S3 only supports 5Gb files for uploading directly, so for larger CloudBioLinux
box images we need to use boto's multipart file support.
This parallelizes the task over available cores using multiprocessing.
Usage:
s3_multipart_upload.py <file_to_transfer> <bucket_name> [<s3_key_name>]
@michaelaguiar
michaelaguiar / d3donut.js
Created December 17, 2011 00:37
d3.js donut chart test
var donutVal = 85;
var donutFull = 100 - donutVal;
var d3_category_socialmedia = ["#0054a6", "#dbdbdb"];
if(donutVal < 50) {
donutVal = -donutVal;
donutFull = -donutFull;