Skip to content

Instantly share code, notes, and snippets.

@kogakure
kogakure / .gitignore
Last active December 17, 2023 08:21
Git: .gitignore file for LaTeX projects
*.acn
*.acr
*.alg
*.aux
*.bak
*.bbl
*.bcf
*.blg
*.brf
*.bst
@piscisaureus
piscisaureus / pr.md
Created August 13, 2012 16:12
Checkout github pull requests locally

Locate the section for your github remote in the .git/config file. It looks like this:

[remote "origin"]
	fetch = +refs/heads/*:refs/remotes/origin/*
	url = [email protected]:joyent/node.git

Now add the line fetch = +refs/pull/*/head:refs/remotes/origin/pr/* to this section. Obviously, change the github url to match your project's URL. It ends up looking like this:

@Mithrandir0x
Mithrandir0x / gist:3639232
Created September 5, 2012 16:15
Difference between Service, Factory and Provider in AngularJS
// Source: https://groups.google.com/forum/#!topic/angular/hVrkvaHGOfc
// jsFiddle: http://jsfiddle.net/pkozlowski_opensource/PxdSP/14/
// author: Pawel Kozlowski
var myApp = angular.module('myApp', []);
//service style, probably the simplest one
myApp.service('helloWorldFromService', function() {
this.sayHello = function() {
return "Hello, World!"
@philz
philz / unixtime.sh
Created November 30, 2012 17:59
seconds since epoch to readable
# $unixtime 1354232717
# local Thu 29 Nov 2012 03:45:17 PM PST utc Thu 29 Nov 2012 11:45:17 PM GMT
unixtime ()
{
gawk "BEGIN { print \"local\", strftime("'"'"%c"'"'", $1), \"utc\", strftime("'"'"%c"'"'", $1, 1) ; }"
}
# Alternately,
# $date -d @1354298146
# Fri Nov 30 09:55:46 PST 2012
@pbailis
pbailis / list.md
Last active April 15, 2018 08:54
Quick and dirty (incomplete) list of interesting, mostly recent data warehousing/"big data" papers

A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. It's biased and rather incomplete but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.

###Dataflow Engines:

Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf

Spark--in memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

@miketheman
miketheman / zook_grow.md
Created July 22, 2013 21:36
Adding nodes to a ZooKeeper ensemble

Adding 2 nodes to an existing 3-node ZooKeeper ensemble without losing the Quorum

Since many deployments may start out with 3 nodes and so little is known about how to grow a cluster from 3 memebrs to 5 members without losing the existing Quorum, here is an example of how this might be achieved.

In this example, all 5 nodes will be running on the same Vagrant host for the purpose of illustration, running on distinct configurations (ports and data directories) without the actual load of clients.

YMMV. Caveat usufructuarius.

Step 1: Have a healthy 3-node ensemble

@diegopacheco
diegopacheco / maven3-scala-plugin.txt
Created October 16, 2013 14:48
Scala Maven Plugin Config
Inside your parent pom.xml:
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
</plugin>
@jdmaturen
jdmaturen / vacuum.py
Last active December 28, 2015 02:09
Vacuum up Crunchbase
"""
Get a bunch of Crunchbase data, but respect the API limits.
Author JD Maturen
Apache 2 License
"""
import logging
from random import random
import sys
@pbailis
pbailis / reproducibility.md
Last active February 15, 2016 12:18
Reproducing (un)reproducibility results

edit: see http://cs.brown.edu/~sk/Memos/Examining-Reproducibility/

Not deserving of a full post, but nonetheless worth writing about: @ongardie, @aalevy, and a few others on Twitter were surprised by the number of papers that were flagged as "not reproducible" according to the recent study at http://reproducibility.cs.arizona.edu. Digging deeper, it appeared that 1.) "code builds" is the standard for reproducibility in this study and that 2.) many broken builds were the result of missing dependencies on the researchers' systems.

I tried reproducing a few of the authors' "unreproducible" results. It's hard to vet 600+ research code repositories, but, with a little effort (< ~10 minutes each?), I was able to get all of the following to actually build (on Ubuntu 13.10). This doesn't inspire confidence in the reproducibility of the study results.

Peter [email protected]