Homer Strong strongh

@tobym
tobym / pwdx_for_mac.bash
Created October 27, 2010 01:03
pwdx for mac. Usage: pwdx pid
function pwdx {
  lsof -a -p "$1" -d cwd -n | tail -1 | awk '{print $NF}'
}
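The pipeline above keeps the last line of the `lsof` output and prints its final whitespace-separated field, which is the process's working directory. A minimal Python sketch of that parsing step follows, run on a made-up sample of `lsof` output. The helper name `cwd_from_lsof` and the sample text are illustrative assumptions, not part of the gist.

```python
def cwd_from_lsof(output: str) -> str:
    """Return the last whitespace-separated field of the last non-empty line.

    Mirrors `tail -1 | awk '{print $NF}'`. Like the shell version, it
    truncates paths that contain spaces.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return ""
    return lines[-1].split()[-1]


# Hypothetical lsof output, for illustration only.
sample = (
    "COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME\n"
    "bash    1234 alice  cwd    DIR    1,4      512 123456 /Users/alice/project\n"
)
print(cwd_from_lsof(sample))
```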
@strongh
strongh / buggy.R
Created November 19, 2010 02:14
Some debugging demos for the Portland R User Group meeting
##' Portland R User Group
##'
##' Debugging in R:
##' A Worst-Case Scenario Survival guide
##' Homer Strong, Qmedtrix
##' > Debugging is twice as hard as
##' > writing the code in the first place.
##' > Therefore, if you write the
##' > code as cleverly as possible,
@dzhou
dzhou / amzn_scraper.py
Created May 8, 2012 03:45
amazon review scraper
#!/usr/bin/env python
import urllib
import pprint
import amazonproduct
from BeautifulSoup import BeautifulSoup
from review import db
AWS_KEY = 'YOUR_AWS_KEY'
SECRET_KEY = 'YOUR_AWS_SECRET_KEY'
API_PAGE_LIMIT = 10
@ohpauleez
ohpauleez / sample.clj
Last active January 25, 2017 19:33
The Leap Motion example code using the clojure-leap library
(ns sample
  (:require [clojure-leap.core :as leap]
            [clojure-leap.hand :as l-hand]
            [clojure-leap.pointable :as l-pointable :refer [tip-position]]))
(defn process-frame [frame]
  (let [_ (println "Frame id:" (.id frame) "timestamp:" (.timestamp frame)
                   "hands:" (leap/hands frame) "fingers:" (leap/fingers frame) "tools:" (leap/tools frame))]
    (when-let [hand (and (leap/hands? frame) (leap/hand frame 0))]
      (let [fingers (leap/fingers hand)
@quintona
quintona / OuterJoinReducer
Created May 11, 2013 03:25
A way to do an outer join of two separate streams in Storm Trident: use an outer join reducer. Here is the code for the reducer and its associated state. Simply call topology.multiReduce(s1, s2, function, outputFields).
package storm.cookbook.tfidf.functions;
import java.util.Map;
import storm.trident.operation.MultiReducer;
import storm.trident.operation.TridentCollector;
import storm.trident.operation.TridentMultiReducerContext;
import storm.trident.tuple.TridentTuple;
import backtype.storm.tuple.Values;
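The description above names the technique, but the preview stops at the imports. To illustrate what an outer join of two keyed streams produces, here is a minimal in-memory Python sketch. It is not the gist's Trident `MultiReducer`; the function name `outer_join` and the sample records are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple


def outer_join(
    left: Iterable[Tuple[Hashable, object]],
    right: Iterable[Tuple[Hashable, object]],
) -> Dict[Hashable, Tuple[Optional[object], Optional[object]]]:
    """Full outer join of two (key, value) streams.

    Every key seen in either stream appears in the result; a side that
    has no record for a key is filled with None. If a key repeats within
    one stream, the last value for that key wins.
    """
    left_by_key = dict(left)
    right_by_key = dict(right)
    joined = {}
    # Keys from the left stream first, then keys found only on the right.
    for key in left_by_key:
        joined[key] = (left_by_key[key], right_by_key.get(key))
    for key in right_by_key:
        if key not in left_by_key:
            joined[key] = (None, right_by_key[key])
    return joined


# Hypothetical sample records, for illustration only.
s1 = [("a", 1), ("b", 2)]
s2 = [("b", 20), ("c", 30)]
print(outer_join(s1, s2))
```

A streaming implementation would keep per-batch state rather than materialising both inputs, which is presumably the role of the state object mentioned in the description.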
@mrflip
mrflip / tuning_storm_trident.asciidoc
Last active October 8, 2024 15:18
Notes on Storm+Trident tuning

Tuning Storm+Trident

Tuning a dataflow system is easy:

The First Rule of Dataflow Tuning:
* Ensure each stage is always ready to accept records, and
* Deliver each processed record promptly to its destination
@jackrusher
jackrusher / blue-ball.clj
Last active December 20, 2015 07:49
Interacting with Leap Motion using Clojure from within emacs.
(ns blue-ball
  (:use [seesaw core font graphics])
  (:require [clojure-leap.core :as leap]
            [clojure-leap.screen :as l-screen]))
;; these atoms contain the current x/y state from the Leap
(def x (atom 10))
(def y (atom 10))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Build the window/canvas
@jkeirstead
jkeirstead / gp-predict.stan
Last active April 18, 2018 01:34
A demo of Gaussian processes using RStan
// Predict from Gaussian Process
// All data parameters must be passed as a list to the Stan call
// Based on original file from https://code.google.com/p/stan/source/browse/src/models/misc/gaussian-process/
data {
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  int<lower=1> N2;
  vector[N2] x2;
@hadley
hadley / advise.md
Created February 13, 2015 21:32
Advice for teaching an R workshop

I think the two most important messages that people can get from a short course are:

a) the material is important and worthwhile to learn (even if it's challenging), and b) it's possible to learn it!

For those reasons, I usually start by diving as quickly as possible into visualisation. I think it's a bad idea to start by explicitly teaching programming concepts (like data structures), because the payoff isn't obvious. If you start with visualisation, the payoff is really obvious and people are more motivated to push past any initial teething problems. In stat405, I used to start with some very basic templates that got people up and running with scatterplots and histograms - they wouldn't necessarily understand the code, but they'd know which bits could be varied for different effects.

Apart from visualisation, I think the two most important topics to cover are tidy data (i.e. http://www.jstatsoft.org/v59/i10/ + tidyr) and data manipulation (dplyr). These are both important for when people go off and apply

---------- Forwarded message ----------
From: chris wiggins <chris.wiggins@[YYY].edu>
Date: Wed, Aug 1, 2012 at 7:26 PM
Subject: stats history
To: hadley@[XXX].edu
Cc: chris wiggins <chris.wiggins@[YYY].edu>
Dear Hadley: