chris wiggins chrishwiggins

- tukey's 1962 paper on the tension between
mathematical statistics and applied computational statistics
http://web.stanford.edu/~gavish/documents/Tukey_the_future_of_data_analysis.pdf
- william cleveland's 2001 "data science" paper
http://www.datascienceassn.org/sites/default/files/Data%20Science%20An%20Action%20Plan%20for%20Expanding%20the%20Technical%20Areas%20of%20the%20Field%20of%20Statistics.pdf
- interview w/leo breiman, heretical statistician
http://projecteuclid.org/euclid.ss/1009213290

Q: I want to sign up for 3900 (supervised research). How many
credits will you give me?
A: If you want to take 3900 with me, we need to come to a
contract, and this contract needs to be closed before the start
of the semester. The contract will stipulate:
- Who is the scientific advisor (if not me)
- What is the deliverable (e.g., technical report, oral report)

Q: what book should i use to learn ML?
A: use several, and find the one that speaks to you.
the list below assumes you know a bit of math but
are not very mathematical, and are interested in learning
enough to be practical. that is, it is not at the
mathematical level of MIJ's alleged list
(cf. https://news.ycombinator.com/item?id=1055389 )

For current information please
- see http://modelingsocialdata.org/ and
- follow @CUSocialData ( https://twitter.com/CUSocialData )
official bulletin URL:
http://www.columbia.edu/cu/bulletin/uwb/subj/APMA/E4990-20151-001/

FAQ:
where are some fun datasets to play with?
1. CMU:
http://lib.stat.cmu.edu/datasets/
2. UCI:
a) MLR@UCI (machine learning repository / machine learning archive)

Q: what are "single tree-based" (as opposed to forest-based) supervised learning methods?
A: some of my favorites:
- ADT
+ wiki: http://en.wikipedia.org/wiki/Alternating_decision_tree
+ ref: http://perun.pmf.uns.ac.rs/radovanovic/dmsem/cd/install/Weka/doc/classifiers-papers/trees/ADTree/atrees.pdf
- rpart in R
+ http://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf

nice NPR story illustrating a conceptual and methodological
difference between AI and ML, using some of the more
press-grabbing, (human) game-beating systems:
http://www.npr.org/blogs/alltechconsidered/2015/01/08/375736513/look-out-this-poker-playing-computer-is-unbeatable
this story's pretty interesting in general but one particular
part grabs my attention:
Oren Etzioni, the head of Seattle's Allen Institute for

---------- Forwarded message ----------
From: chris wiggins <chris.wiggins@[YYY].edu>
Date: Wed, Aug 1, 2012 at 7:26 PM
Subject: stats history
To: hadley@[XXX].edu
Cc: chris wiggins <chris.wiggins@[YYY].edu>
Dear Hadley: