frequently asked question:
Q: I would like to ask your advice about preparing for a role in data science
A:
my advice would be to put together a portfolio of projects, on GitHub, evidencing that you know how to
some relationship rules i believe in
- don't call a meeting w/o an agenda
- don't go to a meeting w/o an agenda (h/t aric hagberg)
- don't call a meeting w/o making clear to everyone what you hope to gain, and be honest.
- don't go to a meeting w/o at least a hypothesis as to what is the interest of everyone attending
there's probably more consensus building in academia than in
the real world. particularly since some of the participants are
tenured, you have to get along with them for years, whereas in
the real world you can just fire them or they'll just find a new
job. tenure makes the market for faculty extremely illiquid.
background
outline
data science
practices (managerial)
reframing questions as ML
better wrong than "nice"
better science:
scribd URL: http://www.scribd.com/doc/224608514/The-Full-New-York-Times-Innovation-Report
0 (cover)
1-2 executive summary
- (general)
3-5 introduction
- NYT "is winning at journalism"
- "falling behind in...the art and science of getting our journalism to readers"
4 {graphic} vast print & digital audience
frequently asked question:
Q: I would like to ask your advice about preparing for a role in data science
A:
my advice would be to put together a portfolio of projects, on GitHub,
evidencing that you know how to
- get data (e.g., via wget/curl)
wiggins@tantanmen{algorithms}132: lynx -dump -nolist -nobold -nocolor -noreverse https://github.com/ledeprogram/courses/tree/master/algorithms | /usr/bin/perl -pe 's/[^[:ascii:]]/+/g' | tr ',:; /\. ( ) ?-"#[0-9]' '\n' | tr '[:upper:]' '[:lower:]' | grep '[a-z]' | sort -bfd | uniq -c | sort -nr | grep -v '^ 1 '
25 of
20 literacy
17 to
17 in
16 o
16 data
16 a
15 algorithms
14 the
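a rough Python counterpart to the pipeline above, as a sketch only. it assumes the page text is already in a string (fetching and lynx-style rendering are left out), and it splits on anything that is not an ASCII letter, which is simpler than the tr character set used in the shell command, so counts will not match byte for byte. the names word_counts, min_count and sample are made up for illustration.

```python
import re
from collections import Counter


def word_counts(text, min_count=2):
    """Return (word, count) pairs, most frequent first, dropping rare words."""
    # Lowercase first, then keep runs of ASCII letters as tokens.
    # This stands in for the tr/grep steps of the shell pipeline.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    # most_common() sorts by descending count, like `sort -nr` after `uniq -c`;
    # the min_count filter plays the role of dropping words seen only once.
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


# Hypothetical sample text, not taken from the course page.
sample = "Data literacy and algorithms: literacy of data, data in algorithms."

for word, n in word_counts(sample):
    print(f"{n:4d} {word}")
```

swapping sample for the saved page text should give a list shaped like the one above, though single-letter debris such as "o" depends on how the text was split.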
BuzzFeed has technology at its core.
Its 100+ person tech team has created world-class systems for
analytics,
advertising, and
content management.
Engineers are 1st class citizens.
Everything is built for mobile devices from the outset.
Internet native formats like
lists,
tweets,
The Bayesian approach to model selection is a subject you'll
like. The basic idea is to compute the "Bayes Factor":
http://en.wikipedia.org/wiki/Bayes_factor .
As the page says "Bayesian inference has been put forward as a
theoretical justification for and generalization of Occam's
razor".
( http://en.wikipedia.org/wiki/Occam%27s_razor )
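To make the Occam's razor point concrete, here is a minimal sketch of a Bayes factor for a toy problem that is an assumption of this example, not part of the note: a sequence of n coin flips with k heads, comparing M1 (the coin is exactly fair) against M2 (unknown bias with a uniform prior). The Bayes factor is the ratio of the two marginal likelihoods; the function names are made up for illustration.

```python
from math import exp, lgamma, log


def log_marginal_fair(n):
    """Log evidence under M1: a fair coin gives each n-flip sequence 0.5**n."""
    return n * log(0.5)


def log_marginal_uniform(k, n):
    """Log evidence under M2: bias p ~ Uniform(0, 1).

    The integral of p**k * (1 - p)**(n - k) over [0, 1] is the Beta function
    B(k + 1, n - k + 1) = k! (n - k)! / (n + 1)!, computed here via lgamma.
    """
    return lgamma(k + 1) + lgamma(n - k + 1) - lgamma(n + 2)


def bayes_factor(k, n):
    """Bayes factor of M2 (unknown bias) over M1 (fair coin)."""
    return exp(log_marginal_uniform(k, n) - log_marginal_fair(n))


# Balanced data versus lopsided data, both with n = 10 flips.
print(bayes_factor(5, 10))
print(bayes_factor(9, 10))
```

With 5 heads in 10 flips the factor comes out below 1, so the simpler fair-coin model is preferred even though the flexible model can fit the data equally well; with 9 heads in 10 it comes out above 1. That automatic penalty on the more flexible model is the sense in which the Bayes factor generalizes Occam's razor.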
The Bayes factor can be approximated under some assumptions,
leading to a simple penalized maximum likelihood called the
learning mixtures of ranking models
consistency of spectral partitioning of uniform hypergraphs under
optimal rates for $k$-nn density and mode estimation
bayesian inference for structured spike and slab priors
grouping-based low-rank video completion and 3d reconstruction
tightening after relax: minimax-optimal sparse pca in polynomial
belief propagation recursive neural networks
communication efficient distributed machine learning with the
on the statistical consistency of plug-in classifiers for
distributed context-aware bayesian posterior sampling via