FAQ-preparing-for-a-role-in-DS.md

frequently asked question:

Q: I would like to ask your advice about preparing for a role in data science.

My advice would be to put together a portfolio of projects on GitHub, evidencing that you know how to

- get data (e.g., via wget/curl);
- scrub data (wisely choose and reproducibly remove "outliers");
- model using a variety of approaches (supervised, unsupervised, exploratory) in Python or possibly R (usually an employer will prefer one or the other, with more and more employers, in my experience, preferring Python; in the Data Science Group at NYT it's helpful to know your way around SQL and scikit-learn. We don't do much in R, and nothing in SAS, SPSS, MATLAB, Mathematica, or... );
- write a coherent description of what you learned, and what this implies for the stakeholder/collaborator/world;

as well as

how you chose the approach you took, what assumptions you made on the way, what the weaknesses in your approach are, and what the next steps are.
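
To make the "reproducibly remove outliers" step concrete, here is a minimal sketch of how a portfolio project might do it. It assumes Tukey's 1.5 × IQR rule as the criterion, chosen here purely for illustration; the right rule depends on the data. The function names `iqr_fences` and `scrub` and the sample readings are hypothetical.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def scrub(values, k=1.5):
    """Split values into (kept, removed) and report the fences used.

    Returning the removed points and the fences, instead of silently
    dropping rows, lets a reader audit and rerun the decision.
    """
    low, high = iqr_fences(values, k)
    kept = [v for v in values if low <= v <= high]
    removed = [v for v in values if not (low <= v <= high)]
    return kept, removed, (low, high)


if __name__ == "__main__":
    # Hypothetical measurements with one suspicious reading.
    readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0, 9.7]
    kept, removed, fences = scrub(readings)
    print(f"fences: {fences}")
    print(f"kept {len(kept)} values, removed {removed}")
```

Keeping the rule in code, with the threshold `k` as an explicit parameter, also gives you something specific to discuss when you write up your assumptions and the weaknesses of your approach.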

Update 1: Also consider getting your hands on some fun data to play with. Definition of "fun" is highly personal, so I list several sets which might be of interest: https://gist.github.com/chrishwiggins/84a6319246a7b8f547c4

Update 2: Also consider taking a class (cf. http://datascience.columbia.edu/data-science-academics)

Update 3: Also consider enrolling in a "data science boot camp", e.g., http://insightdatascience.com/

For more info:

chrishwiggins/FAQ-preparing-for-a-role-in-DS.md