Want to speak with me? I'm jason at jxnl.co
# First things first, Fizzbuzz
for i in range(1, 101): print("Fizz" * (not i % 3) + "Buzz" * (not i % 5) or i)
# Or if you want... Simpson's Rule
def simpson(a, b, f, N):
    return (1.0 / 3.0) * (2 * (((b - a) / N) * sum(f(v) for v in [a + i * ((b - a) \
        / (2. * N)) for i in range(2 * N + 1)][1:2 * N + 1:2])) + (((b - a) / N) * \
        ((f(a) + f(b)) / 2.0 + sum(f(v * ((b - a) / N) + a) for v in range(1, N)))
    ))
I swear I write good code.
Hack the North -- Data Scientist
Sole data scientist working on the Hack the North team.
- Exploratory data analysis and visualization to study the interests of hackathon participants and improve upcoming events.
- Designed experiments on various biases that might exist in the application/judging process.
- Created reports and summary statistics of the event and its participants and gave suggestions to the directors.
NYU Global Institute of Public Health -- Research Intern (Current Position)
Supervisor: Dr. Rumi Chunara
Currently working on a paper discussing the use of machine learning models to find patterns of alcohol abuse on social media.
- Exploratory analysis using Python and Gensim for topic modeling.
- Built complex preprocessing and ingestion pipeline for machine learning with scikit-learn and Gensim.
- Developed informative IPython notebooks that outline and document the body of work produced during the research project.
- Using Amazon Mechanical Turk and crowdsourcing techniques to develop training data from the raw Twitter firehose.
Sysomos -- Data Scientist
- Prototyped a proof-of-concept advertisement recommendation platform for offline targeted audience generation.
- Developed extensible interfaces to our community detection and k-armed bandit service layer.
- Improved clustering performance for Twitter community detection on Sysomos MAP and Heartbeat.
- Created MapReduce and Spark applications for audience generation and various ad-hoc ETL.
University of Waterloo -- 3B Honors B.Math, transferred from Mathematical Physics; Computational Mathematics, Statistics Minor, Co-op
- Applied Probability & Statistics, Linear Modeling, Data Visualization, Data Structures, Linear Algebra, Real Analysis, Computational Physics.
MOOCS (Coursera)
- Machine Learning, from Andrew Ng
- Data Science, from Bill Howe
- Natural Language Processing, from Dan Jurafsky
- Probabilistic Graphical Models, from Daphne Koller
[Mark Sweep](http://devpost.com/software/mark-sweep)
Mark Sweep is a collection of services for automating Facebook group moderation. The prototype is currently being rewritten into a service-oriented design for reusability in other applications.
- Won "Best use of Machine Learning" and placed Top 20 at PennAppsX.
- Weighted reservoir sampling was used for group-based topic classification as a novel way to capture recency.
- Topic/troll detection done using sklearn's SVC with bag-of-words features and watchwords.
- Spam classification using hand-engineered features.
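
For readers unfamiliar with weighted reservoir sampling, here is a minimal sketch of one standard variant (the Efraimidis-Spirakis A-Res scheme). It is not the Mark Sweep code; the function name `weighted_reservoir_sample` and the recency weighting `1.05 ** t` are assumptions made only for illustration. Giving newer posts larger weights makes them more likely to stay in the sample, which is one way a sample can favor recency.

```python
import heapq
import random


def weighted_reservoir_sample(items, k, rng=None):
    """Sample up to k items from an iterable of (item, weight) pairs.

    A-Res: each item draws key = u ** (1 / weight) with u uniform in [0, 1);
    the k items with the largest keys form the sample. Weights must be > 0.
    """
    if rng is None:
        rng = random.Random()
    heap = []  # min-heap of (key, index, item); the smallest key is evicted first
    for index, (item, weight) in enumerate(items):
        if weight <= 0:
            raise ValueError("weights must be positive")
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)  # index breaks ties, so items are never compared
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


# Hypothetical stream: post t gets weight 1.05 ** t, so later posts weigh more.
posts = [("post-%d" % t, 1.05 ** t) for t in range(200)]
sample = weighted_reservoir_sample(posts, 10, rng=random.Random(0))
```

The stream is read once and only `k` entries are held in memory, which is what makes the approach suitable for a feed of unknown length.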
- Provided various data services including web crawling, data analysis, and consulting.
- Maintains 70% paid client work and 30% pro bono nonprofit work.
- Itpy -- Lazily evaluated list processing with chained transformations such as streaming variance, groupbys, map, filter and more.
- Reservoir -- Python module for uniform, exponential, and weighted reservoir sampling.
- Bandit -- Java implementation of various bandit algorithms.
- Data mining Canadian Government press releases (2002-2015) to uncover potential data journalism stories.
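
To give a sense of what lazily evaluated, chained list processing with a streaming variance can look like, here is a small sketch in plain Python. It does not use or reproduce Itpy's API; `running_variance` is a hypothetical helper written for this example, built on Welford's online algorithm and the lazy built-ins `filter` and `map`.

```python
def running_variance(values):
    """Lazily yield the sample variance after each value (Welford's method)."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # accumulates the sum of squared deviations
        # Sample variance is undefined for one value; report 0.0 in that case.
        yield m2 / (count - 1) if count > 1 else 0.0


# Chain lazy stages: nothing is computed until the result is consumed.
evens = filter(lambda n: n % 2 == 0, range(1, 11))  # 2, 4, 6, 8, 10
squares = map(lambda n: n * n, evens)               # 4, 16, 36, 64, 100
variances = list(running_variance(squares))         # one value per input item
```

Each stage pulls one item at a time from the stage before it, so the whole chain runs in a single pass with constant memory until `list` collects the results.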