Created
July 5, 2017 18:32
| Higher leverage activities instead of focusing on grunt work | |
| Best minds in the world focus on the issue (IP development); access to the top 3-5 winning solutions | |
| Go on to hire them or continue in a consulting engagement | |
| Branding, recruiting | |
| leverage distribution | |
| XGBoost library - part of 50% of winning solutions | |
| Leaderboard forces question, why are these submissions above me better? | |
| Anyone can sign up - no correlation between winning and domain expert knowledge (shared passion is for data) | |
| Deep learning extended beyond computer vision problems | |
| Competitive until the competition ends; open source solutions at the end, outgoing video describing the solution | |
| Incredibly useful for beginning students: learn from others, collaboration tooling - Kaggle scripts (R, Python, Julia) | |
| Core skills: | |
| Data programming languages: R, Python | |
| Interactively exploring data and its structure (ggplot2) | |
| Rapid iteration and experimentation | |
| Iterative loop is performed as fast as possible | |
| Python - scientific to production code (last 5 years), Keras library (machine vision), XGBoost (rank ads, predict satisfaction) | |
| R - machine learning - great exploration, a bit harder to take to prod | |
| Understand the problem (feature preprocessing, competitive edge, understanding the distribution of the training datasets); cross validation | |
| Creatively thinking about the domain (effort and creativity) | |
| Issues: | |
| Overfitting on the public leaderboard (e.g. submitting multiple times and tuning to the public score); learn from byproducts | |
| Career: | |
| DeepMind hired 4 Kaggle winners; Kaggle runs its own job board (look at top 200-300 in competition pool); swap to more interesting work; executives use profiles (in large companies) to find standouts within | |
| Showcase the best of your abilities: collaborations showcase clean and well-architected code, give helpful advice, share insights about the data | |
| 100s of competitions; winners from outside the USA, not machine learning Ph.D. students; some spent 30 years in another field and became passionate about ML | |
| Future: | |
| Field still in infancy - over a 10 year time horizon this will change, it will be easy to create and use the technology | |
| Help the world learn from data, integrate new data sources | |
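
Sketch for the "Overfitting on public leaderboard" issue noted above. This is a toy simulation, not anything taken from Kaggle itself: the function name `simulate_leaderboard`, the 300/700 public/private split, the 200 submissions, and the seed are all assumptions chosen for illustration. Every "submission" is a pure coin-flip model, so its true accuracy is 0.5. Picking the one with the best public score still makes it look better than chance, while its private score stays near 0.5.

```python
import random


def simulate_leaderboard(n_submissions=200, n_public=300, n_private=700, seed=0):
    """Return (public_score, private_score) of the submission that ranks
    first on the public split. All sizes and the seed are hypothetical."""
    rng = random.Random(seed)
    n_total = n_public + n_private

    # Hidden binary test labels; the first n_public rows play the public split.
    labels = [rng.randint(0, 1) for _ in range(n_total)]

    results = []
    for _ in range(n_submissions):
        # Each submission is random guessing, so it has no real skill.
        preds = [rng.randint(0, 1) for _ in range(n_total)]
        hits = [int(p == y) for p, y in zip(preds, labels)]
        public_score = sum(hits[:n_public]) / n_public
        private_score = sum(hits[n_public:]) / n_private
        results.append((public_score, private_score))

    # Selecting on the public score is the step that causes the overfit.
    return max(results, key=lambda scores: scores[0])


best_public, its_private = simulate_leaderboard()
print(f"best public score:  {best_public:.3f}")
print(f"same model private: {its_private:.3f}")
```

The gap between the two printed numbers comes only from selecting on the public split after many tries. A local cross validation score, as listed under core skills, is the usual check that does not depend on the leaderboard.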