@mepsrajput
Last active November 20, 2021 06:18
Data Science skills
  1. Data Pipelines: Airflow
  2. MS Excel
  3. Programming/Tools: Python, PySpark, SQL
  4. Version Control: Git (GitHub/Bitbucket)
  5. Data Wrangling and Feature Engineering
  6. Machine Learning (ML)
  7. Presentation & Storytelling/Communication: Tableau, Power BI, Python libraries
  8. Analytics and Modeling: analyze data, run tests, and build explanatory models to surface new insights and predict likely outcomes.
  9. A/B Testing
  10. Statistics
    • To help make recommendations and decisions: maximum likelihood estimators, distributions, and statistical tests
    • Tied to ML Algorithms: Calculus and linear algebra
    • Descriptive statistics (using Python): mean, median, mode, variance, standard deviation.
    • Probability distributions, sample vs. population, central limit theorem (CLT), skewness and kurtosis
    • Inferential statistics: hypothesis testing, confidence intervals
  11. Data Visualization
    1. Break down complex data into smaller, digestible pieces using a variety of visual aids (charts, graphs, etc.)
    2. Effectively communicate key messages and get buy-in for proposed solutions
  12. Big Data: Hadoop, Apache Spark / PySpark
  13. Deep Learning: CNN, RNN
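
As a small illustration of the "Descriptive statistics (using Python)" item above, here is a minimal sketch using only the standard-library `statistics` module. The sample values and the helper name `describe` are hypothetical, made up for this example. It also assumes the data is a whole population, so it uses `pvariance` and `pstdev`; for a sample drawn from a larger population, `variance` and `stdev` would be the usual choice.

```python
import statistics

# Hypothetical data, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population variance and standard deviation (divide by n).
        # Assumption: the list is the full population, not a sample.
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same summary can be produced with pandas or NumPy on larger datasets; the standard library is used here only to keep the sketch dependency-free.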