Misho Krastev misho-kr

  • San Jose, California
@misho-kr
misho-kr / Python Data Science Toolbox (Part 2).md
Last active November 2, 2019 05:36
Summary of "Python Data Science Toolbox (Part 2)" course on Datacamp

In this second Python Data Science Toolbox course, you'll continue to build your Python data science skills. First, you'll learn about iterators, objects you have already encountered in the context of for loops. You'll then learn about list comprehensions, which are extremely handy tools for all data scientists working in Python. You'll end the course by working through a case study in which you'll apply all the techniques you learned in both parts of this course.

Using iterators in PythonLand

You'll learn all about iterators and iterables, which you have already worked with when writing for loops.

  • Iterators and iterables, iter() and next()
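
As a rough illustration of the `iter()` and `next()` bullet above, here is a minimal sketch in plain Python. The list contents and the `manual_for` helper are made up for this example and are not taken from the course.

```python
# An iterable is anything iter() accepts; an iterator is what iter() returns.
flash = ["jay garrick", "barry allen", "wally west"]  # illustrative data

it = iter(flash)       # obtain an iterator from the list
first = next(it)       # first element
second = next(it)      # second element
rest = list(it)        # consumes whatever is left in the iterator

# An exhausted iterator raises StopIteration; next() can take a default instead.
exhausted = next(it, None)


def manual_for(iterable):
    """Collect items the way a for loop does, using iter() and next() directly."""
    result = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the iterator has no more items
        result.append(item)
    return result
```

Calling `manual_for(range(3))` walks the range the same way `for x in range(3)` would, which is why for loops work on any iterable.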
@misho-kr
misho-kr / Introduction to Importing Data in Python.md
Last active February 2, 2020 05:25
Summary of "Introduction to Importing Data in Python" course on Datacamp

As a data scientist, you will need to clean data, wrangle and munge it, visualize it, build predictive models, and interpret these models. Before you can do so, however, you will need to know how to get data into Python. In this course, you'll learn the many ways to import data into Python: from flat files such as .txt and .csv; from files native to other software such as Excel spreadsheets, Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL.

Led by Hugo Bowne-Anderson, Data Scientist at DataCamp

Introduction and flat files

In this chapter, you'll learn how to import data into Python from all types of flat files, which are a simple and prevalent form of data storage. You've previously learned how to use NumPy and pandas—you will learn how to use these packages to impor

@misho-kr
misho-kr / Intermediate Importing Data in Python.md
Last active February 2, 2020 05:59
Summary of "Intermediate Importing Data in Python" course on Datacamp

In this course, you'll extend your knowledge of importing data by learning to import data from the web and to pull data from Application Programming Interfaces (APIs), such as the Twitter streaming API, which allows us to stream real-time tweets.

Led by Hugo Bowne-Anderson, Data Scientist at DataCamp

Importing data from the Internet

The web is a rich source of data from which you can extract various types of insights and findings. In this chapter, you will learn how to get data from the web, whether it is stored in files or in HTML. You'll also learn the basics of scraping and parsing web data.

@misho-kr
misho-kr / Cleaning Data in Python.md
Last active November 2, 2021 03:34
Summary of "Cleaning Data in Python" course on Datacamp

It is commonly said that data scientists spend 80% of their time cleaning and manipulating data, and only 20% of their time actually analyzing it. This course will equip you with all the skills you need to clean your data in Python, from learning how to diagnose problems in your data, to dealing with missing values and outliers.

Led by Daniel Chen, Data Science Consultant at Lander Analytics

Exploring your data

You've gotten your hands on a brand new dataset and are itching to start exploring it. How can you be sure your dataset is clean? You'll learn how to explore your data with an eye for diagnosing issues such as outliers, missing values, and duplicate rows.

@misho-kr
misho-kr / pandas Foundations.md
Last active September 17, 2020 09:39
Summary of "pandas Foundations" course on Datacamp (https://gist.github.com/misho-kr/873ddcc2fc89f1c96414de9e0a58e0fe)

pandas DataFrames are the most widely used in-memory representation of complex data collections within Python. Whether in finance, a scientific field, or data science, familiarity with pandas is essential. This course teaches you to work with real-world datasets containing both string and numeric data, often structured around time series. You will learn powerful analysis, selection, and visualization techniques in this course.

Led by Team Anaconda, Data Science Consultant at Lander Analytics

Data ingestion & inspection

Use pandas to import and inspect a variety of datasets, ranging from population data obtained from the World Bank to monthly stock data obtained via Yahoo Finance. Build DataFrames from scratch and become familiar with the intrinsic data visualization capabilities of pandas.

@misho-kr
misho-kr / Manipulating DataFrames with pandas.md
Last active October 11, 2020 10:37
Summary of "Manipulating DataFrames with pandas" course on Datacamp

Leverage pandas' powerful data manipulation engine to get the most out of your data. Drill into the data that really matters by extracting, filtering, and transforming data from DataFrames. The pandas library has many techniques that make this process efficient and intuitive. You will learn how to tidy, rearrange, and restructure your data by pivoting or melting and stacking or unstacking DataFrames.

Led by Team Anaconda, Data Science Consultant at Lander Analytics

Extracting and transforming data

Index, slice, filter, and transform DataFrames using a variety of datasets, ranging from 2012 US election data for the state of Pennsylvania to Pittsburgh weather data.

@misho-kr
misho-kr / Data Manipulation with pandas.md
Last active April 5, 2023 13:46
Summary of "Data Manipulation with pandas" course on Datacamp

pandas is the world's most popular Python library, used for everything from data manipulation to data analysis. Learn how to manipulate DataFrames, as you extract, filter, and transform real-world datasets for analysis. Using real-world data, including Walmart sales figures and global temperature time series, you’ll learn how to import, clean, calculate statistics, and create visualizations—using pandas!

Led by Maggie Matsui, Data Scientist at DataCamp

Transforming Data

Inspect DataFrames and perform fundamental manipulations, including sorting rows, subsetting, and adding new columns.
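
The three manipulations named above (sorting rows, subsetting, adding a column) might look roughly like this in pandas. The `sales` table and its column names are hypothetical, invented only for this sketch.

```python
import pandas as pd

# Hypothetical sales data, made up for illustration.
sales = pd.DataFrame({
    "store": ["A", "B", "C", "D"],
    "units": [10, 3, 7, 12],
    "unit_price": [2.5, 4.0, 1.0, 3.0],
})

# Sorting rows: highest unit count first.
by_units = sales.sort_values("units", ascending=False)

# Subsetting: keep only the rows where more than 5 units were sold.
busy = sales[sales["units"] > 5]

# Adding a new column computed from existing ones.
sales["revenue"] = sales["units"] * sales["unit_price"]
```

`sort_values` and the boolean-mask subset both return new DataFrames, so `sales` itself only changes when the `revenue` column is assigned.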

@misho-kr
misho-kr / Loan Amortization in Spreadsheets.md
Last active September 17, 2020 09:32
Summary of "Loan Amortization in Spreadsheets" from Datacamp.Org (https://gist.github.com/misho-kr/873ddcc2fc89f1c96414de9e0a58e0fe)

Course Description

A loan amortization schedule sounds like something that's only used by bankers and financial traders, right?

Wrong! In this course, we'll be looking at the key financial formulas in Google Sheets that you can use to investigate your own loans, like student loans, car loans, and mortgages. We'll build up a dashboard in Google Sheets which uses visualizations and conditional formulas to produce presentation-ready spreadsheets which will impress any finance manager!

By Brent Allen, Financial Spreadsheets Specialist
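
The course builds its schedule in Google Sheets. As a hedged sketch of the same arithmetic in Python, the block below uses the standard fixed-payment annuity formula, payment = P·r / (1 − (1 + r)^(−n)), where r is the per-period rate and n the number of payments. The function names and the example loan are assumptions made for illustration, not material from the course.

```python
def periodic_payment(principal, annual_rate, years, periods_per_year=12):
    """Fixed payment per period for a fully amortizing loan."""
    r = annual_rate / periods_per_year   # interest rate per period
    n = years * periods_per_year         # total number of payments
    if r == 0:
        return principal / n             # no interest: split the principal evenly
    return principal * r / (1 - (1 + r) ** -n)


def amortization_schedule(principal, annual_rate, years, periods_per_year=12):
    """Return rows of (period, payment, interest, principal_paid, balance)."""
    payment = periodic_payment(principal, annual_rate, years, periods_per_year)
    r = annual_rate / periods_per_year
    balance = principal
    rows = []
    for period in range(1, years * periods_per_year + 1):
        interest = balance * r               # interest accrued this period
        principal_paid = payment - interest  # remainder reduces the balance
        balance -= principal_paid
        rows.append((period, payment, interest, principal_paid, balance))
    return rows


# Hypothetical example: 10,000 borrowed at 6% a year, repaid monthly over 1 year.
schedule = amortization_schedule(10_000, 0.06, 1)
```

Each row splits the fixed payment into an interest part and a principal part; the interest share shrinks every period as the balance falls, and the final balance comes out at zero up to floating-point rounding.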

@misho-kr
misho-kr / Introduction to Airflow in Python.md
Last active March 15, 2021 20:28
Summary of "Introduction to Airflow in Python" from Datacamp.Org (https://gist.github.com/misho-kr/873ddcc2fc89f1c96414de9e0a58e0fe)

A guide to the basic concepts of Airflow and how to implement data engineering workflows in production.

By Mike Metzger, Data Engineer Consultant @ Flexible Creations

Intro to Airflow

An introduction to the components of Apache Airflow and why to use them.