- pandas
- scikit-learn
- sklearn-pandas
- Dive
- Google-hosted Jupyter notebooks: Colab. Introductory tutorial on how to use Colab
- Data Exploration & Validation
- Statistical Properties
- Mean, variance, etc.
- Correlation
- Entropy
- Mutual information, pointwise mutual information (PMI)
- Missing values
- Outlier Detection
- Feature imputation
- Fairness in Machine Learning
- Statistical Properties
- Feature Engineering
- Feature scaling
- Feature normalization
- Feature standardization
- Feature encoding
- One-hot encoding (why do we need one-hot encoding?)
- Feature selection
- Feature correlation: whether two features are correlated
- Mutual information
- Chi-squared (chi2) test
- Dimensionality reduction: Principal Component Analysis (PCA)
- Feature Importance
- Algorithms
- Linear Classification
- Linear Regression
- Validation
- k-fold cross-validation
- Titanic Dataset for classification
- Housing Price Dataset for regression
- https://pair-code.github.io/facets/
- http://vega.github.io/voyager/
- https://github.com/altair-viz/jupyterlab_voyager
- https://pypi.org/project/ipython-sql/
- https://datastudio.google.com/
- https://www.datarobot.com/ (Commercial)
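
Several of the topics listed above (feature standardization, one-hot encoding, linear classification, and k-fold cross-validation) can be combined in a few lines with pandas and scikit-learn. The following is a minimal sketch, not a reference solution: the data is synthetic and made up for illustration (it is not the Titanic dataset), and the column names `age` and `port` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data (assumption): one numeric and one categorical feature.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(35, 10, n),
    "port": rng.choice(["C", "Q", "S"], n),
})

# Binary label that depends on both features plus some noise.
signal = df["age"] + 10 * (df["port"] == "S") + rng.normal(0, 5, n)
y = (signal > 40).astype(int)

# Feature standardization for the numeric column,
# one-hot encoding for the categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["port"]),
])

# Linear classifier on top of the preprocessed features.
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression()),
])

# 5-fold cross-validation; each entry is the accuracy on one held-out fold.
scores = cross_val_score(model, df, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Placing the preprocessing inside the `Pipeline` means the scaler and encoder are refit on the training portion of every fold, so no statistics from the held-out fold leak into training.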