# | Stage | Description |
---|---|---|
1 | Hypothesis Generation | Study the business problem. Build a conceptual model by developing a deeper understanding of the problem and its domain. Generate hypotheses. |
2 | Data Collection | Go out in the wild and collect data based on the generated hypotheses. |
3 | Study the Variables | Identify potential predictors using data visualization. |
4 | Data Preparation | Clean the data. Fill in missing data points. Scale, normalize and transform data as necessary. |
5 | Bivariate/Multivariate Analysis | Test the hypotheses generated in stage 1. Choose predictors based on their correlation with the target. |
6 | Data Transformation | Apply non-linear transformations (such as log) to variables to uncover non-linear relationships with the target variable: log-linear, linear-log, log-log, etc. |
7 | Feature Engineering | Engineer new features guided by your intuition about the data. |
8 | Model Evaluation | Choose a list of appropriate models and rank them by evaluating each against the validation set. |
9 | Hyperparameter Search | Find the optimal hyperparameters for the models. Re-evaluate and re-rank the models. |
10 | Ensembling | Stabilize the final model using ensemble methods like averaging, voting and stacking. |
11 | Model Explanation | Extract insights from the model by using visualization or explanatory tools like SHAP values. |
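
As a rough illustration of stages 5 and 6, the sketch below compares how strongly a predictor correlates with the target under each log specification. It is a minimal example built on assumptions, not a prescribed method: the helper names `pearson` and `compare_transforms` are hypothetical, the data is synthetic (a power law, `y = 2 * x^3`), and the labels follow the common convention where "log-linear" means log of the target against the raw predictor and "linear-log" means the raw target against log of the predictor.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_transforms(x, y):
    """Correlate predictor and target under each log specification.

    Assumes every value in x and y is strictly positive,
    since the log of zero or a negative number is undefined.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    return {
        "linear": pearson(x, y),          # y vs x
        "log-linear": pearson(x, log_y),  # log(y) vs x
        "linear-log": pearson(log_x, y),  # y vs log(x)
        "log-log": pearson(log_x, log_y), # log(y) vs log(x)
    }


# Synthetic data (an assumption for illustration): a power-law relationship.
x = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0]
y = [2.0 * v ** 3 for v in x]

scores = compare_transforms(x, y)
# Pick the specification with the strongest absolute correlation.
best = max(scores, key=lambda name: abs(scores[name]))

for name, r in scores.items():
    print(f"{name:>10}: r = {r:.4f}")
print("strongest:", best)
```

Because a power law becomes a straight line once both sides are logged, the log-log specification comes out on top for this data. On real data the differences are usually less clear-cut, so treat the ranking as a hint for further analysis, not as a selection rule.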
Last active
November 16, 2020 10:06
Blog: Hypothesis Generation