@jpbarto
Last active July 28, 2021 00:16
Lab instructions for an introduction to machine learning

Machine Learning 101 on SageMaker Immersion Day

Lab 1: Feature Engineering

Step 1. Get into Event Engine

Follow the online steps to access your AWS account through AWS Event Engine:

https://sagemaker-immersionday.workshop.aws/en/prerequisites/option1.html

DO NOT PROCEED BEYOND THE SAGEMAKER STUDIO DEPLOYMENT.

Step 2. Launch a SageMaker Notebook

Once you’re connected to the AWS console, follow the instructions in the Amazon SageMaker documentation to create an EC2-based Jupyter notebook.

https://docs.aws.amazon.com/sagemaker/latest/dg/gs-setup-working-env.html

When the notebook has reached the InService state in the SageMaker console, click Open JupyterLab to access the Jupyter server.

The SageMaker notebook can take up to 5 minutes to start up and become available.
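
If you would rather check the notebook state from a script than refresh the console, the wait can be sketched roughly as below. This is only an illustration: it assumes `client` is a boto3 SageMaker client that you create yourself (for example with `boto3.client("sagemaker")`), that `name` is the notebook instance name you chose, and the function name and timeout values are made up for this example.

```python
import time


def wait_until_in_service(client, name, timeout_s=600, poll_s=15, sleep=time.sleep):
    """Poll a notebook instance until it reports the InService state.

    Assumptions: `client` is a boto3 SageMaker client and `name` is the
    name of your notebook instance. The timeout values are illustrative.
    """
    waited = 0
    while True:
        response = client.describe_notebook_instance(NotebookInstanceName=name)
        status = response["NotebookInstanceStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Notebook instance {name!r} failed to start")
        if waited >= timeout_s:
            raise TimeoutError(f"{name!r} is still {status} after {waited} seconds")
        # Wait before asking again so the API is not called in a tight loop.
        sleep(poll_s)
        waited += poll_s
```

The console shows the same status, so this is optional; it is mainly useful if you script your lab setup.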

Step 3. Download Python notebooks

For the remaining labs you will need to clone two GitHub repositories that contain Jupyter notebook files. The first is for Fast.AI and the second is for the SageMaker Immersion Day.

The two repositories you will need to clone are:

  • Fast.AI: https://github.com/fastai/course-v3.git
  • SageMaker Immersion Day: https://github.com/aws-samples/amazon-sagemaker-immersion-day.git

To clone the Fast.AI repository:

  1. From the JupyterLab console, click Git > Clone a Repository in the menu at the top of the JupyterLab interface.
  2. This opens a Clone a repo popup dialog. Paste the URL of the Fast.AI repository into the dialog: https://github.com/fastai/course-v3.git and click Clone.
  3. After a few seconds you should see course-v3 in the file navigator on the left of the JupyterLab interface.

To clone the SageMaker Immersion Day repository, follow the same instructions as above but use the SageMaker Immersion Day repository URL: https://github.com/aws-samples/amazon-sagemaker-immersion-day.git
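
If the Git menu is not available, the same two clones can be done from a notebook cell or a terminal. The sketch below is one possible way, assuming `git` is installed on the notebook instance; the helper names (`repo_dir_name`, `clone_command`, `clone_all`) are invented for this example and are not part of the lab material.

```python
import subprocess
from pathlib import Path

# The two repository URLs given in this step.
REPO_URLS = [
    "https://github.com/fastai/course-v3.git",
    "https://github.com/aws-samples/amazon-sagemaker-immersion-day.git",
]


def repo_dir_name(url):
    """Return the folder name git uses by default for a clone URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_command(url, parent="."):
    """Build the git command that clones `url` into `parent`."""
    return ["git", "clone", url, str(Path(parent) / repo_dir_name(url))]


def clone_all(urls=REPO_URLS, parent="."):
    """Clone each repository, skipping any that already exist locally."""
    for url in urls:
        target = Path(parent) / repo_dir_name(url)
        if target.exists():
            print(f"Skipping {target}: already present")
            continue
        subprocess.run(clone_command(url, parent), check=True)
```

Calling `clone_all()` from the directory shown in the file navigator should leave `course-v3` and `amazon-sagemaker-immersion-day` side by side, matching what the menu-based steps produce.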

Step 4. Begin the Jupyter Notebook Introduction lab

The Fast.AI repository you have cloned contains a Jupyter notebook that walks through the features of a Jupyter notebook. To work through it, open the notebook at course-v3/nbs/dl1/00_notebook_tutorial.ipynb.

You can navigate to the notebook file using the file navigator on the left of the JupyterLab interface. To begin, double-click on course-v3, then continue into nbs and dl1.

When you find the 00_notebook_tutorial.ipynb file in the navigator, double-click on the file to open it. You will be asked to specify a kernel for the notebook; select conda_python3.
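
If a notebook does not show up in the navigator, a quick check from a notebook cell can confirm whether the clone put the files where the labs expect them. The following is a small sketch: the two paths are the ones named in these instructions, while `missing_notebooks` is a hypothetical helper written for this example.

```python
from pathlib import Path

# Notebook paths referenced by the lab instructions, relative to the
# directory that holds both cloned repositories.
EXPECTED_NOTEBOOKS = [
    "course-v3/nbs/dl1/00_notebook_tutorial.ipynb",
    "amazon-sagemaker-immersion-day/xgboost_direct_marketing_sagemaker.ipynb",
]


def missing_notebooks(root=".", expected=EXPECTED_NOTEBOOKS):
    """Return the expected notebook paths that are not files under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

An empty list from `missing_notebooks()` suggests both repositories were cloned into the current directory; any path it returns points to a clone that is missing or sits in a different folder.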

Read through the Jupyter Notebook Tutorial until you have completed it. This will introduce you to the features of the Jupyter notebook. After completing this notebook proceed to the next step.

Step 5. Begin the Feature Engineering lab

The SageMaker Immersion Day repository contains a Jupyter notebook that will be used for the remainder of the Immersion Day. To get started, open the notebook found at amazon-sagemaker-immersion-day/xgboost_direct_marketing_sagemaker.ipynb.

When prompted, select the conda_python3 kernel.

In another browser tab, open https://sagemaker-immersionday.workshop.aws/lab1/option2.html and follow the instructions as they walk you through engineering a feature set using Pandas and NumPy. When you reach the text End of Lab 1 in the notebook, stop; do not continue on to the next lab yet.
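
To give a feel for what engineering a feature set with Pandas and NumPy can look like, here is a minimal sketch on a made-up three-row table. It is not the lab's code: the column names, the toy values, and the treatment of 999 as a "never contacted" sentinel are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd


def engineer_features(data):
    """Add indicator columns and one-hot encode a categorical column.

    Assumes (for this example only) a `pdays` column where 999 means
    "no previous contact" and a categorical `job` column.
    """
    out = data.copy()  # leave the caller's DataFrame untouched
    # Indicator feature: 1 when the sentinel value is present, else 0.
    out["no_previous_contact"] = np.where(out["pdays"] == 999, 1, 0)
    # Indicator feature: 1 for job categories outside the workforce.
    not_working = out["job"].isin(["student", "retired", "unemployed"])
    out["not_working"] = np.where(not_working, 1, 0)
    # One-hot encode the categorical column into 0/1 integer columns.
    return pd.get_dummies(out, columns=["job"], dtype=int)


# Toy data, invented for this sketch.
toy = pd.DataFrame(
    {
        "age": [25, 40, 61],
        "job": ["student", "admin.", "retired"],
        "pdays": [999, 3, 999],
    }
)
features = engineer_features(toy)
print(features.columns.tolist())
```

The notebook works on a much larger dataset and makes its own choices about which columns to derive or drop, so treat this only as a reading aid for the kinds of operations involved.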

Lab 2: Training, Tuning, and Deploying a Model

In Lab 2 you will train an XGBoost model on the data you prepared in Lab 1. To do this, follow the online instructions at https://sagemaker-immersionday.workshop.aws/lab2.html and continue working in the same Jupyter notebook you used for Lab 1.
