@jpbarto
Last active August 26, 2020 05:37
Overarching lab guide for an introduction to Amazon SageMaker

Getting started

Create a Jupyter notebook

  1. Visit https://github.com/awslabs/amazon-sagemaker-examples. At the bottom of the page you will find a link to create an Amazon SageMaker notebook.
  2. Follow the instructions, choosing the ml.m5.2xlarge instance type.
  3. When the notebook has been created, click Open JupyterLab.

Clone the lab materials

Clone the Amazon SageMaker Examples to your notebook from GitHub.

  1. With the JupyterLab console open, click Git in the menu and select Clone.
  2. Paste the GitHub project URL, https://github.com/awslabs/amazon-sagemaker-examples.git, and click Clone.

Lab 1, Create a pipeline model with scikit-learn

In the JupyterLab interface, in the file browser on the left, navigate to the project located in

amazon-sagemaker-examples/sagemaker-python-sdk/scikit_learn_inference_pipeline

In this directory open the Python Notebook to begin the lab. Take note of the functions and structure of the sklearn_abalone_featurizer.py script, which contains the feature engineering logic for the pipeline model.

Lab 2, Build your own TensorFlow container

In this lab you will create a customized TensorFlow container that holds both your training code and your model hosting code. After you create the container you will push it to Amazon ECR, from where SageMaker will pull it to carry out your training and hosting jobs.

In the JupyterLab interface, in the file browser on the left, navigate to the project located in

amazon-sagemaker-examples/advanced_functionality/tensorflow_bring_your_own

In this directory open the Python Notebook to begin the lab.

Lab 3, Perform distributed training with PyTorch and Horovod

In this lab you will specify the number of EC2 instances that make up a training cluster, then use that cluster to perform distributed training of a PyTorch model with Horovod.

In the JupyterLab interface, in the file browser on the left, navigate to the project located in

amazon-sagemaker-examples/sagemaker-python-sdk/pytorch_horovod_mnist

In this directory open the Python Notebook to begin the lab.

Extra fun, use Deep Graph Library to detect fraud

https://github.com/awslabs/sagemaker-graph-fraud-detection

Notes

Default S3 bucket name

To obtain the default S3 bucket name:

import sagemaker

session = sagemaker.session.Session()
bucket = session.default_bucket()
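
The bucket name is usually combined with a key prefix to build the S3 locations that the notebooks pass around. A minimal sketch, assuming a hypothetical helper `s3_uri`, a made-up bucket name, and a made-up prefix (none of these come from the lab notebooks):

```python
def s3_uri(bucket: str, prefix: str) -> str:
    """Join a bucket name and a key prefix into an s3:// URI."""
    return f"s3://{bucket}/{prefix.strip('/')}"


# In a notebook, the bucket would come from session.default_bucket().
# The literal name below is only a placeholder for illustration.
train_location = s3_uri("sagemaker-us-east-1-123456789012", "abalone/train/")
print(train_location)
```

Stripping slashes from the prefix avoids a double or trailing slash in the resulting URI.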

Errors about as_matrix

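If a notebook calls as_matrix on a pandas DataFrame, replace the call with the values attribute. The as_matrix method was deprecated and then removed in pandas 1.0.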
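
A minimal sketch of this fix, assuming a small made-up DataFrame (the column names and numbers are illustrative, not taken from the lab notebooks):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"length": [0.455, 0.350], "rings": [15, 7]})

# Old code: features = df.as_matrix()  # AttributeError on pandas >= 1.0
features = df.values  # NumPy array holding the same rows and columns

print(features.shape)
```

On pandas 0.24 and later, `df.to_numpy()` is an equivalent method call.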

Errors about reshape

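If a notebook calls reshape directly on a pandas DataFrame or Series, replace the call with values.reshape so that the reshape runs on the underlying NumPy array.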
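
A minimal sketch of this fix, assuming a made-up single-column example (the column name `rings` is illustrative):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"rings": [15, 7, 9, 10]})

# Old code: labels = df["rings"].reshape(-1, 1)  # fails on recent pandas
labels = df["rings"].values.reshape(-1, 1)  # reshape the NumPy array instead

print(labels.shape)
```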
