Last active
May 22, 2021 16:15
fastai example doesn't work
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Copyright (c) Microsoft Corporation. All rights reserved.\n", | |
"\n", | |
"Licensed under the MIT License." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Train a model using a custom Docker image" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"In this tutorial, learn how to use a custom Docker image when training models with Azure Machine Learning.\n", | |
"\n", | |
"The example scripts in this article are used to classify pet images by creating a convolutional neural network. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Set up the experiment\n", | |
"This section sets up the training experiment by initializing a workspace, creating an experiment, and uploading the training data and training scripts." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Initialize a workspace\n", | |
"The Azure Machine Learning workspace is the top-level resource for the service. It provides a centralized place to work with all the artifacts you create. In the Python SDK, you access the workspace artifacts by creating a `Workspace` object.\n", | |
"\n", | |
"Create a workspace object from the config.json file." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"Warning: Falling back to use azure cli login credentials.\n", | |
"If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.\n", | |
"Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.\n" | |
] | |
} | |
], | |
"source": [ | |
"from azureml.core import Workspace\n", | |
"\n", | |
"ws = Workspace.from_config()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Prepare scripts\n", | |
"Create a directory titled `fastai-example`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import os\n", | |
"os.makedirs('fastai-example', exist_ok=True)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Then run the cell below to create the training script `train.py` in that directory." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": { | |
"jupyter": { | |
"outputs_hidden": false, | |
"source_hidden": false | |
}, | |
"nteract": { | |
"transient": { | |
"deleting": false | |
} | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Overwriting fastai-example/train.py\n" | |
] | |
} | |
], | |
"source": [ | |
"%%writefile fastai-example/train.py\n", | |
"\n", | |
"from fastai.vision.all import *\n", | |
"\n", | |
"print('hello world')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Define your environment\n", | |
"Create an environment object and enable Docker." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"'enabled' is deprecated. Please use the azureml.core.runconfig.DockerConfiguration object with the 'use_docker' param instead.\n" | |
] | |
} | |
], | |
"source": [ | |
"from azureml.core import Environment\n", | |
"\n", | |
"fastai_env = Environment(\"fastai\")\n", | |
"fastai_env.docker.enabled = True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The base image specified in the next cell supports the fast.ai library, which allows for distributed deep learning capabilities. For more information, see the [fast.ai DockerHub](https://hub.docker.com/u/fastdotai).\n", | |
"\n", | |
"When you use a custom Docker image, you might already have your Python environment properly set up. In that case, set the `user_managed_dependencies` flag to `True` to use your custom image's built-in Python environment." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"fastai_env.docker.base_image = \"fastai/ubuntu:latest\"\n", | |
"fastai_env.python.user_managed_dependencies = True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To use an image from a private container registry that is not in your workspace, you must use `docker.base_image_registry` to specify the address of the repository as well as a username and password." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"```python\n", | |
"fastai_env.docker.base_image_registry.address = \"myregistry.azurecr.io\"\n", | |
"fastai_env.docker.base_image_registry.username = \"username\"\n", | |
"fastai_env.docker.base_image_registry.password = \"password\"\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"It is also possible to use a custom Dockerfile. Use this approach if you need to install non-Python packages as dependencies, and remember to set the base image to `None`." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Specify the Docker steps as a string:\n", | |
"```python\n", | |
"dockerfile = r\"\"\"\n", | |
"FROM mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04\n", | |
"RUN echo \"Hello from custom container!\"\n", | |
"\"\"\"\n", | |
"```\n", | |
"Set the base image to `None`, because the image is defined by the Dockerfile:\n", | |
"```python\n", | |
"fastai_env.docker.base_image = None\n", | |
"fastai_env.docker.base_dockerfile = dockerfile\n", | |
"```\n", | |
"Alternatively, load the string from a file:\n", | |
"```python\n", | |
"fastai_env.docker.base_image = None\n", | |
"fastai_env.docker.base_dockerfile = \"./Dockerfile\"\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Create or attach existing AmlCompute\n", | |
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource.\n", | |
"\n", | |
"> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.\n", | |
"\n", | |
"**Creation of AmlCompute takes approximately 5 minutes.** If an AmlCompute cluster with that name already exists in your workspace, this code skips the creation process.\n", | |
"\n", | |
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Found existing compute target.\n", | |
"{'currentNodeCount': 0, 'targetNodeCount': 1, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Resizing', 'allocationStateTransitionTime': '2021-05-22T15:52:14.955000+00:00', 'errors': None, 'creationTime': '2021-05-21T22:35:46.284604+00:00', 'modifiedTime': '2021-05-21T22:36:31.965101+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D2_V2'}\n" | |
] | |
} | |
], | |
"source": [ | |
"from azureml.core.compute import ComputeTarget, AmlCompute\n", | |
"from azureml.core.compute_target import ComputeTargetException\n", | |
"\n", | |
"# choose a name for your cluster\n", | |
"cluster_name = \"cpu-cluster\"\n", | |
"\n", | |
"try:\n", | |
"    compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n", | |
"    print('Found existing compute target.')\n", | |
"except ComputeTargetException:\n", | |
"    print('Creating a new compute target...')\n", | |
"    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',\n", | |
"                                                           max_nodes=4)\n", | |
"\n", | |
"    # create the cluster\n", | |
"    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n", | |
"\n", | |
"    compute_target.wait_for_completion(show_output=True)\n", | |
"\n", | |
"# use get_status() to get a detailed status for the current AmlCompute\n", | |
"print(compute_target.get_status().serialize())" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Create a ScriptRunConfig\n", | |
"This ScriptRunConfig will configure your job for execution on the desired compute target." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": { | |
"jupyter": { | |
"outputs_hidden": false, | |
"source_hidden": false | |
}, | |
"nteract": { | |
"transient": { | |
"deleting": false | |
} | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"from azureml.core import ScriptRunConfig\n", | |
"\n", | |
"fastai_config = ScriptRunConfig(source_directory='fastai-example',\n", | |
"                                script='train.py',\n", | |
"                                compute_target=compute_target,\n", | |
"                                environment=fastai_env)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Submit your run\n", | |
"When a training run is submitted using a ScriptRunConfig object, the submit method returns an object of type ScriptRun. The returned ScriptRun object gives you programmatic access to information about the training run. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": { | |
"jupyter": { | |
"outputs_hidden": false, | |
"source_hidden": false | |
}, | |
"nteract": { | |
"transient": { | |
"deleting": false | |
} | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"from azureml.core import Experiment\n", | |
"\n", | |
"run = Experiment(ws,'fastai-custom-image').submit(fastai_config)\n", | |
"run.wait_for_completion(show_output=True)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"authors": [ | |
{ | |
"name": "sagopal" | |
} | |
], | |
"category": "training", | |
"compute": [ | |
"AML Compute" | |
], | |
"datasets": [ | |
"Oxford IIIT Pet" | |
], | |
"deployment": [ | |
"None" | |
], | |
"exclude_from_index": false, | |
"framework": [ | |
"Pytorch" | |
], | |
"friendly_name": "Train a model with a custom Docker image", | |
"index_order": 1, | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.3" | |
}, | |
"nteract": { | |
"version": "[email protected]" | |
}, | |
"tags": [ | |
"None" | |
], | |
"task": "Train with custom Docker image" | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
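
The "Define your environment" cell above logs a deprecation warning for `fastai_env.docker.enabled`, and the warning itself points to `azureml.core.runconfig.DockerConfiguration` with `use_docker`. The following is a minimal sketch of that alternative, not a confirmed fix for this notebook. It assumes an `azureml-core` version whose `ScriptRunConfig` accepts a `docker_runtime_config` argument; the helper name `build_fastai_run_config` is hypothetical and introduced only for illustration.

```python
def build_fastai_run_config(compute_target):
    """Build a ScriptRunConfig for fastai-example/train.py without `docker.enabled`.

    `build_fastai_run_config` is a hypothetical helper, not part of the SDK.
    The imports live inside the function so this snippet can be loaded on a
    machine that does not have azureml-core installed.
    """
    from azureml.core import Environment, ScriptRunConfig
    from azureml.core.runconfig import DockerConfiguration

    # Same environment settings as the notebook, minus the deprecated
    # `fastai_env.docker.enabled = True` line.
    fastai_env = Environment("fastai")
    fastai_env.docker.base_image = "fastai/ubuntu:latest"
    fastai_env.python.user_managed_dependencies = True

    # Docker is requested on the run configuration instead of the environment.
    # Assumption: the installed SDK supports `docker_runtime_config`.
    docker_config = DockerConfiguration(use_docker=True)

    return ScriptRunConfig(
        source_directory="fastai-example",
        script="train.py",
        compute_target=compute_target,
        environment=fastai_env,
        docker_runtime_config=docker_config,
    )
```

With the `compute_target` created earlier in the notebook, `fastai_config = build_fastai_run_config(compute_target)` could stand in for the existing `ScriptRunConfig` cell, and the run would be submitted through `Experiment(...).submit(fastai_config)` as before. This only addresses the deprecation warning; whether the run then succeeds still depends on the contents of the base image.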