@johnleung8888
Last active October 24, 2023 17:13
C2W2_Assignment.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/johnleung8888/f061853e73c535ddd4a5965c0b7d5a23/c2w2_assignment.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AuW-xg_bTsaF"
},
"source": [
"# Week 2: Tackle Overfitting with Data Augmentation\n",
"\n",
"Welcome to this assignment! As in the previous week, you will be using the famous `cats vs dogs` dataset to train a model that can classify images of dogs from images of cats. For this, you will create your own Convolutional Neural Network in Tensorflow and leverage Keras' image preprocessing utilities, more so this time around since Keras provides excellent support for augmenting image data.\n",
"\n",
"You will also need to create the helper functions to move the images around the filesystem as you did last week, so if you need to refresh your memory with the `os` module be sure to take a look a the [docs](https://docs.python.org/3/library/os.html).\n",
"\n",
"Let's get started!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dn-6c02VmqiN"
},
"outputs": [],
"source": [
"import os\n",
"import zipfile\n",
"import random\n",
"import shutil\n",
"import tensorflow as tf\n",
"from tensorflow.keras.preprocessing.image import ImageDataGenerator\n",
"from shutil import copyfile\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bLTQd84RUs1j"
},
"source": [
"Download the dataset from its original source by running the cell below. \n",
"\n",
"Note that the `zip` file that contains the images is unzipped under the `/tmp` directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3sd9dQWa23aj",
"lines_to_next_cell": 2,
"outputId": "c5dc3992-3bdc-4ae1-e49e-115c4c87faf4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"--2022-02-15 05:42:59-- https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip\n",
"Resolving download.microsoft.com (download.microsoft.com)... 184.51.220.111, 2600:1407:a800:2ae::e59, 2600:1407:a800:280::e59\n",
"Connecting to download.microsoft.com (download.microsoft.com)|184.51.220.111|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 824894548 (787M) [application/octet-stream]\n",
"Saving to: ‘/tmp/cats-and-dogs.zip’\n",
"\n",
"/tmp/cats-and-dogs. 100%[===================>] 786.68M 93.4MB/s in 8.5s \n",
"\n",
"2022-02-15 05:43:08 (92.4 MB/s) - ‘/tmp/cats-and-dogs.zip’ saved [824894548/824894548]\n",
"\n"
]
}
],
"source": [
"# If the URL doesn't work, visit https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765\n",
"# And right click on the 'Download Manually' link to get a new URL to the dataset\n",
"\n",
"# Note: This is a very large dataset and will take some time to download\n",
"\n",
"!wget --no-check-certificate \\\n",
" \"https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip\" \\\n",
" -O \"/tmp/cats-and-dogs.zip\"\n",
"\n",
"local_zip = '/tmp/cats-and-dogs.zip'\n",
"zip_ref = zipfile.ZipFile(local_zip, 'r')\n",
"zip_ref.extractall('/tmp')\n",
"zip_ref.close()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "e_HsUV9WVJHL"
},
"source": [
"Now the images are stored within the `/tmp/PetImages` directory. There is a subdirectory for each class, so one for dogs and one for cats."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "DM851ZmN28J3",
"outputId": "83398339-021e-4986-8071-45c82b585e7d",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"There are 12501 images of dogs.\n",
"There are 12501 images of cats.\n"
]
}
],
"source": [
"source_path = '/tmp/PetImages'\n",
"\n",
"source_path_dogs = os.path.join(source_path, 'Dog')\n",
"source_path_cats = os.path.join(source_path, 'Cat')\n",
"\n",
"\n",
"# os.listdir returns a list containing all files under the given path\n",
"print(f\"There are {len(os.listdir(source_path_dogs))} images of dogs.\")\n",
"print(f\"There are {len(os.listdir(source_path_cats))} images of cats.\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "G7dI86rmRGmC"
},
"source": [
"**Expected Output:**\n",
"\n",
"```\n",
"There are 12501 images of dogs.\n",
"There are 12501 images of cats.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iFbMliudNIjW"
},
"source": [
"You will need a directory for cats-v-dogs, and subdirectories for training\n",
"and testing. These in turn will need subdirectories for 'cats' and 'dogs'. To accomplish this, complete the `create_train_test_dirs` below:"
]
},
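{
"cell_type": "markdown",
"metadata": {},
"source": [
"The layout described above can be sketched as a loop over split and class names. This is only an illustration under stated assumptions, not the graded solution: `make_tree` is a hypothetical helper, a temporary directory stands in for `/tmp/cats-v-dogs`, and `exist_ok=True` is used so the sketch can be rerun, whereas the assignment clears the root directory first.\n",
"\n",
"```python\n",
"import os\n",
"import tempfile\n",
"\n",
"def make_tree(root_path, splits=('training', 'testing'), classes=('cats', 'dogs')):\n",
"    # Hypothetical helper: create root_path/<split>/<class> for every combination\n",
"    for split in splits:\n",
"        for cls in classes:\n",
"            os.makedirs(os.path.join(root_path, split, cls), exist_ok=True)\n",
"\n",
"# Use a throwaway directory so the sketch does not touch the real dataset folders\n",
"with tempfile.TemporaryDirectory() as tmp:\n",
"    make_tree(tmp)\n",
"    for current_dir, dirs, files in os.walk(tmp):\n",
"        for subdir in sorted(dirs):\n",
"            print(os.path.relpath(os.path.join(current_dir, subdir), tmp))\n",
"```\n",
"\n",
"Because `os.makedirs` creates intermediate directories, only the four leaf paths need to be requested."
]
},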
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "code",
"id": "F-QkLjxpmyK2"
},
"outputs": [],
"source": [
"# Define root directory\n",
"root_dir = '/tmp/cats-v-dogs'\n",
"\n",
"# Empty directory to prevent FileExistsError is the function is run several times\n",
"if os.path.exists(root_dir):\n",
" shutil.rmtree(root_dir)\n",
"\n",
"# GRADED FUNCTION: create_train_test_dirs\n",
"def create_train_test_dirs(root_path):\n",
" ### START CODE HERE\n",
" path = os.path.join(root_dir, \"training\")\n",
" os.makedirs(path)\n",
" path_1 = os.path.join(path, \"cats\")\n",
" os.makedirs(path_1)\n",
" path_2 = os.path.join(path, \"dogs\")\n",
" os.makedirs(path_2)\n",
" path = os.path.join(root_dir, \"testing\")\n",
" os.makedirs(path)\n",
" path_3 = os.path.join(path, \"cats\")\n",
" os.makedirs(path_3)\n",
" path_4 = os.path.join(path, \"dogs\")\n",
" os.makedirs(path_4)\n",
" \n",
" # HINT:\n",
" # Use os.makedirs to create your directories with intermediate subdirectories\n",
"\n",
" pass\n",
" \n",
" ### END CODE HERE\n",
"\n",
" \n",
"try:\n",
" create_train_test_dirs(root_path=root_dir)\n",
"except FileExistsError:\n",
" print(\"You should not be seeing this since the upper directory is removed beforehand\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5dhtL344OK00",
"outputId": "f9c9bb79-b2c3-4eb1-e215-fdd70ad58eb9",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"/tmp/cats-v-dogs/testing\n",
"/tmp/cats-v-dogs/training\n",
"/tmp/cats-v-dogs/testing/cats\n",
"/tmp/cats-v-dogs/testing/dogs\n",
"/tmp/cats-v-dogs/training/cats\n",
"/tmp/cats-v-dogs/training/dogs\n"
]
}
],
"source": [
"# Test your create_train_test_dirs function\n",
"\n",
"for rootdir, dirs, files in os.walk(root_dir):\n",
" for subdir in dirs:\n",
" print(os.path.join(rootdir, subdir))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D7A0RK3IQsvg"
},
"source": [
"**Expected Output (directory order might vary):**\n",
"\n",
"``` txt\n",
"/tmp/cats-v-dogs/training\n",
"/tmp/cats-v-dogs/testing\n",
"/tmp/cats-v-dogs/training/cats\n",
"/tmp/cats-v-dogs/training/dogs\n",
"/tmp/cats-v-dogs/testing/cats\n",
"/tmp/cats-v-dogs/testing/dogs\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R93T7HdE5txZ"
},
"source": [
"Code the `split_data` function which takes in the following arguments:\n",
"- SOURCE: directory containing the files\n",
"\n",
"- TRAINING: directory that a portion of the files will be copied to (will be used for training)\n",
"- TESTING: directory that a portion of the files will be copied to (will be used for testing)\n",
"- SPLIT SIZE: to determine the portion\n",
"\n",
"The files should be randomized, so that the training set is a random sample of the files, and the test set is made up of the remaining files.\n",
"\n",
"For example, if `SOURCE` is `PetImages/Cat`, and `SPLIT` SIZE is .9 then 90% of the images in `PetImages/Cat` will be copied to the `TRAINING` dir\n",
"and 10% of the images will be copied to the `TESTING` dir.\n",
"\n",
"All images should be checked before the copy, so if they have a zero file length, they will be omitted from the copying process. If this is the case then your function should print out a message such as `\"filename is zero length, so ignoring.\"`. **You should perform this check before the split so that only non-zero images are considered when doing the actual split.**\n",
"\n",
"\n",
"Hints:\n",
"\n",
"- `os.listdir(DIRECTORY)` returns a list with the contents of that directory.\n",
"\n",
"- `os.path.getsize(PATH)` returns the size of the file\n",
"\n",
"- `copyfile(source, destination)` copies a file from source to destination\n",
"\n",
"- `random.sample(list, len(list))` shuffles a list"
]
},
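{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shuffle-and-split step described above can be sketched on a plain list of names, leaving out the file-size check and the copying. This is a minimal illustration, not the graded solution: `split_names` is a hypothetical helper and the file names are made up.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"def split_names(names, split_size):\n",
"    # Hypothetical helper: shuffle a copy of the list, then cut it once\n",
"    shuffled = random.sample(names, len(names))\n",
"    cut = int(len(shuffled) * split_size)\n",
"    # Slicing at a single index keeps the two parts disjoint and complete\n",
"    return shuffled[:cut], shuffled[cut:]\n",
"\n",
"names = [f'{i}.jpg' for i in range(10)]\n",
"train, test = split_names(names, 0.9)\n",
"print(len(train), len(test))\n",
"print(sorted(train + test) == sorted(names))\n",
"```\n",
"\n",
"Cutting at one index avoids a pitfall of negative slicing: `shuffled[-0:]` returns the whole list, so a test size of zero would otherwise copy every file into the test set."
]
},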
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "code",
"id": "zvSODo0f9LaU"
},
"outputs": [],
"source": [
"# GRADED FUNCTION: split_data\n",
"def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):\n",
"\n",
" ### START CODE HERE\n",
" files = []\n",
" for filename in os.listdir(SOURCE):\n",
" file = SOURCE + filename\n",
" if os.path.getsize(file) > 0:\n",
" files.append(filename)\n",
" else:\n",
" print(filename + ' is zero length, so ignoring.')\n",
"\n",
" training_length = int(len(files) * SPLIT_SIZE)\n",
" testing_length = int(len(files) - training_length)\n",
" shuffled_set = random.sample(files, len(files))\n",
" training_set = shuffled_set[0:training_length]\n",
" testing_set = shuffled_set[-testing_length:]\n",
" \n",
" for filename in training_set:\n",
" src_file = SOURCE + filename\n",
" dest_file = TRAINING + filename\n",
" copyfile(src_file, dest_file)\n",
" \n",
" for filename in testing_set:\n",
" src_file = SOURCE + filename\n",
" dest_file = TESTING + filename\n",
" copyfile(src_file, dest_file)\n",
" pass\n",
"\n",
" ### END CODE HERE\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FlIdoUeX9S-9",
"outputId": "859c9df3-d2b0-4df3-ded7-9492386b6ecb",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"666.jpg is zero length, so ignoring.\n",
"11702.jpg is zero length, so ignoring.\n",
"\n",
"\n",
"There are 11250 images of cats for training\n",
"There are 11250 images of dogs for training\n",
"There are 1250 images of cats for testing\n",
"There are 1250 images of dogs for testing\n"
]
}
],
"source": [
"# Test your split_data function\n",
"\n",
"# Define paths\n",
"CAT_SOURCE_DIR = \"/tmp/PetImages/Cat/\"\n",
"DOG_SOURCE_DIR = \"/tmp/PetImages/Dog/\"\n",
"\n",
"TRAINING_DIR = \"/tmp/cats-v-dogs/training/\"\n",
"TESTING_DIR = \"/tmp/cats-v-dogs/testing/\"\n",
"\n",
"TRAINING_CATS_DIR = os.path.join(TRAINING_DIR, \"cats/\")\n",
"TESTING_CATS_DIR = os.path.join(TESTING_DIR, \"cats/\")\n",
"\n",
"TRAINING_DOGS_DIR = os.path.join(TRAINING_DIR, \"dogs/\")\n",
"TESTING_DOGS_DIR = os.path.join(TESTING_DIR, \"dogs/\")\n",
"\n",
"# Empty directories in case you run this cell multiple times\n",
"if len(os.listdir(TRAINING_CATS_DIR)) > 0:\n",
" for file in os.scandir(TRAINING_CATS_DIR):\n",
" os.remove(file.path)\n",
"if len(os.listdir(TRAINING_DOGS_DIR)) > 0:\n",
" for file in os.scandir(TRAINING_DOGS_DIR):\n",
" os.remove(file.path)\n",
"if len(os.listdir(TESTING_CATS_DIR)) > 0:\n",
" for file in os.scandir(TESTING_CATS_DIR):\n",
" os.remove(file.path)\n",
"if len(os.listdir(TESTING_DOGS_DIR)) > 0:\n",
" for file in os.scandir(TESTING_DOGS_DIR):\n",
" os.remove(file.path)\n",
"\n",
"# Define proportion of images used for training\n",
"split_size = .9\n",
"\n",
"# Run the function\n",
"# NOTE: Messages about zero length images should be printed out\n",
"split_data(CAT_SOURCE_DIR, TRAINING_CATS_DIR, TESTING_CATS_DIR, split_size)\n",
"split_data(DOG_SOURCE_DIR, TRAINING_DOGS_DIR, TESTING_DOGS_DIR, split_size)\n",
"\n",
"# Check that the number of images matches the expected output\n",
"print(f\"\\n\\nThere are {len(os.listdir(TRAINING_CATS_DIR))} images of cats for training\")\n",
"print(f\"There are {len(os.listdir(TRAINING_DOGS_DIR))} images of dogs for training\")\n",
"print(f\"There are {len(os.listdir(TESTING_CATS_DIR))} images of cats for testing\")\n",
"print(f\"There are {len(os.listdir(TESTING_DOGS_DIR))} images of dogs for testing\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hvskJNOFVSaz"
},
"source": [
"**Expected Output:**\n",
"\n",
"```\n",
"666.jpg is zero length, so ignoring.\n",
"11702.jpg is zero length, so ignoring.\n",
"```\n",
"\n",
"```\n",
"There are 11250 images of cats for training\n",
"There are 11250 images of dogs for training\n",
"There are 1250 images of cats for testing\n",
"There are 1250 images of dogs for testing\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Zil4QmOD_mXF"
},
"source": [
"Now that you have successfully organized the data in a way that can be easily fed to Keras' `ImageDataGenerator`, it is time for you to code the generators that will yield batches of images, both for training and validation. For this, complete the `train_val_generators` function below.\n",
"\n",
"Something important to note is that the images in this dataset come in a variety of resolutions. Luckily, the `flow_from_directory` method allows you to standarize this by defining a tuple called `target_size` that will be used to convert each image to this target resolution. **For this exercise use a `target_size` of (150, 150)**.\n",
"\n",
"**Note:** So far, you have seen the term `testing` being used a lot for referring to a subset of images within the dataset. In this exercise, all of the `testing` data is actually being used as `validation` data. This is not very important within the context of the task at hand but it is worth mentioning to avoid confusion."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "code",
"id": "fQrZfVgz4j2g"
},
"outputs": [],
"source": [
"# GRADED FUNCTION: train_val_generators\n",
"def train_val_generators(TRAINING_DIR, VALIDATION_DIR):\n",
" ### START CODE HERE\n",
"\n",
" # Instantiate the ImageDataGenerator class (don't forget to set the arguments to augment the images)\n",
" \n",
" train_datagen = ImageDataGenerator(rescale=1.0/255.,\n",
" rotation_range=40,\n",
" width_shift_range=0.2,\n",
" height_shift_range=0.2,\n",
" shear_range=0.2,\n",
" zoom_range=0.2,\n",
" horizontal_flip=True,\n",
" fill_mode='nearest')\n",
"\n",
" # Pass in the appropriate arguments to the flow_from_directory method\n",
" train_generator = train_datagen.flow_from_directory(directory=TRAINING_DIR,\n",
" batch_size=128,\n",
" class_mode='binary',\n",
" target_size=(150, 150))\n",
"\n",
" # Instantiate the ImageDataGenerator class (don't forget to set the rescale argument)\n",
" validation_datagen = ImageDataGenerator(rescale=1.0/255.)\n",
"\n",
" # Pass in the appropriate arguments to the flow_from_directory method\n",
" validation_generator = validation_datagen.flow_from_directory(directory=VALIDATION_DIR,\n",
" batch_size=16,\n",
" class_mode='binary',\n",
" target_size=(150, 150))\n",
" ### END CODE HERE\n",
" return train_generator, validation_generator\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qM7FxrjGiobD",
"outputId": "e83f12ce-adbc-4dc5-a979-52b5975f5fa4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Found 22498 images belonging to 2 classes.\n",
"Found 2500 images belonging to 2 classes.\n"
]
}
],
"source": [
"# Test your generators\n",
"train_generator, validation_generator = train_val_generators(TRAINING_DIR, TESTING_DIR)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tiPNmSfZjHwJ"
},
"source": [
"**Expected Output:**\n",
"\n",
"```\n",
"Found 22498 images belonging to 2 classes.\n",
"Found 2500 images belonging to 2 classes.\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TI3oEmyQCZoO"
},
"source": [
"One last step before training is to define the architecture of the model that will be trained.\n",
"\n",
"Complete the `create_model` function below which should return a Keras' `Sequential` model.\n",
"\n",
"Aside from defining the architecture of the model, you should also compile it so make sure to use a `loss` function that is compatible with the `class_mode` you defined in the previous exercise, which should also be compatible with the output of your network. You can tell if they aren't compatible if you get an error during training.\n",
"\n",
"**Note that you should use at least 3 convolution layers to achieve the desired performance.**"
]
},
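{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why a single sigmoid output pairs naturally with a binary cross-entropy loss when the labels are 0 or 1, the formula can be evaluated by hand. This is a plain-Python sketch of the textbook definition, not Keras' implementation; the helper names and the example logit are assumptions chosen for illustration.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def sigmoid(z):\n",
"    # Squash a raw score into a probability in (0, 1)\n",
"    return 1.0 / (1.0 + math.exp(-z))\n",
"\n",
"def binary_crossentropy(y_true, p):\n",
"    # y_true is 0 or 1; p is the predicted probability of class 1\n",
"    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))\n",
"\n",
"p = sigmoid(2.0)  # the network is fairly confident the label is 1\n",
"print(binary_crossentropy(1, p) < binary_crossentropy(0, p))\n",
"```\n",
"\n",
"The loss is small when the predicted probability agrees with the label and large when it does not, which is the behaviour a one-unit sigmoid output with `class_mode='binary'` labels relies on."
]
},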
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "code",
"id": "oDPK8tUB_O9e",
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"# GRADED FUNCTION: create_model\n",
"def create_model():\n",
" # DEFINE A KERAS MODEL TO CLASSIFY CATS V DOGS\n",
" # USE AT LEAST 3 CONVOLUTION LAYERS\n",
"\n",
" ### START CODE HERE\n",
"\n",
" model = tf.keras.models.Sequential([ \n",
" tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3)),\n",
" tf.keras.layers.MaxPooling2D(2, 2),\n",
" tf.keras.layers.Conv2D(32, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Flatten(),\n",
" # 512 neuron hidden layer\n",
" tf.keras.layers.Dense(512, activation='relu'),\n",
" # Only 1 output neuron. It will contain a value from 0-1 where 0 for 1 class ('horses') and 1 for the other ('humans')\n",
" tf.keras.layers.Dense(1, activation='sigmoid')\n",
" ])\n",
"\n",
" \n",
" model.compile(loss='binary_crossentropy',\n",
" optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),\n",
" metrics=['accuracy']) \n",
" \n",
" ### END CODE HERE\n",
"\n",
" return model\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SMFNJZmTCZv6"
},
"source": [
"Now it is time to train your model!\n",
"\n",
"Note: You can ignore the `UserWarning: Possibly corrupt EXIF data.` warnings."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5qE1G6JB4fMn",
"outputId": "70c05564-85a4-48ed-d24e-559eb96cc29f",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Epoch 1/15\n",
" 41/176 [=====>........................] - ETA: 2:25 - loss: 0.6949 - accuracy: 0.5171"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 32 bytes but only got 0. Skipping tag 270\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 5 bytes but only got 0. Skipping tag 271\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes but only got 0. Skipping tag 272\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes but only got 0. Skipping tag 282\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes but only got 0. Skipping tag 283\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 20 bytes but only got 0. Skipping tag 306\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 48 bytes but only got 0. Skipping tag 532\n",
" \" Skipping tag %s\" % (size, len(data), tag)\n",
"/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py:788: UserWarning: Corrupt EXIF data. Expecting to read 2 bytes but only got 0. \n",
" warnings.warn(str(msg))\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"176/176 [==============================] - 202s 1s/step - loss: 0.6812 - accuracy: 0.5660 - val_loss: 0.6442 - val_accuracy: 0.6196\n",
"Epoch 2/15\n",
"176/176 [==============================] - 200s 1s/step - loss: 0.6515 - accuracy: 0.6202 - val_loss: 0.6339 - val_accuracy: 0.6108\n",
"Epoch 3/15\n",
"176/176 [==============================] - 200s 1s/step - loss: 0.6235 - accuracy: 0.6516 - val_loss: 0.5921 - val_accuracy: 0.6892\n",
"Epoch 4/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.6007 - accuracy: 0.6729 - val_loss: 0.5678 - val_accuracy: 0.7076\n",
"Epoch 5/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.5808 - accuracy: 0.6959 - val_loss: 0.5291 - val_accuracy: 0.7304\n",
"Epoch 6/15\n",
"176/176 [==============================] - 200s 1s/step - loss: 0.5627 - accuracy: 0.7114 - val_loss: 0.5018 - val_accuracy: 0.7624\n",
"Epoch 7/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.5448 - accuracy: 0.7242 - val_loss: 0.4670 - val_accuracy: 0.7784\n",
"Epoch 8/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.5298 - accuracy: 0.7358 - val_loss: 0.4790 - val_accuracy: 0.7660\n",
"Epoch 9/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.5205 - accuracy: 0.7397 - val_loss: 0.4406 - val_accuracy: 0.7896\n",
"Epoch 10/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.5016 - accuracy: 0.7544 - val_loss: 0.4138 - val_accuracy: 0.8024\n",
"Epoch 11/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.4826 - accuracy: 0.7627 - val_loss: 0.3815 - val_accuracy: 0.8300\n",
"Epoch 12/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.4647 - accuracy: 0.7775 - val_loss: 0.4196 - val_accuracy: 0.8072\n",
"Epoch 13/15\n",
"176/176 [==============================] - 200s 1s/step - loss: 0.4542 - accuracy: 0.7849 - val_loss: 0.3984 - val_accuracy: 0.8316\n",
"Epoch 14/15\n",
"176/176 [==============================] - 200s 1s/step - loss: 0.4277 - accuracy: 0.7974 - val_loss: 0.3594 - val_accuracy: 0.8340\n",
"Epoch 15/15\n",
"176/176 [==============================] - 201s 1s/step - loss: 0.4195 - accuracy: 0.8057 - val_loss: 0.3319 - val_accuracy: 0.8536\n"
]
}
],
"source": [
"# Get the untrained model\n",
"model = create_model()\n",
"\n",
"# Train the model\n",
"# Note that this may take some time.\n",
"history = model.fit(train_generator,\n",
" epochs=15,\n",
" verbose=1,\n",
" validation_data=validation_generator)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VGsaDMc-GMd4"
},
"source": [
"Once training has finished, you can run the following cell to check the training and validation accuracy achieved at the end of each epoch.\n",
"\n",
"**To pass this assignment, your model should achieve a training and validation accuracy of at least 80% and the final testing accuracy should be either higher than the training one or have a 5% difference at maximum**. If your model didn't achieve these thresholds, try training again with a different model architecture, remember to use at least 3 convolutional layers or try tweaking the image augmentation process.\n",
"\n",
"You might wonder why the training threshold to pass this assignment is significantly lower compared to last week's assignment. Image augmentation does help with overfitting but usually this comes at the expense of requiring more training time. To keep the training time reasonable, the same number of epochs as in the previous assignment are kept. \n",
"\n",
"However, as an optional exercise you are encouraged to try training for more epochs and to achieve really good training and validation accuracies."
]
},
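{
"cell_type": "markdown",
"metadata": {},
"source": [
"The passing criteria above can be written down as a small check. This is a sketch of one reading of the rule, not the grader's code: `meets_thresholds` is a hypothetical helper, and it assumes the final testing accuracy refers to the last-epoch validation accuracy.\n",
"\n",
"```python\n",
"def meets_thresholds(train_acc, val_acc, min_acc=0.80, max_gap=0.05):\n",
"    # Both accuracies must reach the minimum\n",
"    if train_acc < min_acc or val_acc < min_acc:\n",
"        return False\n",
"    # Validation must be higher than training, or trail it by at most max_gap\n",
"    return val_acc >= train_acc or (train_acc - val_acc) <= max_gap\n",
"\n",
"# Final-epoch values taken from the training output above\n",
"print(meets_thresholds(0.8057, 0.8536))\n",
"```\n",
"\n",
"With real results, the two arguments would come from `history.history['accuracy'][-1]` and `history.history['val_accuracy'][-1]`."
]
},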
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "MWZrJN4-65RC",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 546
},
"outputId": "bb1cb450-3b9d-4072-8480-c6333f9ad9e3"
},
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"\n"
]
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
],
"source": [
"#-----------------------------------------------------------\n",
"# Retrieve a list of list results on training and test data\n",
"# sets for each training epoch\n",
"#-----------------------------------------------------------\n",
"acc=history.history['accuracy']\n",
"val_acc=history.history['val_accuracy']\n",
"loss=history.history['loss']\n",
"val_loss=history.history['val_loss']\n",
"\n",
"epochs=range(len(acc)) # Get number of epochs\n",
"\n",
"#------------------------------------------------\n",
"# Plot training and validation accuracy per epoch\n",
"#------------------------------------------------\n",
"plt.plot(epochs, acc, 'r', \"Training Accuracy\")\n",
"plt.plot(epochs, val_acc, 'b', \"Validation Accuracy\")\n",
"plt.title('Training and validation accuracy')\n",
"plt.show()\n",
"print(\"\")\n",
"\n",
"#------------------------------------------------\n",
"# Plot training and validation loss per epoch\n",
"#------------------------------------------------\n",
"plt.plot(epochs, loss, 'r', \"Training Loss\")\n",
"plt.plot(epochs, val_loss, 'b', \"Validation Loss\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NYIaqsN2pav6"
},
"source": [
"You will probably encounter that the model is overfitting, which means that it is doing a great job at classifying the images in the training set but struggles with new data. This is perfectly fine and you will learn how to mitigate this issue in the upcomming week.\n",
"\n",
"Before closing the assignment, be sure to also download the `history.pkl` file which contains the information of the training history of your model. You can download this file by running the cell below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yWcrc9nZTsHj",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 17
},
"outputId": "554362a0-0886-40f0-e548-a234659c089d"
},
"outputs": [
{
"output_type": "display_data",
"data": {
"application/javascript": [
"\n",
" async function download(id, filename, size) {\n",
" if (!google.colab.kernel.accessAllowed) {\n",
" return;\n",
" }\n",
" const div = document.createElement('div');\n",
" const label = document.createElement('label');\n",
" label.textContent = `Downloading \"${filename}\": `;\n",
" div.appendChild(label);\n",
" const progress = document.createElement('progress');\n",
" progress.max = size;\n",
" div.appendChild(progress);\n",
" document.body.appendChild(div);\n",
"\n",
" const buffers = [];\n",
" let downloaded = 0;\n",
"\n",
" const channel = await google.colab.kernel.comms.open(id);\n",
" // Send a message to notify the kernel that we're ready.\n",
" channel.send({})\n",
"\n",
" for await (const message of channel.messages) {\n",
" // Send a message to notify the kernel that we're ready.\n",
" channel.send({})\n",
" if (message.buffers) {\n",
" for (const buffer of message.buffers) {\n",
" buffers.push(buffer);\n",
" downloaded += buffer.byteLength;\n",
" progress.value = downloaded;\n",
" }\n",
" }\n",
" }\n",
" const blob = new Blob(buffers, {type: 'application/binary'});\n",
" const a = document.createElement('a');\n",
" a.href = window.URL.createObjectURL(blob);\n",
" a.download = filename;\n",
" div.appendChild(a);\n",
" a.click();\n",
" div.remove();\n",
" }\n",
" "
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"application/javascript": [
"download(\"download_18f50ab6-831e-41ab-b7c4-9175ac5e4e59\", \"history_augmented.pkl\", 628)"
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {}
}
],
"source": [
"def download_history():\n",
" import pickle\n",
" from google.colab import files\n",
"\n",
" with open('history_augmented.pkl', 'wb') as f:\n",
" pickle.dump(history.history, f)\n",
"\n",
" files.download('history_augmented.pkl')\n",
"\n",
"download_history()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yEj7UVe0OgMq"
},
"source": [
"You will also need to submit this notebook for grading. To download it, click on the `File` tab in the upper left corner of the screen then click on `Download` -> `Download .ipynb`. You can name it anything you want as long as it is a valid `.ipynb` (jupyter notebook) file."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "joAaZSWWpbOI"
},
"source": [
"**Congratulations on finishing this week's assignment!**\n",
"\n",
"You have successfully implemented a convolutional neural network that classifies images of cats and dogs, along with the helper functions needed to pre-process the images!\n",
"\n",
"**Keep it up!**"
]
}
],
"metadata": {
"accelerator": "GPU",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
},
"colab": {
"name": "C2W2_Assignment.ipynb",
"provenance": [],
"include_colab_link": true
}
},
"nbformat": 4,
"nbformat_minor": 0
}