{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/johnleung8888/f37fc825a3374a93351105fe624e75ed/c2_w2_lab_1_cats_v_dogs_augmentation.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wHS48OwClVIL"
},
"source": [
"<a href=\"https://colab.research.google.com/github/https-deeplearning-ai/tensorflow-1-public/blob/master/C2/W2/ungraded_labs/C2_W2_Lab_1_cats_v_dogs_augmentation.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gGxCD4mGHHjG"
},
"source": [
"# Ungraded Lab: Data Augmentation\n",
"\n",
"In the previous lessons, you saw that having a high training accuracy does not automatically mean having a good predictive model. It can still perform poorly on new data because it has overfit to the training set. In this lab, you will see how to avoid that using _data augmentation_. This increases the amount of training data by modifying the existing training data's properties. For example, in image data, you can apply different preprocessing techniques such as rotate, flip, shear, or zoom on your existing images so you can simulate other data that the model should also learn from. This way, the model would see more variety in the images during training so it will infer better on new, previously unseen data.\n",
"\n",
"Let's see how you can do this in the following sections."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kJJqX4DxcQs8"
},
"source": [
"## Baseline Performance\n",
"\n",
"You will start with a model that's very effective at learning `Cats vs Dogs` without data augmentation. It's similar to the previous models that you have used. Note that there are four convolutional layers with 32, 64, 128 and 128 convolutions respectively. The code is basically the same from the previous lab so we won't go over the details step by step since you've already seen it before.\n",
"\n",
"You will train only for 20 epochs to save time but feel free to increase this if you want."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "zJZIF29-dIRv"
},
"outputs": [],
"source": [
"# Download the dataset\n",
"!gdown --id 1RL0T7Rg4XqQNRCkjfnLo4goOJQ7XZro9"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_DyUfCTgdwa8"
},
"outputs": [],
"source": [
"import os\n",
"import zipfile\n",
"\n",
"# Extract the archive\n",
"zip_ref = zipfile.ZipFile(\"./cats_and_dogs_filtered.zip\", 'r')\n",
"zip_ref.extractall(\"tmp/\")\n",
"zip_ref.close()\n",
"\n",
"# Assign training and validation set directories\n",
"base_dir = 'tmp/cats_and_dogs_filtered'\n",
"train_dir = os.path.join(base_dir, 'train')\n",
"validation_dir = os.path.join(base_dir, 'validation')\n",
"\n",
"# Directory with training cat pictures\n",
"train_cats_dir = os.path.join(train_dir, 'cats')\n",
"\n",
"# Directory with training dog pictures\n",
"train_dogs_dir = os.path.join(train_dir, 'dogs')\n",
"\n",
"# Directory with validation cat pictures\n",
"validation_cats_dir = os.path.join(validation_dir, 'cats')\n",
"\n",
"# Directory with validation dog pictures\n",
"validation_dogs_dir = os.path.join(validation_dir, 'dogs')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ub_BdOJIfZ_Q"
},
"source": [
"You will place the model creation inside a function so you can easily initialize a new one when you use data augmentation later in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uWllK_Wad-Mx"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow.keras.optimizers import RMSprop\n",
"\n",
"def create_model():\n",
" '''Creates a CNN with 4 convolutional layers'''\n",
" model = tf.keras.models.Sequential([\n",
" tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),\n",
" tf.keras.layers.MaxPooling2D(2, 2),\n",
" tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Conv2D(128, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Conv2D(128, (3,3), activation='relu'),\n",
" tf.keras.layers.MaxPooling2D(2,2),\n",
" tf.keras.layers.Flatten(),\n",
" tf.keras.layers.Dense(512, activation='relu'),\n",
" tf.keras.layers.Dense(1, activation='sigmoid')\n",
" ])\n",
"\n",
" model.compile(loss='binary_crossentropy',\n",
" optimizer=RMSprop(learning_rate=1e-4),\n",
" metrics=['accuracy'])\n",
" \n",
" return model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "MJPyDEzOqrKB"
},
"outputs": [],
"source": [
"from tensorflow.keras.preprocessing.image import ImageDataGenerator\n",
"\n",
"# All images will be rescaled by 1./255\n",
"train_datagen = ImageDataGenerator(rescale=1./255)\n",
"test_datagen = ImageDataGenerator(rescale=1./255)\n",
"\n",
"# Flow training images in batches of 20 using train_datagen generator\n",
"train_generator = train_datagen.flow_from_directory(\n",
" train_dir, # This is the source directory for training images\n",
" target_size=(150, 150), # All images will be resized to 150x150\n",
" batch_size=20,\n",
" # Since we use binary_crossentropy loss, we need binary labels\n",
" class_mode='binary')\n",
"\n",
"# Flow validation images in batches of 20 using test_datagen generator\n",
"validation_generator = test_datagen.flow_from_directory(\n",
" validation_dir,\n",
" target_size=(150, 150),\n",
" batch_size=20,\n",
" class_mode='binary')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "hdqUoF44esR3"
},
"outputs": [],
"source": [
"# Constant for epochs\n",
"EPOCHS = 20\n",
"\n",
"# Create a new model\n",
"model = create_model()\n",
"\n",
"# Train the model\n",
"history = model.fit(\n",
" train_generator,\n",
" steps_per_epoch=100, # 2000 images = batch_size * steps\n",
" epochs=EPOCHS,\n",
" validation_data=validation_generator,\n",
" validation_steps=50, # 1000 images = batch_size * steps\n",
" verbose=2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Y-G0Am4cguNt"
},
"source": [
"You will then visualize the loss and accuracy with respect to the training and validation set. You will again use a convenience function so it can be reused later. This function accepts a [History](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/History) object which contains the results of the `fit()` method you ran above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GZWPcmKWO303"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"def plot_loss_acc(history):\n",
" '''Plots the training and validation loss and accuracy from a history object'''\n",
" acc = history.history['accuracy']\n",
" val_acc = history.history['val_accuracy']\n",
" loss = history.history['loss']\n",
" val_loss = history.history['val_loss']\n",
"\n",
" epochs = range(len(acc))\n",
"\n",
" plt.plot(epochs, acc, 'bo', label='Training accuracy')\n",
" plt.plot(epochs, val_acc, 'b', label='Validation accuracy')\n",
" plt.title('Training and validation accuracy')\n",
"\n",
" plt.figure()\n",
"\n",
" plt.plot(epochs, loss, 'bo', label='Training Loss')\n",
" plt.plot(epochs, val_loss, 'b', label='Validation Loss')\n",
" plt.title('Training and validation loss')\n",
" plt.legend()\n",
"\n",
" plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Vojz4NYXiT_f"
},
"outputs": [],
"source": [
"# Plot training results\n",
"plot_loss_acc(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zb81GvNov-Tg"
},
"source": [
"From the results above, you'll see the training accuracy is more than 90%, and the validation accuracy is in the 70%-80% range. This is a great example of _overfitting_ -- which in short means that it can do very well with images it has seen before, but not so well with images it hasn't.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5KBz-vFbjLZX"
},
"source": [
"## Data augmentation\n",
"\n",
"One simple method to avoid overfitting is to augment the images a bit. If you think about it, most pictures of a cat are very similar -- the ears are at the top, then the eyes, then the mouth etc. Things like the distance between the eyes and ears will always be quite similar too. \n",
"\n",
"What if you tweak with the images a bit -- rotate the image, squash it, etc. That's what image augementation is all about. And there's an API that makes it easy!\n",
"\n",
"Take a look at the [ImageDataGenerator](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) which you have been using to rescale the image. There are other properties on it that you can use to augment the image. \n",
"\n",
"```\n",
"# Updated to do image augmentation\n",
"train_datagen = ImageDataGenerator(\n",
" rotation_range=40,\n",
" width_shift_range=0.2,\n",
" height_shift_range=0.2,\n",
" shear_range=0.2,\n",
" zoom_range=0.2,\n",
" horizontal_flip=True,\n",
" fill_mode='nearest')\n",
"```\n",
"\n",
"These are just a few of the options available. Let's quickly go over it:\n",
"\n",
"* `rotation_range` is a value in degrees (0–180) within which to randomly rotate pictures.\n",
"* `width_shift` and `height_shift` are ranges (as a fraction of total width or height) within which to randomly translate pictures vertically or horizontally.\n",
"* `shear_range` is for randomly applying shearing transformations.\n",
"* `zoom_range` is for randomly zooming inside pictures.\n",
"* `horizontal_flip` is for randomly flipping half of the images horizontally. This is relevant when there are no assumptions of horizontal assymmetry (e.g. real-world pictures).\n",
"* `fill_mode` is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift.\n",
"\n",
"\n",
"Run the next cells to see the impact on the results. The code is similar to the baseline but the definition of `train_datagen` has been updated to use the parameters described above.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UK7_Fflgv8YC"
},
"outputs": [],
"source": [
"# Create new model\n",
"model_for_aug = create_model()\n",
"\n",
"# This code has changed. Now instead of the ImageGenerator just rescaling\n",
"# the image, we also rotate and do other operations\n",
"train_datagen = ImageDataGenerator(\n",
" rescale=1./255,\n",
" rotation_range=40,\n",
" width_shift_range=0.2,\n",
" height_shift_range=0.2,\n",
" shear_range=0.2,\n",
" zoom_range=0.2,\n",
" horizontal_flip=True,\n",
" fill_mode='nearest')\n",
"\n",
"test_datagen = ImageDataGenerator(rescale=1./255)\n",
"\n",
"# Flow training images in batches of 20 using train_datagen generator\n",
"train_generator = train_datagen.flow_from_directory(\n",
" train_dir, # This is the source directory for training images\n",
" target_size=(150, 150), # All images will be resized to 150x150\n",
" batch_size=20,\n",
" # Since we use binary_crossentropy loss, we need binary labels\n",
" class_mode='binary')\n",
"\n",
"# Flow validation images in batches of 20 using test_datagen generator\n",
"validation_generator = test_datagen.flow_from_directory(\n",
" validation_dir,\n",
" target_size=(150, 150),\n",
" batch_size=20,\n",
" class_mode='binary')\n",
"\n",
"# Train the new model\n",
"history_with_aug = model_for_aug.fit(\n",
" train_generator,\n",
" steps_per_epoch=100, # 2000 images = batch_size * steps\n",
" epochs=EPOCHS,\n",
" validation_data=validation_generator,\n",
" validation_steps=50, # 1000 images = batch_size * steps\n",
" verbose=2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bnyRnwopT5aW"
},
"outputs": [],
"source": [
"# Plot the results of training with data augmentation\n",
"plot_loss_acc(history_with_aug)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1D1hd5fqmJUx"
},
"source": [
"As you can see, the training accuracy has gone down compared to the baseline. This is expected because (as a result of data augmentation) there are more variety in the images so the model will need more runs to learn from them. The good thing is the validation accuracy is no longer stalling and is more in line with the training results. This means that the model is now performing better on unseen data. \n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "z4B9b6GPnKg1"
},
"source": [
"## Wrap Up\n",
"\n",
"This exercise showed a simple trick to avoid overfitting. You can improve your baseline results by simply tweaking the same images you have already. The `ImageDataGenerator` class has built-in parameters to do just that. Try to modify the values some more in the `train_datagen` and see what results you get.\n",
"\n",
"Take note that this will not work for all cases. In the next lesson, Laurence will show a scenario where data augmentation will not help improve your validation accuracy."
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "C2_W2_Lab_1_cats_v_dogs_augmentation.ipynb",
"private_outputs": true,
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 0
}