{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Linear Regression with Synthetic Data.ipynb",
"private_outputs": true,
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/johnleung8888/63853a20ea6e97bff95a41dfc32ee4d1/linear-regression-with-synthetic-data.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wDlWLbfkJtvu",
"cellView": "form"
},
"source": [
"#@title Copyright 2020 Google LLC. Double-click here for license information.\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "TL5y5fY9Jy_x"
},
"source": [
"# Simple Linear Regression with Synthetic Data\n",
"\n",
"In this first Colab, you'll explore linear regression with a simple database. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DZz29MwGgE2Y"
},
"source": [
"## Learning objectives:\n",
"\n",
"After doing this exercise, you'll know how to do the following:\n",
"\n",
" * Run Colabs.\n",
" * Tune the following [hyperparameters](https://developers.google.com/machine-learning/glossary/#hyperparameter):\n",
" * [learning rate](https://developers.google.com/machine-learning/glossary/#learning_rate)\n",
" * number of [epochs](https://developers.google.com/machine-learning/glossary/#epoch)\n",
" * [batch size](https://developers.google.com/machine-learning/glossary/#batch_size)\n",
" * Interpret different kinds of [loss curves](https://developers.google.com/machine-learning/glossary/#loss_curve)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wm3K3dHlf_1n"
},
"source": [
"## About Colabs\n",
"\n",
"Machine Learning Crash Course uses Colaboratories (**Colabs**) for all programming exercises. Colab is Google's implementation of [Jupyter Notebook](https://jupyter.org/). Like all Jupyter Notebooks, a Colab consists of two kinds of components:\n",
"\n",
" * **Text cells**, which contain explanations. You are currently reading a text cell.\n",
" * **Code cells**, which contain Python code for you to run. Code cells have a light gray background.\n",
"\n",
"You *read* the text cells and *run* the code cells."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qc3e5Jbbf5aA"
},
"source": [
"### Running code cells\n",
"\n",
"You must run code cells in order. In other words, you may only run a code cell once all the code cells preceding it have already been run. \n",
"\n",
"To run a code cell:\n",
"\n",
" 1. Place the cursor anywhere inside the [ ] area at the top left of a code cell. The area inside the [ ] will display an arrow.\n",
" 2. Click the arrow.\n",
"\n",
"Alternatively, you may invoke **Runtime->Run all**. Note, though, that some of the code cells will fail because not all the coding is complete. (You'll complete the coding as part of the exercise.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sa35idJHf0__"
},
"source": [
"### Understanding hidden code cells\n",
"\n",
"We've **hidden** the code in code cells that don't advance the learning objectives. For example, we've hidden the code that plots graphs. However, **you must still run code cells containing hidden code**. You'll know that the code is hidden because you'll see a title (for example, \"Load the functions that build and train a model\") without seeing the code.\n",
"\n",
"To view the hidden code, just double click the header."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8H508dWUfvY1"
},
"source": [
"### Why did you see an error?\n",
"\n",
"If a code cell returns an error when you run it, consider two common problems:\n",
"\n",
" * You didn't run *all* of the code cells preceding the current code cell.\n",
" * If the code cell is labeled as a **Task**, then you haven't written the necessary code. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bMr7MPVmoiHf"
},
"source": [
"## Use the right version of TensorFlow\n",
"\n",
"The following hidden code cell ensures that the Colab will run on TensorFlow 2.X, which is the most recent version of TensorFlow:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Z1pOWL7eevO8"
},
"source": [
"#@title Run this Colab on TensorFlow 2.x\n",
"%tensorflow_version 2.x"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xchnxAsaKKqO"
},
"source": [
"## Import relevant modules\n",
"\n",
"The following cell imports the packages that the program requires:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "9n9_cTveKmse"
},
"source": [
"import pandas as pd\n",
"import tensorflow as tf\n",
"from matplotlib import pyplot as plt"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "SIpsyJITPcbG"
},
"source": [
"## Define functions that build and train a model\n",
"\n",
"The following code defines two functions:\n",
"\n",
" * `build_model(my_learning_rate)`, which builds an empty model.\n",
" * `train_model(model, feature, label, epochs)`, which trains the model from the examples (feature and label) you pass. \n",
"\n",
"Since you don't need to understand model building code right now, we've hidden this code cell. You may optionally double-click the headline to explore this code."
]
},
{
"cell_type": "code",
"metadata": {
"id": "xvO_beKVP1Ke",
"cellView": "form"
},
"source": [
"#@title Define the functions that build and train a model\n",
"def build_model(my_learning_rate):\n",
" \"\"\"Create and compile a simple linear regression model.\"\"\"\n",
" # Most simple tf.keras models are sequential. \n",
" # A sequential model contains one or more layers.\n",
" model = tf.keras.models.Sequential()\n",
"\n",
" # Describe the topography of the model.\n",
" # The topography of a simple linear regression model\n",
" # is a single node in a single layer. \n",
" model.add(tf.keras.layers.Dense(units=1, \n",
" input_shape=(1,)))\n",
"\n",
" # Compile the model topography into code that \n",
" # TensorFlow can efficiently execute. Configure \n",
" # training to minimize the model's mean squared error. \n",
" model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=my_learning_rate),\n",
" loss=\"mean_squared_error\",\n",
" metrics=[tf.keras.metrics.RootMeanSquaredError()])\n",
"\n",
" return model \n",
"\n",
"\n",
"def train_model(model, feature, label, epochs, batch_size):\n",
" \"\"\"Train the model by feeding it data.\"\"\"\n",
"\n",
" # Feed the feature values and the label values to the \n",
" # model. The model will train for the specified number \n",
" # of epochs, gradually learning how the feature values\n",
" # relate to the label values. \n",
" history = model.fit(x=feature,\n",
" y=label,\n",
" batch_size=batch_size,\n",
" epochs=epochs)\n",
"\n",
" # Gather the trained model's weight and bias.\n",
" trained_weight = model.get_weights()[0]\n",
" trained_bias = model.get_weights()[1]\n",
"\n",
" # The list of epochs is stored separately from the \n",
" # rest of history.\n",
" epochs = history.epoch\n",
" \n",
" # Gather the history (a snapshot) of each epoch.\n",
" hist = pd.DataFrame(history.history)\n",
"\n",
" # Specifically gather the model's root mean \n",
" #squared error at each epoch. \n",
" rmse = hist[\"root_mean_squared_error\"]\n",
"\n",
" return trained_weight, trained_bias, epochs, rmse\n",
"\n",
"print(\"Defined create_model and train_model\")"
],
"execution_count": null,
"outputs": []
},
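{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference (an editorial note, not part of the original exercise): the mean squared error over $N$ examples with labels $y_i$ and predictions $\\hat{y}_i$ is\n",
"\n",
"$$\\text{MSE} = \\frac{1}{N} \\sum_{i=1}^{N} (y_i - \\hat{y}_i)^2,$$\n",
"\n",
"and the root mean squared error that `train_model` reports at each epoch is simply $\\text{RMSE} = \\sqrt{\\text{MSE}}$."
]
},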
{
"cell_type": "markdown",
"metadata": {
"id": "Ak_TMAzGOIFq"
},
"source": [
"## Define plotting functions\n",
"\n",
"We're using a popular Python library called [Matplotlib](https://developers.google.com/machine-learning/glossary/#matplotlib) to create the following two plots:\n",
"\n",
"* a plot of the feature values vs. the label values, and a line showing the output of the trained model.\n",
"* a [loss curve](https://developers.google.com/machine-learning/glossary/#loss_curve).\n",
"\n",
"We hid the following code cell because learning Matplotlib is not relevant to the learning objectives. Regardless, you must still run all hidden code cells."
]
},
{
"cell_type": "code",
"metadata": {
"id": "QF0BFRXTOeR3"
},
"source": [
"#@title Define the plotting functions\n",
"def plot_the_model(trained_weight, trained_bias, feature, label):\n",
" \"\"\"Plot the trained model against the training feature and label.\"\"\"\n",
"\n",
" # Label the axes.\n",
" plt.xlabel(\"feature\")\n",
" plt.ylabel(\"label\")\n",
"\n",
" # Plot the feature values vs. label values.\n",
" plt.scatter(feature, label)\n",
"\n",
" # Create a red line representing the model. The red line starts\n",
" # at coordinates (x0, y0) and ends at coordinates (x1, y1).\n",
" x0 = 0\n",
" y0 = trained_bias\n",
" x1 = my_feature[-1]\n",
" y1 = trained_bias + (trained_weight * x1)\n",
" plt.plot([x0, x1], [y0, y1], c='r')\n",
"\n",
" # Render the scatter plot and the red line.\n",
" plt.show()\n",
"\n",
"def plot_the_loss_curve(epochs, rmse):\n",
" \"\"\"Plot the loss curve, which shows loss vs. epoch.\"\"\"\n",
"\n",
" plt.figure()\n",
" plt.xlabel(\"Epoch\")\n",
" plt.ylabel(\"Root Mean Squared Error\")\n",
"\n",
" plt.plot(epochs, rmse, label=\"Loss\")\n",
" plt.legend()\n",
" plt.ylim([rmse.min()*0.97, rmse.max()])\n",
" plt.show()\n",
"\n",
"print(\"Defined the plot_the_model and plot_the_loss_curve functions.\")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "LVSDPusELEZ5"
},
"source": [
"## Define the dataset\n",
"\n",
"The dataset consists of 12 [examples](https://developers.google.com/machine-learning/glossary/#example). Each example consists of one [feature](https://developers.google.com/machine-learning/glossary/#feature) and one [label](https://developers.google.com/machine-learning/glossary/#label).\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "rnUSYKw4LUuh"
},
"source": [
"my_feature = ([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0])\n",
"my_label = ([5.0, 8.8, 9.6, 14.2, 18.8, 19.5, 21.4, 26.8, 28.9, 32.0, 33.8, 38.2])"
],
"execution_count": null,
"outputs": []
},
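{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check (this cell is an editorial addition, not part of the original exercise), you can fit a straight line to the data directly with NumPy's least-squares `polyfit`. The result suggests roughly what weight (slope) and bias (intercept) a well-trained model should learn:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# Least-squares fit of a degree-1 polynomial (a straight line).\n",
"# polyfit returns coefficients from highest degree to lowest,\n",
"# so the slope comes first and the intercept second.\n",
"slope, intercept = np.polyfit(my_feature, my_label, deg=1)\n",
"print(\"Approximate weight (slope): %.2f\" % slope)\n",
"print(\"Approximate bias (intercept): %.2f\" % intercept)"
],
"execution_count": null,
"outputs": []
},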
{
"cell_type": "markdown",
"metadata": {
"id": "K24afla-4s2x"
},
"source": [
"## Specify the hyperparameters\n",
"\n",
"The hyperparameters in this Colab are as follows:\n",
"\n",
" * [learning rate](https://developers.google.com/machine-learning/glossary/#learning_rate)\n",
" * [epochs](https://developers.google.com/machine-learning/glossary/#epoch)\n",
" * [batch_size](https://developers.google.com/machine-learning/glossary/#batch_size)\n",
"\n",
"The following code cell initializes these hyperparameters and then invokes the functions that build and train the model."
]
},
{
"cell_type": "code",
"metadata": {
"id": "Ye730h13CQ97"
},
"source": [
"learning_rate=0.01\n",
"epochs=10\n",
"my_batch_size=12\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QwSm60H6pQjJ"
},
"source": [
"## Task 1: Examine the graphs\n",
"\n",
"Examine the top graph. The blue dots identify the actual data; the red line identifies the output of the trained model. Ideally, the red line should align nicely with the blue dots. Does it? Probably not.\n",
"\n",
"A certain amount of randomness plays into training a model, so you'll get somewhat different results every time you train. That said, unless you are an extremely lucky person, the red line probably *doesn't* align nicely with the blue dots. \n",
"\n",
"Examine the bottom graph, which shows the loss curve. Notice that the loss curve decreases but doesn't flatten out, which is a sign that the model hasn't trained sufficiently."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lLXPvqCRvgI4"
},
"source": [
"## Task 2: Increase the number of epochs\n",
"\n",
"Training loss should steadily decrease, steeply at first, and then more slowly. Eventually, training loss should eventually stay steady (zero slope or nearly zero slope), which indicates that training has [converged](http://developers.google.com/machine-learning/glossary/#convergence).\n",
"\n",
"In Task 1, the training loss did not converge. One possible solution is to train for more epochs. Your task is to increase the number of epochs sufficiently to get the model to converge. However, it is inefficient to train past convergence, so don't just set the number of epochs to an arbitrarily high value.\n",
"\n",
"Examine the loss curve. Does the model converge?"
]
},
{
"cell_type": "code",
"metadata": {
"id": "uXuJH3h6t5qs"
},
"source": [
"learning_rate=0.01\n",
"epochs= 50 # Replace ? with an integer.\n",
"my_batch_size=12\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "1tWrzP4Ww7sD"
},
"source": [
"#@title Double-click to view a possible solution\n",
"learning_rate=0.01\n",
"epochs=450\n",
"my_batch_size=12 \n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)\n",
"\n",
"# The loss curve suggests that the model does converge."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "0KmzfFB5zwvd"
},
"source": [
"## Task 3: Increase the learning rate\n",
"\n",
"In Task 2, you increased the number of epochs to get the model to converge. Sometimes, you can get the model to converge more quickly by increasing the learning rate. However, setting the learning rate too high often makes it impossible for a model to converge. In Task 3, we've intentionally set the learning rate too high. Run the following code cell and see what happens."
]
},
{
"cell_type": "code",
"metadata": {
"id": "eD1hTmdd0uCo"
},
"source": [
"# Increase the learning rate and decrease the number of epochs.\n",
"learning_rate=100 \n",
"epochs=500 \n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "c96ITm021NEV"
},
"source": [
"The resulting model is terrible; the red line doesn't align with the blue dots. Furthermore, the loss curve oscillates like a [roller coaster](https://www.wikipedia.org/wiki/Roller_coaster). An oscillating loss curve strongly suggests that the learning rate is too high. "
]
},
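{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why a too-high learning rate causes oscillation, consider a simplified sketch (plain gradient descent on a one-dimensional quadratic loss, rather than the RMSprop optimizer this Colab actually uses). For $L(w) = (w - w^*)^2$, the update rule is\n",
"\n",
"$$w_{t+1} = w_t - \\eta \\frac{dL}{dw} = w_t - 2\\eta (w_t - w^*),$$\n",
"\n",
"so the distance from the minimum scales by $(1 - 2\\eta)$ each step. When $\\eta > 0.5$, that factor is negative, so the weight overshoots $w^*$ and flips from one side of it to the other; when $\\eta > 1$, the error also grows in magnitude and the loss diverges. The exact thresholds depend on the loss and the optimizer, but the pattern is general: too large a step overshoots the minimum on every update."
]
},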
{
"cell_type": "markdown",
"metadata": {
"id": "r63YkMx82WVr"
},
"source": [
"## Task 4: Find the ideal combination of epochs and learning rate\n",
"\n",
"Assign values to the following two hyperparameters to make training converge as efficiently as possible: \n",
"\n",
"* learning_rate\n",
"* epochs"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ZYC8eR5x5n4m"
},
"source": [
"# Set the learning rate and number of epochs\n",
"learning_rate= 1 # Replace ? with a floating-point number\n",
"epochs= 10 # Replace ? with an integer\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "_GMGgR6O54IN"
},
"source": [
"#@title Double-click to view a possible solution\n",
"\n",
"learning_rate=0.14\n",
"epochs=70\n",
"my_batch_size=12\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "0NDET9e6AAbA"
},
"source": [
"## Task 5: Adjust the batch size\n",
"\n",
"The system recalculates the model's loss value and adjusts the model's weights and bias after each **iteration**. Each iteration is the span in which the system processes one batch. For example, if the **batch size** is 6, then the system recalculates the model's loss value and adjusts the model's weights and bias after processing every 6 examples. \n",
"\n",
"One **epoch** spans sufficient iterations to process every example in the dataset. For example, if the batch size is 12, then each epoch lasts one iteration. However, if the batch size is 6, then each epoch consumes two iterations. \n",
"\n",
"It is tempting to simply set the batch size to the number of examples in the dataset (12, in this case). However, the model might actually train faster on smaller batches. Conversely, very small batches might not contain enough information to help the model converge. \n",
"\n",
"Experiment with `batch_size` in the following code cell. What's the smallest integer you can set for `batch_size` and still have the model converge in a hundred epochs?"
]
},
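{
"cell_type": "markdown",
"metadata": {},
"source": [
"In other words, each epoch takes `ceil(number of examples / batch size)` iterations. The following small cell (an editorial illustration, not one of the original tasks) prints the count for a few batch sizes:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import math\n",
"\n",
"num_examples = len(my_feature)  # 12 examples in this dataset\n",
"for candidate_batch_size in [12, 6, 4, 1]:\n",
"  # One epoch must process every example, so it spans\n",
"  # ceil(num_examples / batch_size) iterations.\n",
"  iterations = math.ceil(num_examples / candidate_batch_size)\n",
"  print(\"batch_size=%2d -> %2d iterations per epoch\" %\n",
"        (candidate_batch_size, iterations))"
],
"execution_count": null,
"outputs": []
},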
{
"cell_type": "code",
"metadata": {
"id": "_vGx0lOodQrT"
},
"source": [
"learning_rate=0.05\n",
"epochs=100\n",
"my_batch_size= 1 # Replace ? with an integer.\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "-mtVpoBrANAm"
},
"source": [
"#@title Double-click to view a possible solution\n",
"\n",
"learning_rate=0.05\n",
"epochs=125\n",
"my_batch_size=1 # Wow, a batch size of 1 works!\n",
"\n",
"my_model = build_model(learning_rate)\n",
"trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, \n",
" my_label, epochs,\n",
" my_batch_size)\n",
"plot_the_model(trained_weight, trained_bias, my_feature, my_label)\n",
"plot_the_loss_curve(epochs, rmse)\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "aS3q7TIF9SFL"
},
"source": [
"## Summary of hyperparameter tuning\n",
"\n",
"Most machine learning problems require a lot of hyperparameter tuning. Unfortunately, we can't provide concrete tuning rules for every model. Lowering the learning rate can help one model converge efficiently but make another model converge much too slowly. You must experiment to find the best set of hyperparameters for your dataset. That said, here are a few rules of thumb:\n",
"\n",
" * Training loss should steadily decrease, steeply at first, and then more slowly until the slope of the curve reaches or approaches zero. \n",
" * If the training loss does not converge, train for more epochs.\n",
" * If the training loss decreases too slowly, increase the learning rate. Note that setting the learning rate too high may also prevent training loss from converging.\n",
" * If the training loss varies wildly (that is, the training loss jumps around), decrease the learning rate.\n",
" * Lowering the learning rate while increasing the number of epochs or the batch size is often a good combination.\n",
" * Setting the batch size to a *very* small batch number can also cause instability. First, try large batch size values. Then, decrease the batch size until you see degradation.\n",
" * For real-world datasets consisting of a very large number of examples, the entire dataset might not fit into memory. In such cases, you'll need to reduce the batch size to enable a batch to fit into memory. \n",
"\n",
"Remember: the ideal combination of hyperparameters is data dependent, so you must always experiment and verify."
]
}
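,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because tuning is empirical, a small sweep over candidate values is often a practical starting point. The following optional cell (an editorial sketch, not part of the original exercise) reuses the `build_model` and `train_model` functions defined above to compare the final loss for a few learning rates:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# A minimal learning-rate sweep. Each candidate trains a fresh model\n",
"# for the same number of epochs; a lower final RMSE is better.\n",
"results = {}\n",
"for candidate_rate in [0.01, 0.05, 0.14, 0.5]:\n",
"  swept_model = build_model(candidate_rate)\n",
"  _, _, _, swept_rmse = train_model(swept_model, my_feature,\n",
"                                    my_label, epochs=70,\n",
"                                    batch_size=12)\n",
"  results[candidate_rate] = swept_rmse.iloc[-1]\n",
"\n",
"for rate, final_rmse in results.items():\n",
"  print(\"learning_rate=%.2f -> final RMSE=%.3f\" % (rate, final_rmse))"
],
"execution_count": null,
"outputs": []
}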
]
}