{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "1: TF2.0 beginner CNN.ipynb",
"provenance": [],
"private_outputs": true,
"collapsed_sections": [],
"toc_visible": true,
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/KMarkert/aea1bad51f23b78aa5fd08cc471396c4/1-tf2-0-beginner-cnn.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MifPkzwdwFoh",
"colab_type": "text"
},
"source": [
"This notebook illustrates a simple Convolutional Neural Network example on a test dataset, MNIST. We will implement one of the first CNN architectures, [LeNet-5](http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf), and apply it to predict images labels of hand drawn digits.\n",
"\n",
"FYI: This architecture is considered *the* standard ‘template’ for stacking convolutions and pooling layers, and ending the network with one or more fully-connected layers."
]
},
{
"cell_type": "code",
"metadata": {
"id": "ccqPJ84tDnS-",
"colab_type": "code",
"colab": {}
},
"source": [
"!pip install tf-explain"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Icb_1rmeume3",
"colab_type": "code",
"colab": {}
},
"source": [
"from __future__ import absolute_import, division, print_function, unicode_literals\n",
"\n",
"import tensorflow as tf\n",
"from tf_explain.core.grad_cam import GradCAM\n",
"from tensorflow import keras\n",
"\n",
"%pylab inline"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "b9qOJJgUvbLW",
"colab_type": "text"
},
"source": [
"Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). Convert the samples from integers to floating-point numbers:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eAiLCj0JvbyW",
"colab_type": "code",
"colab": {}
},
"source": [
"mnist = tf.keras.datasets.mnist\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"x_train, x_test = x_train / 255.0, x_test / 255.0"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "TMJBBa-hvmwU",
"colab_type": "text"
},
"source": [
"If we are to use a CNN then we need to add extra dimensions to the dataset. CNN expect shapes [samples, y, x, channel] for the input images. Here we add an extra channel dimension for the images"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Bn2fXN8cv1gM",
"colab_type": "code",
"colab": {}
},
"source": [
"x_train, x_test = x_train[:,:,:,np.newaxis], x_test[:,:,:,np.newaxis]\n",
"print('Input dimensions (samples,y,x,c): {}'.format(x_train.shape))"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "d3ArzrNm8FiJ",
"colab_type": "text"
},
"source": [
"Let's build the model and train it!!!"
]
},
{
"cell_type": "code",
"metadata": {
"id": "jSKOkI3ZwB5a",
"colab_type": "code",
"colab": {}
},
"source": [
"# LesNet-5 model architecture\n",
"cnn_model = keras.models.Sequential([\n",
" # input data and resize to 32x32x1 image using bilinear interpolation\n",
" keras.layers.Lambda(lambda x: tf.image.resize(x,[32,32]),\n",
" keras.layers.Input(shape=[28, 28, 1]),\n",
" input_shape=[28,28,1],name=\"input\"),\n",
"\n",
" # first convolutional block with pooling\n",
" keras.layers.Conv2D(8, (5,5), padding='same', activation='elu',name=\"block1_conv\"),\n",
" keras.layers.MaxPooling2D((2, 2),name=\"block1_pool\"),\n",
"\n",
" # second convolutional block with pooling\n",
" keras.layers.Conv2D(16, (3,3), padding='same', activation='elu',name=\"block2_conv\"),\n",
" keras.layers.MaxPooling2D((2, 2),name=\"block2_pool\"),\n",
"\n",
" # yet another convolutional block with pooling\n",
" keras.layers.Conv2D(32, (3,3), padding='same', activation='elu',name=\"block3_conv\"),\n",
" keras.layers.MaxPooling2D((2, 2),name=\"block3_pool\"),\n",
"\n",
"\n",
" # fully-connected layers\n",
" keras.layers.GlobalAveragePooling2D(name=\"global_avg_2d\"),\n",
" keras.layers.Dense(10, activation='softmax',name=\"dense_out\")\n",
"])\n",
"\n",
"cnn_model.compile(optimizer='adam',\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=['accuracy'])\n",
"\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "qCmnFdoJ1TXG",
"colab_type": "code",
"colab": {}
},
"source": [
"cnn_model.summary()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "OB8nxmgkwSxv",
"colab_type": "code",
"colab": {}
},
"source": [
"n_epochs = 5\n",
"n_batches = 50\n",
"cnn_model.fit(x_train, y_train, epochs=n_epochs,batch_size=n_batches)\n",
"\n",
"cnn_model.evaluate(x_test, y_test)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "dDdNXKT-3Yso",
"colab_type": "text"
},
"source": [
"**Note:** The original LeNet-5 architecture uses a Tanh activation function, however, the standard activation function with modern architectures is the ReLU activation function. Try changing the activation functions to ReLU and see how the model performs."
]
},
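{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of one way to try this: the same architecture is rebuilt with a configurable `activation` argument so the Tanh, ReLU, and ELU variants can be compared. The helper name `build_lenet` is just illustrative."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# A minimal sketch: rebuild the same architecture with a configurable activation\n",
"# (e.g. 'relu' or 'tanh') so results can be compared against the ELU model above.\n",
"# The helper name build_lenet is just illustrative.\n",
"def build_lenet(activation='relu'):\n",
"    return keras.models.Sequential([\n",
"        keras.layers.Lambda(lambda x: tf.image.resize(x,[32,32]),\n",
"                            input_shape=[28,28,1], name=\"input\"),\n",
"        keras.layers.Conv2D(8, (5,5), padding='same', activation=activation, name=\"block1_conv\"),\n",
"        keras.layers.MaxPooling2D((2, 2), name=\"block1_pool\"),\n",
"        keras.layers.Conv2D(16, (3,3), padding='same', activation=activation, name=\"block2_conv\"),\n",
"        keras.layers.MaxPooling2D((2, 2), name=\"block2_pool\"),\n",
"        keras.layers.Conv2D(32, (3,3), padding='same', activation=activation, name=\"block3_conv\"),\n",
"        keras.layers.MaxPooling2D((2, 2), name=\"block3_pool\"),\n",
"        keras.layers.GlobalAveragePooling2D(name=\"global_avg_2d\"),\n",
"        keras.layers.Dense(10, activation='softmax', name=\"dense_out\")\n",
"    ])\n",
"\n",
"relu_model = build_lenet('relu')\n",
"relu_model.compile(optimizer='adam',\n",
"                   loss='sparse_categorical_crossentropy',\n",
"                   metrics=['accuracy'])\n",
"\n",
"# train and evaluate it the same way as the ELU model to compare:\n",
"# relu_model.fit(x_train, y_train, epochs=n_epochs, batch_size=n_batches)\n",
"# relu_model.evaluate(x_test, y_test)"
],
"execution_count": null,
"outputs": []
},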
{
"cell_type": "markdown",
"metadata": {
"id": "w-I9isOR0_z9",
"colab_type": "text"
},
"source": [
"The CNN classifier is more computationally heavy which makes training take longer but we can see that over 5 epochs we were able to achieve an accuracy >98% with this dataset, a small improvement over the DNN classifier from the previous example."
]
},
{
"cell_type": "code",
"metadata": {
"id": "847h1S_ix2NM",
"colab_type": "code",
"colab": {}
},
"source": [
"# get random index to test\n",
"testIdx = random.randint(0,x_test.shape[0])\n",
"\n",
"x_val = x_test[testIdx:testIdx+1,:,:,:]\n",
"y_val = y_test[testIdx:testIdx+1]\n",
"\n",
"# setup plot and display the image being predicted\n",
"ax = plt.subplot(111)\n",
"ax.imshow(x_val[0,:,:,0],interpolation='nearest',cmap='gray')\n",
"# Hide the axes\n",
"ax.set_axis_off()\n",
"\n",
"explainer = GradCAM()\n",
"\n",
"cam = explainer.explain(model=cnn_model,\n",
" validation_data=(x_val, y_val),\n",
" class_index=0,\n",
")\n",
"\n",
"ax.imshow(cam,cmap=\"jet\",alpha=0.8)\n",
"\n",
"# perform the model inference\n",
"prediction = cnn_model.predict(x_val,callbacks=callbacks)\n",
"\n",
"# print out the prediction result and expected value\n",
"print('Predicted number: {0}\\tProbability: {1:.4f}'.format(np.argmax(prediction),np.max(prediction)))\n",
"print('Actual number: {}'.format(y_test[testIdx]))"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZJYiu-L_1cTu",
"colab_type": "text"
},
"source": [
"Let's crack open the CNN network to view how the CNN is recognizing image features. We do this by creating a new model and output the activation layers instead of the predicted classes."
]
},
{
"cell_type": "code",
"metadata": {
"id": "q1EXOH6dzsOm",
"colab_type": "code",
"colab": {}
},
"source": [
"# Extracts the outputs for the 2d layers\n",
"layer_outputs = [layer.output for layer in cnn_model.layers[1:7]]\n",
"\n",
"# Create a model that will return these outputs, given the model input\n",
"activation_model = keras.models.Model(inputs=cnn_model.input, outputs=layer_outputs) "
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Y7x7M6fU4jBf",
"colab_type": "code",
"colab": {}
},
"source": [
"# setup plot and display the image being predicted\n",
"ax = plt.subplot(111)\n",
"ax.imshow(x_test[testIdx,:,:,0],interpolation='nearest',cmap='gray')\n",
"# Hide the axes\n",
"ax.set_axis_off()\n",
"ax.set_title('Input Image')\n",
"\n",
"# run an image through the model\n",
"activations = activation_model.predict(x_test[testIdx:testIdx+1,:,:,:]) \n",
"# Returns a list of five Numpy arrays: one array per layer activation\n",
"\n",
"layer_names = []\n",
"for layer in cnn_model.layers[1:7]:\n",
" layer_names.append(layer.name) # Names of the layers, so you can have them as part of your plot\n",
" \n",
"fig,ax = plt.subplots(nrows=len(layer_names),figsize=(15,9))\n",
"\n",
"i = 0\n",
"for layer_name, layer_activation in zip(layer_names, activations): # Displays the feature maps\n",
" if i <= 1: images_per_row = 8\n",
" else: images_per_row = 16\n",
" n_features = layer_activation.shape[-1] # Number of features in the feature map\n",
" size = layer_activation.shape[1] #The feature map has shape (1, size, size, n_features).\n",
" n_cols = n_features // images_per_row # Tiles the activation channels in this matrix\n",
" display_grid = np.zeros((size * n_cols, images_per_row * size))\n",
" for col in range(n_cols): # Tiles each filter into a big horizontal grid\n",
" for row in range(images_per_row):\n",
" channel_image = layer_activation[0,\n",
" :, :,\n",
" col * images_per_row + row]\n",
" channel_image -= channel_image.mean() # Post-processes the feature to make it visually palatable\n",
" channel_image /= channel_image.std()\n",
" channel_image *= 64\n",
" channel_image += 128\n",
" channel_image = np.clip(channel_image, 0, 255).astype('uint8')\n",
" display_grid[col * size : (col + 1) * size, # Displays the grid\n",
" row * size : (row + 1) * size] = channel_image\n",
" scale = 1. / size\n",
"\n",
" ax[i].set_title(layer_name)\n",
" ax[i].set_axis_off()\n",
" ax[i].imshow(display_grid, aspect='auto', cmap='viridis')\n",
" i+=1\n",
"\n",
"fig.subplots_adjust(hspace=0.4)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "hRpOJq3U6BaD",
"colab_type": "text"
},
"source": [
"There are more advanced and *deeper* CNN architectures out there. This example was meant to show how to implement a CNN model for image classification. Check out this [article](https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d#c5a6) for more explanations on popular CNN architectures."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zMT3kQB0N9tG",
"colab_type": "text"
},
"source": [
"### Exercises\n",
"\n",
"1. Change the pooling layers from average to max pooling. Compare the differences in the resulting accuracy and activations layers results.\n",
"2. Make the network deeper by adding in an additional Convolutional block and Dense fully-conneted layer."
]
}
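,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of one possible starting point for these exercises (not the only solution): a pooling layer can be swapped in place, and the network can be deepened with a fourth convolutional block plus an extra `Dense` layer before the output. Layer names such as `block4_conv` and the filter counts are just illustrative."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Exercise 1 sketch: a pooling layer can be swapped in place, e.g.\n",
"#   keras.layers.AveragePooling2D((2, 2), name=\"block1_pool\")\n",
"# instead of keras.layers.MaxPooling2D((2, 2), name=\"block1_pool\").\n",
"\n",
"# Exercise 2 sketch: one way to deepen the network is to add a fourth\n",
"# convolutional block and an extra fully-connected layer before the output.\n",
"deeper_model = keras.models.Sequential([\n",
"    keras.layers.Lambda(lambda x: tf.image.resize(x,[32,32]),\n",
"                        input_shape=[28,28,1], name=\"input\"),\n",
"    keras.layers.Conv2D(8, (5,5), padding='same', activation='elu', name=\"block1_conv\"),\n",
"    keras.layers.MaxPooling2D((2, 2), name=\"block1_pool\"),\n",
"    keras.layers.Conv2D(16, (3,3), padding='same', activation='elu', name=\"block2_conv\"),\n",
"    keras.layers.MaxPooling2D((2, 2), name=\"block2_pool\"),\n",
"    keras.layers.Conv2D(32, (3,3), padding='same', activation='elu', name=\"block3_conv\"),\n",
"    keras.layers.MaxPooling2D((2, 2), name=\"block3_pool\"),\n",
"    # additional convolutional block (illustrative filter count)\n",
"    keras.layers.Conv2D(64, (3,3), padding='same', activation='elu', name=\"block4_conv\"),\n",
"    keras.layers.MaxPooling2D((2, 2), name=\"block4_pool\"),\n",
"    keras.layers.GlobalAveragePooling2D(name=\"global_avg_2d\"),\n",
"    # additional fully-connected layer before the output\n",
"    keras.layers.Dense(64, activation='elu', name=\"dense_1\"),\n",
"    keras.layers.Dense(10, activation='softmax', name=\"dense_out\")\n",
"])\n",
"\n",
"deeper_model.compile(optimizer='adam',\n",
"                     loss='sparse_categorical_crossentropy',\n",
"                     metrics=['accuracy'])\n",
"\n",
"# deeper_model.fit(x_train, y_train, epochs=n_epochs, batch_size=n_batches)\n",
"# deeper_model.evaluate(x_test, y_test)"
],
"execution_count": null,
"outputs": []
}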
]
}