TensorFlow and Google Colab Demo for INF4015W
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "TensorFlow and Google Colab Demo for INF4015W",
"version": "0.3.2",
"provenance": [],
"private_outputs": true,
"collapsed_sections": [],
"toc_visible": true,
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/jurgenizer/6db3d177c43506f549a5b0d4e96f72e8/neural-style-transfer-with-eager-execution.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "jo5PziEC4hWs", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"# TensorFlow and Google Colab Demo for INF4015W" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "aDyGj8DmXCJI", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Overview\n", | |
"\n", | |
"This **deep learning** demonstration is based on the Google Seed \"Neural Style Transfer with tf.keras\" by TensorFlow authors, available [here](https://research.google.com/seedbank/seed/neural_style_transfer_with_tfkeras)\n", | |
"\n", | |
"**Deep learning** can be used to compose images in the style of another image (ever wish you could paint like Picasso or Van Gogh?). This is known as **neural style transfer**! This is a technique outlined in [Leon A. Gatys' paper, A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576).\n", | |
"\n", | |
"\n", | |
"\n", | |
"For example, let’s take an image of this turtle and Katsushika Hokusai's *The Great Wave off Kanagawa*:\n", | |
"\n", | |
"<img src=\"https://github.com/tensorflow/models/blob/master/research/nst_blogpost/Green_Sea_Turtle_grazing_seagrass.jpg?raw=1\" alt=\"Drawing\" style=\"width: 200px;\"/>\n", | |
"<img src=\"https://github.com/tensorflow/models/blob/master/research/nst_blogpost/The_Great_Wave_off_Kanagawa.jpg?raw=1\" alt=\"Drawing\" style=\"width: 200px;\"/>\n", | |
"\n", | |
"[Image of Green Sea Turtle](https://commons.wikimedia.org/wiki/File:Green_Sea_Turtle_grazing_seagrass.jpg)\n", | |
"-By P.Lindgren [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], from Wikimedia Commons\n", | |
"\n", | |
"\n", | |
"Now how would it look like if Hokusai decided to paint the picture of this Turtle exclusively with this style? Something like this?\n", | |
"\n", | |
"<img src=\"https://github.com/tensorflow/models/blob/master/research/nst_blogpost/wave_turtle.png?raw=1\" alt=\"Drawing\" style=\"width: 500px;\"/>\n", | |
"\n", | |
"Is this magic or just deep learning? Fortunately, this doesn’t involve any witchcraft: style transfer is a fun and interesting technique that showcases the capabilities and internal representations of neural networks. \n", | |
"\n", | |
"Let's try it with some other images.\n", | |
"\n", | |
"\n" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {
"id": "U8ajP_u73s6m",
"colab_type": "text"
},
"source": [
"## Setup\n",
"\n",
"### Download Images"
]
},
{
"cell_type": "code",
"metadata": {
"id": "riWE_b8k3s6o",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"img_dir = '/tmp/nst'\n",
"if not os.path.exists(img_dir):\n",
"  os.makedirs(img_dir)\n",
"!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/9/9c/Aldrin_Apollo_11.jpg\n",
"!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/1/1b/Claude_Monet_-_Woman_with_a_Parasol_-_Madame_Monet_and_Her_Son_-_Google_Art_Project.jpg\n",
"!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/0/00/Tuebingen_Neckarfront.jpg\n",
"!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg/1024px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "eqxUicSPUOP6",
"colab_type": "text"
},
"source": [
"### Import and configure modules"
]
},
{
"cell_type": "code",
"metadata": {
"id": "sc1OLbOWhPCO",
"colab_type": "code",
"colab": {}
},
"source": [
"import matplotlib.pyplot as plt\n",
"import matplotlib as mpl\n",
"mpl.rcParams['figure.figsize'] = (10,10)\n",
"mpl.rcParams['axes.grid'] = False\n",
"\n",
"import numpy as np\n",
"from PIL import Image\n",
"import time\n",
"import functools"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "RYEjlrYk3s6w",
"colab_type": "code",
"colab": {}
},
"source": [
"import tensorflow as tf\n",
"import tensorflow.contrib.eager as tfe\n",
"\n",
"from tensorflow.python.keras.preprocessing import image as kp_image\n",
"from tensorflow.python.keras import models\n",
"from tensorflow.python.keras import losses\n",
"from tensorflow.python.keras import layers\n",
"from tensorflow.python.keras import backend as K"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "L7sjDODq67HQ",
"colab_type": "text"
},
"source": [
"We’ll begin by enabling [eager execution](https://www.tensorflow.org/guide/eager). Eager execution allows us to work through this technique in the clearest and most readable way."
]
},
{
"cell_type": "code",
"metadata": {
"id": "sfjsSAtNrqQx",
"colab_type": "code",
"colab": {}
},
"source": [
"tf.enable_eager_execution()\n",
"print(\"Eager execution: {}\".format(tf.executing_eagerly()))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "IOiGrIV1iERH",
"colab_type": "code",
"colab": {}
},
"source": [
"# Set up some global values here\n",
"content_path = '/tmp/nst/Aldrin_Apollo_11.jpg'\n",
"style_path = '/tmp/nst/Claude_Monet_-_Woman_with_a_Parasol_-_Madame_Monet_and_Her_Son_-_Google_Art_Project.jpg'"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xE4Yt8nArTeR",
"colab_type": "text"
},
"source": [
"## Visualize the input"
]
},
{
"cell_type": "code",
"metadata": {
"id": "3TLljcwv5qZs",
"colab_type": "code",
"colab": {}
},
"source": [
"def load_img(path_to_img):\n",
"  max_dim = 512\n",
"  img = Image.open(path_to_img)\n",
"  long = max(img.size)\n",
"  scale = max_dim/long\n",
"  img = img.resize((round(img.size[0]*scale), round(img.size[1]*scale)), Image.ANTIALIAS)\n",
"\n",
"  img = kp_image.img_to_array(img)\n",
"\n",
"  # We need to broadcast the image array such that it has a batch dimension\n",
"  img = np.expand_dims(img, axis=0)\n",
"  return img"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "vupl0CI18aAG",
"colab_type": "code",
"colab": {}
},
"source": [
"def imshow(img, title=None):\n",
"  # Remove the batch dimension\n",
"  out = np.squeeze(img, axis=0)\n",
"  # Cast to uint8 for display\n",
"  out = out.astype('uint8')\n",
"  if title is not None:\n",
"    plt.title(title)\n",
"  plt.imshow(out)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "2yAlRzJZrWM3",
"colab_type": "text"
},
"source": [
"These are the input content and style images. We hope to \"create\" an image with the content of our content image, but with the style of the style image."
]
},
{
"cell_type": "code",
"metadata": {
"id": "_UWQmeEaiKkP",
"colab_type": "code",
"colab": {}
},
"source": [
"plt.figure(figsize=(10,10))\n",
"\n",
"content = load_img(content_path).astype('uint8')\n",
"style = load_img(style_path).astype('uint8')\n",
"\n",
"plt.subplot(1, 2, 1)\n",
"imshow(content, 'Content Image')\n",
"\n",
"plt.subplot(1, 2, 2)\n",
"imshow(style, 'Style Image')\n",
"plt.show()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "EnfAbmzErMmZ",
"colab_type": "text"
},
"source": [
"[Astronaut Buzz Aldrin walks on the moon](https://upload.wikimedia.org/wikipedia/commons/9/9c/Aldrin_Apollo_11.jpg) by NASA (1969) [Public domain] via Wikimedia Commons\n",
"\n",
"[Woman with a Parasol - Madame Monet and Her Son](https://upload.wikimedia.org/wikipedia/commons/1/1b/Claude_Monet_-_Woman_with_a_Parasol_-_Madame_Monet_and_Her_Son_-_Google_Art_Project.jpg) by Claude Monet (1875) [Public domain] via Wikimedia Commons"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7qMVNvEsK-_D",
"colab_type": "text"
},
"source": [
"## Prepare the data"
]
},
{
"cell_type": "code",
"metadata": {
"id": "hGwmTwJNmv2a",
"colab_type": "code",
"colab": {}
},
"source": [
"def load_and_process_img(path_to_img):\n",
"  img = load_img(path_to_img)\n",
"  img = tf.keras.applications.vgg19.preprocess_input(img)\n",
"  return img"
],
"execution_count": 0,
"outputs": []
},
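{
"cell_type": "markdown",
"metadata": {},
"source": [
"`vgg19.preprocess_input` converts images to BGR channel order and subtracts the ImageNet channel means (approximately [103.939, 116.779, 123.68] in BGR). As an optional, minimal sanity check (not part of the original seed), we can confirm that the preprocessed values are roughly zero-centred rather than in the 0-255 range:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Optional sanity check (added for this demo): after VGG19 preprocessing the\n",
"# pixel values should be roughly zero-centred, since the ImageNet channel\n",
"# means have been subtracted.\n",
"checked = load_and_process_img(content_path)\n",
"print(checked.shape, checked.min(), checked.max())"
],
"execution_count": 0,
"outputs": []
},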
{
"cell_type": "markdown",
"metadata": {
"id": "xCgooqs6tAka",
"colab_type": "text"
},
"source": [
"In order to view the outputs of our optimization, we need to perform the inverse preprocessing step. We also clip the values to keep them within the 0-255 range."
]
},
{
"cell_type": "code",
"metadata": {
"id": "mjzlKRQRs_y2",
"colab_type": "code",
"colab": {}
},
"source": [
"def deprocess_img(processed_img):\n",
"  x = processed_img.copy()\n",
"  if len(x.shape) == 4:\n",
"    x = np.squeeze(x, 0)\n",
"  assert len(x.shape) == 3, (\"Input to deprocess image must be an image of \"\n",
"                             \"dimension [1, height, width, channel] or [height, width, channel]\")\n",
"\n",
"  # Perform the inverse of the preprocessing step: add back the ImageNet\n",
"  # channel means and convert BGR back to RGB\n",
"  x[:, :, 0] += 103.939\n",
"  x[:, :, 1] += 116.779\n",
"  x[:, :, 2] += 123.68\n",
"  x = x[:, :, ::-1]\n",
"\n",
"  x = np.clip(x, 0, 255).astype('uint8')\n",
"  return x"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "GEwZ7FlwrjoZ",
"colab_type": "text"
},
"source": [
"### Define content and style representations\n",
"\n",
"We use intermediate layers of the VGG19 network: a deeper layer captures the content of an image, while the correlations between feature maps at several earlier layers capture its style."
]
},
{
"cell_type": "code",
"metadata": {
"id": "N4-8eUp_Kc-j",
"colab_type": "code",
"colab": {}
},
"source": [
"# Content layer from which we will pull our feature maps\n",
"content_layers = ['block5_conv2']\n",
"\n",
"# Style layers we are interested in\n",
"style_layers = ['block1_conv1',\n",
"                'block2_conv1',\n",
"                'block3_conv1',\n",
"                'block4_conv1',\n",
"                'block5_conv1'\n",
"               ]\n",
"\n",
"num_content_layers = len(content_layers)\n",
"num_style_layers = len(style_layers)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Jt3i3RRrJiOX",
"colab_type": "text"
},
"source": [
"## Build the Model"
]
},
{
"cell_type": "code",
"metadata": {
"id": "nfec6MuMAbPx",
"colab_type": "code",
"colab": {}
},
"source": [
"def get_model():\n",
"  \"\"\"Creates our model with access to intermediate layers.\n",
"\n",
"  This function loads the VGG19 model and accesses its intermediate layers.\n",
"  These layers are then used to create a new model that takes an input image\n",
"  and returns the outputs of these intermediate layers of the VGG model.\n",
"\n",
"  Returns:\n",
"    A keras model that takes image inputs and outputs the style and\n",
"    content intermediate layers.\n",
"  \"\"\"\n",
"  # Load our model. We load pretrained VGG19, trained on ImageNet data\n",
"  vgg = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')\n",
"  vgg.trainable = False\n",
"  # Get output layers corresponding to style and content layers\n",
"  style_outputs = [vgg.get_layer(name).output for name in style_layers]\n",
"  content_outputs = [vgg.get_layer(name).output for name in content_layers]\n",
"  model_outputs = style_outputs + content_outputs\n",
"  # Build model\n",
"  return models.Model(vgg.input, model_outputs)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "kl6eFGa7-OtV",
"colab_type": "text"
},
"source": [
"In the above code snippet, we created a model that will take an input image and output the content and style intermediate layers!"
]
},
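{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check (a sketch added for this demo, not part of the original seed), we can build the model and list the intermediate output tensors it exposes; the five style layers come first, followed by the single content layer:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Optional (added for this demo): build the model and list its output tensors.\n",
"# The first num_style_layers outputs are the style layers; the rest are the\n",
"# content layers, in the same order as the layer-name lists above.\n",
"check_model = get_model()\n",
"for name, output in zip(style_layers + content_layers, check_model.outputs):\n",
"  print(name, output.shape)"
],
"execution_count": 0,
"outputs": []
},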
{
"cell_type": "markdown",
"metadata": {
"id": "vJdYvJTZ4bdS",
"colab_type": "text"
},
"source": [
"## Define and create our loss functions (content and style distances)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6KsbqPA8J9DY",
"colab_type": "text"
},
"source": [
"### Compute content loss"
]
},
{
"cell_type": "code",
"metadata": {
"id": "d2mf7JwRMkCd",
"colab_type": "code",
"colab": {}
},
"source": [
"def get_content_loss(base_content, target):\n",
"  return tf.reduce_mean(tf.square(base_content - target))"
],
"execution_count": 0,
"outputs": []
},
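{
"cell_type": "markdown",
"metadata": {},
"source": [
"In symbols, the code above is just the mean squared error between the feature maps $F^l$ of our generated image and $P^l$ of the content image at a chosen layer $l$:\n",
"\n",
"$$L_{content}^l = \\frac{1}{N_l}\\sum_{i}\\left(F^l_i - P^l_i\\right)^2$$\n",
"\n",
"where $N_l$ is the number of entries in the feature map. This matches `tf.reduce_mean(tf.square(...))`; Gatys et al. use a sum with a factor of one half, which differs only by a constant scale."
]
},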
{
"cell_type": "markdown",
"metadata": {
"id": "F21Hm61yLKk5",
"colab_type": "text"
},
"source": [
"### Computing style loss"
]
},
{
"cell_type": "code",
"metadata": {
"id": "N7MOqwKLLke8",
"colab_type": "code",
"colab": {}
},
"source": [
"def gram_matrix(input_tensor):\n",
"  # Flatten the spatial dimensions, keeping the channel dimension last\n",
"  channels = int(input_tensor.shape[-1])\n",
"  a = tf.reshape(input_tensor, [-1, channels])\n",
"  n = tf.shape(a)[0]\n",
"  gram = tf.matmul(a, a, transpose_a=True)\n",
"  return gram / tf.cast(n, tf.float32)\n",
"\n",
"def get_style_loss(base_style, gram_target):\n",
"  \"\"\"Expects two images of dimension h, w, c\"\"\"\n",
"  # height, width, num filters of each layer\n",
"  # We could scale the loss at a given layer by the size of the feature map and the number of filters\n",
"  height, width, channels = base_style.get_shape().as_list()\n",
"  gram_style = gram_matrix(base_style)\n",
"\n",
"  return tf.reduce_mean(tf.square(gram_style - gram_target))  # / (4. * (channels ** 2) * (width * height) ** 2)"
],
"execution_count": 0,
"outputs": []
},
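{
"cell_type": "markdown",
"metadata": {},
"source": [
"To match what the code computes: `gram_matrix` flattens a feature map of shape $(h, w, c)$ into an $n \\times c$ matrix $F$ (with $n = hw$ spatial positions) and returns $G = F^\\top F / n$, so $G_{cd}$ is the average correlation between channels $c$ and $d$. The style loss at a layer is then the mean squared difference between the Gram matrices of the generated and style images:\n",
"\n",
"$$L_{style}^l = \\frac{1}{c_l^2}\\sum_{c,d}\\left(G^l_{cd} - \\hat{G}^l_{cd}\\right)^2$$\n",
"\n",
"(the Gatys normalization shown in the commented-out factor is omitted here, as in the code)."
]
},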
{
"cell_type": "markdown",
"metadata": {
"id": "pXIUX6czZABh",
"colab_type": "text"
},
"source": [
"## Apply style transfer to our images"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "y9r8Lyjb_m0u",
"colab_type": "text"
},
"source": [
"### Run Gradient Descent"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-kGzV6LTp4CU",
"colab_type": "text"
},
"source": [
"We’ll define a little helper function that loads our content and style images and feeds them forward through the network, which then outputs the content and style feature representations from our model."
]
},
{
"cell_type": "code",
"metadata": {
"id": "O-lj5LxgtmnI",
"colab_type": "code",
"colab": {}
},
"source": [
"def get_feature_representations(model, content_path, style_path):\n",
"  \"\"\"Helper function to compute our content and style feature representations.\n",
"\n",
"  This function will simply load and preprocess both the content and style\n",
"  images from their paths. Then it will feed them through the network to obtain\n",
"  the outputs of the intermediate layers.\n",
"\n",
"  Arguments:\n",
"    model: The model that we are using.\n",
"    content_path: The path to the content image.\n",
"    style_path: The path to the style image.\n",
"\n",
"  Returns:\n",
"    The style features and the content features.\n",
"  \"\"\"\n",
"  # Load our images in\n",
"  content_image = load_and_process_img(content_path)\n",
"  style_image = load_and_process_img(style_path)\n",
"\n",
"  # Batch compute content and style features\n",
"  style_outputs = model(style_image)\n",
"  content_outputs = model(content_image)\n",
"\n",
"  # Get the style and content feature representations from our model\n",
"  style_features = [style_layer[0] for style_layer in style_outputs[:num_style_layers]]\n",
"  content_features = [content_layer[0] for content_layer in content_outputs[num_style_layers:]]\n",
"  return style_features, content_features"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "3DopXw7-lFHa",
"colab_type": "text"
},
"source": [
"### Computing the loss and gradients\n",
"Here we use [**tf.GradientTape**](https://www.tensorflow.org/programmers_guide/eager#computing_gradients) to compute the gradient."
]
},
{
"cell_type": "code",
"metadata": {
"id": "oVDhSo8iJunf",
"colab_type": "code",
"colab": {}
},
"source": [
"def compute_loss(model, loss_weights, init_image, gram_style_features, content_features):\n",
"  \"\"\"This function will compute the total loss.\n",
"\n",
"  Arguments:\n",
"    model: The model that will give us access to the intermediate layers\n",
"    loss_weights: The weights of each contribution of each loss function\n",
"      (style weight and content weight)\n",
"    init_image: Our initial base image. This image is what we are updating with\n",
"      our optimization process. We apply the gradients wrt the loss we are\n",
"      calculating to this image.\n",
"    gram_style_features: Precomputed gram matrices corresponding to the\n",
"      defined style layers of interest.\n",
"    content_features: Precomputed outputs from defined content layers of\n",
"      interest.\n",
"\n",
"  Returns:\n",
"    The total loss, style loss, and content loss.\n",
"  \"\"\"\n",
"  style_weight, content_weight = loss_weights\n",
"\n",
"  # Feed our init image through our model. This will give us the content and\n",
"  # style representations at our desired layers. Since we're using eager\n",
"  # execution, our model is callable just like any other function!\n",
"  model_outputs = model(init_image)\n",
"\n",
"  style_output_features = model_outputs[:num_style_layers]\n",
"  content_output_features = model_outputs[num_style_layers:]\n",
"\n",
"  style_score = 0\n",
"  content_score = 0\n",
"\n",
"  # Accumulate style losses from all layers\n",
"  # Here, we equally weight each contribution of each loss layer\n",
"  weight_per_style_layer = 1.0 / float(num_style_layers)\n",
"  for target_style, comb_style in zip(gram_style_features, style_output_features):\n",
"    style_score += weight_per_style_layer * get_style_loss(comb_style[0], target_style)\n",
"\n",
"  # Accumulate content losses from all layers\n",
"  weight_per_content_layer = 1.0 / float(num_content_layers)\n",
"  for target_content, comb_content in zip(content_features, content_output_features):\n",
"    content_score += weight_per_content_layer * get_content_loss(comb_content[0], target_content)\n",
"\n",
"  style_score *= style_weight\n",
"  content_score *= content_weight\n",
"\n",
"  # Get total loss\n",
"  loss = style_score + content_score\n",
"  return loss, style_score, content_score"
],
"execution_count": 0,
"outputs": []
},
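{
"cell_type": "markdown",
"metadata": {},
"source": [
"Putting it together, with content weight $\\alpha$ and style weight $\\beta$, `compute_loss` returns\n",
"\n",
"$$L_{total} = \\alpha\\, L_{content} + \\beta \\sum_{l} \\frac{1}{5}\\, L_{style}^l$$\n",
"\n",
"where the sum runs over the five style layers, each weighted equally."
]
},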
{
"cell_type": "code",
"metadata": {
"id": "fwzYeOqOUH9_",
"colab_type": "code",
"colab": {}
},
"source": [
"def compute_grads(cfg):\n",
"  with tf.GradientTape() as tape:\n",
"    all_loss = compute_loss(**cfg)\n",
"  # Compute gradients wrt input image\n",
"  total_loss = all_loss[0]\n",
"  return tape.gradient(total_loss, cfg['init_image']), all_loss"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "T9yKu2PLlBIE",
"colab_type": "text"
},
"source": [
"### Optimization loop"
]
},
{
"cell_type": "code",
"metadata": {
"id": "pj_enNo6tACQ",
"colab_type": "code",
"colab": {}
},
"source": [
"import IPython.display\n",
"\n",
"def run_style_transfer(content_path,\n",
"                       style_path,\n",
"                       num_iterations=1000,\n",
"                       content_weight=1e3,\n",
"                       style_weight=1e-2):\n",
"  # We don't need to (or want to) train any layers of our model, so we set\n",
"  # their trainable to false.\n",
"  model = get_model()\n",
"  for layer in model.layers:\n",
"    layer.trainable = False\n",
"\n",
"  # Get the style and content feature representations (from our specified intermediate layers)\n",
"  style_features, content_features = get_feature_representations(model, content_path, style_path)\n",
"  gram_style_features = [gram_matrix(style_feature) for style_feature in style_features]\n",
"\n",
"  # Set initial image\n",
"  init_image = load_and_process_img(content_path)\n",
"  init_image = tfe.Variable(init_image, dtype=tf.float32)\n",
"  # Create our optimizer\n",
"  opt = tf.train.AdamOptimizer(learning_rate=5, beta1=0.99, epsilon=1e-1)\n",
"\n",
"  # For displaying intermediate images\n",
"  iter_count = 1\n",
"\n",
"  # Store our best result\n",
"  best_loss, best_img = float('inf'), None\n",
"\n",
"  # Create a nice config\n",
"  loss_weights = (style_weight, content_weight)\n",
"  cfg = {\n",
"      'model': model,\n",
"      'loss_weights': loss_weights,\n",
"      'init_image': init_image,\n",
"      'gram_style_features': gram_style_features,\n",
"      'content_features': content_features\n",
"  }\n",
"\n",
"  # For displaying\n",
"  num_rows = 2\n",
"  num_cols = 5\n",
"  display_interval = num_iterations/(num_rows*num_cols)\n",
"  start_time = time.time()\n",
"  global_start = time.time()\n",
"\n",
"  norm_means = np.array([103.939, 116.779, 123.68])\n",
"  min_vals = -norm_means\n",
"  max_vals = 255 - norm_means\n",
"\n",
"  imgs = []\n",
"  for i in range(num_iterations):\n",
"    grads, all_loss = compute_grads(cfg)\n",
"    loss, style_score, content_score = all_loss\n",
"    opt.apply_gradients([(grads, init_image)])\n",
"    clipped = tf.clip_by_value(init_image, min_vals, max_vals)\n",
"    init_image.assign(clipped)\n",
"    end_time = time.time()\n",
"\n",
"    if loss < best_loss:\n",
"      # Update best loss and best image from total loss.\n",
"      best_loss = loss\n",
"      best_img = deprocess_img(init_image.numpy())\n",
"\n",
"    if i % display_interval == 0:\n",
"      start_time = time.time()\n",
"\n",
"      # Use the .numpy() method to get the concrete numpy array\n",
"      plot_img = init_image.numpy()\n",
"      plot_img = deprocess_img(plot_img)\n",
"      imgs.append(plot_img)\n",
"      IPython.display.clear_output(wait=True)\n",
"      IPython.display.display_png(Image.fromarray(plot_img))\n",
"      print('Iteration: {}'.format(i))\n",
"      print('Total loss: {:.4e}, '\n",
"            'style loss: {:.4e}, '\n",
"            'content loss: {:.4e}, '\n",
"            'time: {:.4f}s'.format(loss, style_score, content_score, time.time() - start_time))\n",
"  print('Total time: {:.4f}s'.format(time.time() - global_start))\n",
"  IPython.display.clear_output(wait=True)\n",
"  plt.figure(figsize=(14,4))\n",
"  for i, img in enumerate(imgs):\n",
"    plt.subplot(num_rows, num_cols, i+1)\n",
"    plt.imshow(img)\n",
"    plt.xticks([])\n",
"    plt.yticks([])\n",
"\n",
"  return best_img, best_loss"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "t1JTiKb_Qyu3",
"colab_type": "text"
},
"source": [
"# Let's do this thing!\n",
"(Remember to enable the GPU runtime)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "vSVMx4burydi",
"colab_type": "code",
"colab": {}
},
"source": [
"best, best_loss = run_style_transfer(content_path,\n",
"                                     style_path, num_iterations=1000)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "dzJTObpsO3TZ",
"colab_type": "code",
"colab": {}
},
"source": [
"Image.fromarray(best)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "dCXQ9vSnQbDy",
"colab_type": "text"
},
"source": [
"To download the image from Colab, uncomment and run the code below. Note that the image must first be saved to a file before `files.download` can fetch it.\n",
"\n",
"(This only works in the Chrome browser, not Firefox.)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "SSH6OpyyQn7w",
"colab_type": "code",
"colab": {}
},
"source": [
"#from google.colab import files\n",
"#Image.fromarray(best).save('astro_monet.png')  # save the result to a file first\n",
"#files.download('astro_monet.png')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "LwiZfCW0AZwt",
"colab_type": "text"
},
"source": [
"## Visualize outputs\n",
"We \"deprocess\" the output image to undo the VGG preprocessing that was applied to it."
]
},
{
"cell_type": "code",
"metadata": {
"id": "lqTQN1PjulV9",
"colab_type": "code",
"colab": {}
},
"source": [
"def show_results(best_img, content_path, style_path, show_large_final=True):\n",
"  plt.figure(figsize=(10, 5))\n",
"  content = load_img(content_path)\n",
"  style = load_img(style_path)\n",
"\n",
"  plt.subplot(1, 2, 1)\n",
"  imshow(content, 'Content Image')\n",
"\n",
"  plt.subplot(1, 2, 2)\n",
"  imshow(style, 'Style Image')\n",
"\n",
"  if show_large_final:\n",
"    plt.figure(figsize=(10, 10))\n",
"\n",
"    plt.imshow(best_img)\n",
"    plt.title('Output Image')\n",
"    plt.show()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "i6d6O50Yvs6a",
"colab_type": "code",
"colab": {}
},
"source": [
"show_results(best, content_path, style_path)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "tyGMmWh2Pss8",
"colab_type": "text"
},
"source": [
"## Try it on other images"
]
},
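{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before the worked example below, here is a minimal sketch (not part of the original seed) for running the pipeline on your own images; the file names are placeholders for whatever you upload:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# A minimal sketch for trying your own images (file names are placeholders).\n",
"# Upload the files via the Colab file picker, then run style transfer on them.\n",
"#from google.colab import files\n",
"#uploaded = files.upload()  # e.g. choose my_content.jpg and my_style.jpg\n",
"#best_custom, best_custom_loss = run_style_transfer('my_content.jpg',\n",
"#                                                   'my_style.jpg',\n",
"#                                                   num_iterations=500)\n",
"#show_results(best_custom, 'my_content.jpg', 'my_style.jpg')"
],
"execution_count": 0,
"outputs": []
},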
{
"cell_type": "markdown",
"metadata": {
"id": "x2TePU39k9lb",
"colab_type": "text"
},
"source": [
"### Starry Night + Tuebingen"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ES9dC6ZyJBD2",
"colab_type": "code",
"colab": {}
},
"source": [
"best_starry_night, best_loss = run_style_transfer('/tmp/nst/Tuebingen_Neckarfront.jpg',\n",
"                                                  '/tmp/nst/1024px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "X8w8WLkKvzXu",
"colab_type": "code",
"colab": {}
},
"source": [
"show_results(best_starry_night, '/tmp/nst/Tuebingen_Neckarfront.jpg',\n",
"             '/tmp/nst/1024px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "U-y02GWonqnD",
"colab_type": "text"
},
"source": [
"**[Image of Tuebingen](https://commons.wikimedia.org/wiki/File:Tuebingen_Neckarfront.jpg)**\n",
"Photo by Andreas Praefcke [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY 3.0 (https://creativecommons.org/licenses/by/3.0)], from Wikimedia Commons"
]
}
]
} |