fabsta/VGG from scratch.md

Last active November 26, 2017 17:26

Star (0) You must be signed in to star a gist
Fork (0) You must be signed in to fork a gist

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/fabsta/5d6a3d60fb616ec22d8631e7c437b142.js"></script>
Save fabsta/5d6a3d60fb616ec22d8631e7c437b142 to your computer and use it in GitHub Desktop.

Download ZIP

[Deep learning] #deeplearning

Raw

evaluation.md

Evaluation

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo')
plt.plot(epochs, val_acc, 'b')
plt.title('Training and validation accuracy')

plt.figure()

plt.plot(epochs, loss, 'bo')
plt.plot(epochs, val_loss, 'b')
plt.title('Training and validation loss')

plt.show()

Raw

reading_images.md

into numpy array

Define helper function

def read_img(img_id, train_or_test, size):
    """Read and resize image.
    # Arguments
        img_id: string
        train_or_test: string 'train' or 'test'.
        size: resize the original image.
    # Returns
        Image as numpy array.
    """
    img = image.load_img(join(data_dir, train_or_test, '%s.jpg' % img_id), target_size=size)
    img = image.img_to_array(img)
    return img

INPUT_SIZE = 299
POOLING = 'avg'
x_train = np.zeros((len(labels), INPUT_SIZE, INPUT_SIZE, 3), dtype='float32')


for i, img_id in tqdm(enumerate(labels['id'])):
    img = read_img(img_id, 'train', (INPUT_SIZE, INPUT_SIZE))
    x = xception.preprocess_input(np.expand_dims(img.copy(), axis=0))
    x_train[i] = x
print('Train Images shape: {} size: {:,}'.format(x_train.shape, x_train.size))

Raw

removing_dropout.md

Removing dropout

Our high level approach here will be to start with our fine-tuned cats vs dogs model (with dropout), then fine-tune all the dense layers, after removing dropout from them. The steps we will take are:

Re-create and load our modified VGG model with binary dependent (i.e. dogs v cats)
Split the model between the convolutional (conv) layers and the dense layers
Pre-calculate the output of the conv layers, so that we don't have to redundently re-calculate them on every epoch
Create a new model with just the dense layers, and dropout p set to zero
Train this new model using the output of the conv layers as training data.

As before we need to start with a working model, so let's bring in our working VGG 16 model and change it to predict our binary dependent...

In [4]:

model = vgg_ft(2)
model = vgg_ft(2)
...and load our fine-tuned weights.
In [5]:

model.load_weights(model_path+'finetune3.h5')
We're going to be training a number of iterations without dropout, so it would be best for us to pre-calculate the input to the fully connected layers - i.e. the Flatten() layer. We'll start by finding this layer in our model, and creating a new model that contains just the layers up to and including this layer:
In [6]:

l
layers = model.layers
In [7]:

last_conv_idx = [index for index,layer in enumerate(layers) 
                     if type(layer) is Convolution2D][-1]
In [8]:

last_conv_idx
Out[8]:
30
In [9]:

layers[last_conv_idx]
layers[last_conv_idx]
Out[9]:
<keras.layers.convolutional.Convolution2D at 0x7f64a680f9d0>
In [10]:

conv_layers = layers[:last_conv_idx+1]
conv_model = Sequential(conv_layers)
# Dense layers - also known as fully connected or 'FC' layers
fc_layers = layers[last_conv_idx+1:]
Now we can use the exact same approach to creating features as we used when we created the linear model from the imagenet predictions in the last lesson - it's only the model that has changed. As you're seeing, there's a fairly small number of "recipes" that can get us a long way!
In [4]:

batches = get_batches(path+'train', shuffle=False, batch_size=batch_size)
val_batches = get_batches(path+'valid', shuffle=False, batch_size=batch_size)

val_classes = val_batches.classes
trn_classes = batches.classes
val_labels = onehot(val_classes)
trn_labels = onehot(trn_classes)
Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
In [5]:

batches.class_indices
Out[5]:
{'cats': 0, 'dogs': 1}
In [12]:

val_features = conv_model.predict_generator(val_batches, val_batches.nb_sample)
trn_features = conv_model.predict_generator(batches, batches.nb_sample)


In [41]:

save_array(model_path + 'train_convlayer_features.bc', trn_features)
save_array(model_path + 'valid_convlayer_features.bc', val_features)
save_array(model_path + 'train_convlayer_features.bc', trn_features)
save_array(model_path + 'valid_convlayer_features.bc', val_features)
In [89]:

trn_features = load_array(model_path+'train_convlayer_features.bc')
val_features = load_array(model_path+'valid_convlayer_features.bc')
In [90]:

trn_features.shape
Out[90]:
(23000, 512, 14, 14)
For our new fully connected model, we'll create it using the exact same architecture as the last layers of VGG 16, so that we can conveniently copy pre-trained weights over from that model. However, we'll set the dropout layer's p values to zero, so as to effectively remove dropout.
In [16]:

# Copy the weights from the pre-trained model.
# NB: Since we're removing dropout, we want to half the weights
def proc_wgts(layer): return [o/2 for o in layer.get_weights()]
In [17]:

# Such a finely tuned model needs to be updated very slowly!
opt = RMSprop(lr=0.00001, rho=0.7)
In [18]:

def get_fc_model():
    model = Sequential([
        MaxPooling2D(input_shape=conv_layers[-1].output_shape[1:]),
        Flatten(),
        Dense(4096, activation='relu'),
        Dropout(0.),
        Dense(4096, activation='relu'),
        Dropout(0.),
        Dense(2, activation='softmax')
        ])

    for l1,l2 in zip(model.layers, fc_layers): l1.set_weights(proc_wgts(l2))

    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
In [19]:

fc_model = get_fc_model()
And fit the model in the usual way:
In [15]:

fc_model.fit(trn_features, trn_labels, nb_epoch=8, 
             batch_size=batch_size, validation_data=(val_features, val_labels))
Train on 23000 samples, validate on 2000 samples
Epoch 1/8
23000/23000 [==============================] - 17s - loss: 0.2577 - acc: 0.9817 - val_loss: 0.3478 - val_acc: 0.9765
Epoch 2/8
23000/23000 [==============================] - 17s - loss: 0.2052 - acc: 0.9853 - val_loss: 0.2789 - val_acc: 0.9785
Epoch 3/8
23000/23000 [==============================] - 17s - loss: 0.1553 - acc: 0.9895 - val_loss: 0.2358 - val_acc: 0.9845
Epoch 4/8
23000/23000 [==============================] - 17s - loss: 0.1388 - acc: 0.9909 - val_loss: 0.1914 - val_acc: 0.9865
Epoch 5/8
23000/23000 [==============================] - 17s - loss: 0.1335 - acc: 0.9912 - val_loss: 0.2181 - val_acc: 0.9855
Epoch 6/8
23000/23000 [==============================] - 17s - loss: 0.1126 - acc: 0.9924 - val_loss: 0.1850 - val_acc: 0.9875
Epoch 7/8
23000/23000 [==============================] - 17s - loss: 0.1080 - acc: 0.9928 - val_loss: 0.2226 - val_acc: 0.9840
Epoch 8/8
23000/23000 [==============================] - 17s - loss: 0.1005 - acc: 0.9935 - val_loss: 0.2256 - val_acc: 0.9850
Out[15]:
<keras.callbacks.History at 0x7f66b6bcdd10>
In [16]:

fc_model.save_weights(model_path+'no_dropout.h5')
In [ ]:

fc_model.load_weights(model_path+'no_dropout.h5')

Raw

VGG from scratch.md

model setup

from numpy.random import random, permutation
from scipy import misc, ndimage
from scipy.ndimage.interpolation import zoom

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential, Model
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import Input
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop
from keras.preprocessing import image

Reading in the data

# Let's import the mappings from VGG ids to imagenet category ids and descriptions, for display purposes later.
FILES_PATH = 'http://www.platform.ai/models/'; CLASS_FILE='imagenet_class_index.json'
# Keras' get_file() is a handy function that downloads files, and caches them for re-use later
fpath = get_file(CLASS_FILE, FILES_PATH+CLASS_FILE, cache_subdir='models')
with open(fpath) as f: class_dict = json.load(f)
# Convert dictionary with string indexes into an array
classes = [class_dict[str(i)][1] for i in range(len(class_dict))]

# Here's a few examples of the categories we just imported:
classes[:5]
>> ['tench', 'goldfish', 'great_white_shark', 'tiger_shark', 'hammerhead']

# Model from scratch
##  Creating the model involves creating the model architecture, 
##  and then loading the model weights into that architecture

# VGG has just one type of convolutional block, and one type of fully connected ('dense') block. Here's the convolutional block definition:
def ConvBlock(layers, model, filters):
    for i in range(layers): 
        model.add(ZeroPadding2D((1,1)))
        model.add(Conv2D(filters, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

# ...and here's the fully-connected definition.
def FCBlock(model):
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    
# Mean of each channel as provided by VGG researchers
# need to make two changes in regards to the original VGG model
vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3,1,1))

def vgg_preprocess(x):
    x = x - vgg_mean     # subtract mean
    return x[:, ::-1]    # reverse axis bgr->rgb
  
# Define model architecture
```python
def VGG_16():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape=(3,224,224)))

    ConvBlock(2, model, 64)
    ConvBlock(2, model, 128)
    ConvBlock(3, model, 256)
    ConvBlock(3, model, 512)
    ConvBlock(3, model, 512)

    model.add(Flatten())
    FCBlock(model)
    FCBlock(model)
    model.add(Dense(1000, activation='softmax'))
    return model

We can create the model like any python object:

model = VGG_16()

Weights

We are using pre-trained weights here: Otherwise have to re-train model from scratch.

fpath = get_file('vgg16.h5', FILES_PATH+'vgg16.h5', cache_subdir='models') model.load_weights(fpath)

Getting imagenet predictions

The setup of the imagenet model is now complete, so all we have to do is grab a batch of images and call predict() on them.

batch_size = 4

Keras provides functionality to create batches of data from directories containing images; all we have to do is to define the size to resize the images to, what type of labels to create, whether to randomly shuffle the images, and how many images to include in each batch. We use this little wrapper to define some helpful defaults appropriate for imagenet data:

def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True, batch_size=batch_size, class_mode='categorical'): return gen.flow_from_directory(path+dirname, target_size=(224,224), class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

From here we can use exactly the same steps as before to look at predictions from the model.

batches = get_batches('train', batch_size=batch_size) val_batches = get_batches('valid', batch_size=batch_size) imgs,labels = next(batches)

This shows the 'ground truth'

plots(imgs, titles=labels)

The VGG model returns 1,000 probabilities for each image, representing the probability that the model assigns to each possible imagenet category for each image. By finding the index with the largest probability (with np.argmax()) we can find the predicted label.

def pred_batch(imgs): preds = model.predict(imgs) idxs = np.argmax(preds, axis=1)

print('Shape: {}'.format(preds.shape))
print('First 5 classes: {}'.format(classes[:5]))
print('First 5 probabilities: {}\n'.format(preds[0, :5]))
print('Predictions prob/class: ')

for i in range(len(idxs)):
    idx = idxs[i]
    print ('  {:.4f}/{}'.format(preds[i, idx], classes[idx]))

pred_batch(imgs)

Shape: (4, 1000) First 5 classes: ['tench', 'goldfish', 'great_white_shark', 'tiger_shark', 'hammerhead'] First 5 probabilities: [ 8.9571e-07 5.3308e-05 2.6785e-05 1.4977e-05 1.3189e-04]

Predictions prob/class: 0.1518/washer 0.4697/Maltese_dog 0.6326/Siamese_cat 0.3976/Egyptian_cat

Raw

VGG.md

First, create a Vgg16 object:

vgg = Vgg16()

Grab data

# Let's grab batches of 4 images from our training folder:
batches = vgg.get_batches(path+'train', batch_size=4)

Look at examples

# Batches is just a regular python iterator. Each iteration returns both the images themselves, as well as the labels.
imgs,labels = next(batches)

Plot examples

plots(imgs, titles=labels)

We can now pass the images to Vgg16's predict() function to get back probabilities, category indexes, and category names for each image's VGG prediction.

vgg.predict(imgs, True)
>>
(array([ 0.9203,  0.5414,  0.0627,  0.3767], dtype=float32),
 array([284, 283, 551, 194]),
 ['Siamese_cat', 'Persian_cat', 'face_powder', 'Dandie_Dinmont'])

# The category indexes are based on the ordering of categories used in the VGG model - e.g here are the first four:

```python
vgg.classes[:4]
>>
['tench', 'goldfish', 'great_white_shark', 'tiger_shark']

Raw

VGG_finetune.md

Import our class, and instantiate

batch_size = 16
import vgg16; reload(vgg16)
from vgg16 import Vgg16
vgg = Vgg16()

# Grab a few images at a time for training and validation.
# NB: They must be in subdirectories named based on their category
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, batch_size, nb_epoch=1)