@fchollet
Last active July 23, 2024 16:32
Fine-tuning a Keras model. Updated to the Keras 2.0 API.
'''This script goes along the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
- put the cat pictures index 1000-1400 in data/validation/cats
- put the dog pictures index 12500-13499 in data/train/dogs
- put the dog pictures index 13500-13900 in data/validation/dogs
So that we have 1000 training examples for each class, and 400 validation examples for each class.
In summary, this is our directory structure:
```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```
'''
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense
# path to the model weights files (weights_path is unused below; the
# VGG16 ImageNet weights are downloaded automatically by keras.applications)
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
# build the VGG16 network; give it an explicit input_shape so that the
# Flatten layer below has a fully defined input shape
base_model = applications.VGG16(weights='imagenet', include_top=False,
                                input_shape=(img_width, img_height, 3))
print('Model loaded.')
# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)
# add the classifier on top of the convolutional base; the VGG16
# application is a functional Model (it has no .add() method), so we
# attach the top model with the functional API
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))
# set the first 15 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in base_model.layers[:15]:
    layer.trainable = False
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
# fine-tune the model (in the Keras 2 API, steps_per_epoch and
# validation_steps count batches, not samples)
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
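# --- Several comments below ask how to use the fine-tuned model on new
# --- images afterwards. A minimal sketch, assuming the model above has
# --- just been trained in this session (file and image names here are
# --- illustrative, not part of the original script):
import numpy as np
from keras.preprocessing import image

# save the fine-tuned weights for later reuse
model.save_weights('fine_tuned_model.h5')

# load a single image, matching the training preprocessing:
# resize to 150x150 and rescale pixel values to [0, 1]
img = image.load_img('some_cat_or_dog.jpg',
                     target_size=(img_height, img_width))
x = image.img_to_array(img) / 255.
x = np.expand_dims(x, axis=0)  # shape (1, 150, 150, 3)

# sigmoid output: class indices follow alphabetical folder order,
# so here cats=0 and dogs=1
prob = model.predict(x)[0][0]
print('dog' if prob > 0.5 else 'cat', prob)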
@rahulkulhalli

I've got a small query. While performing transfer learning in order to train ONLY the bottleneck layers, how many epochs should you train it for?

The example given here says 'train for a few epochs'. Can anyone give me a general rule of thumb?

@aafmhh

aafmhh commented Jul 3, 2018

Hi,
I'm running the code at https://github.com/imatge-upc/detection-2016-nipsws.
I installed Keras 2.0.2 and Theano 0.9.0 with (Anaconda3) Python 3.5, and I'm coding in PyCharm on Windows 10,
but I'm getting this error:

File "C:/Users/heram/PycharmProjects/Hirarchical obj detec/scripts/image_zooms_training.py", line 78, in
model_vgg = obtain_compiled_vgg_16(path_vgg)
File "C:\Users\heram\PycharmProjects\Hirarchical obj detec\scripts\features.py", line 251, in obtain_compiled_vgg_16
model = vgg_16(vgg_weights_path)
File "C:\Users\heram\PycharmProjects\Hirarchical obj detec\scripts\features.py", line 295, in vgg_16
model.add(Flatten())
File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\models.py", line 455, in add
output_tensor = layer(self.outputs[0])
File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\engine\topology.py", line 559, in call
output_shape = self.compute_output_shape(input_shape)
File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\layers\core.py", line 488, in compute_output_shape
'(got ' + str(input_shape[1:]) + '. '
ValueError: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

Code
in image_zooms_training.py:
model_vgg = obtain_compiled_vgg_16(path_vgg)

in the features.py file:
def obtain_compiled_vgg_16(vgg_weights_path):
    model = vgg_16(vgg_weights_path)
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy')
    return model

def vgg_16(weights_path=None):
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=(3, 224, 224)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))

    if weights_path:
        model.load_weights(weights_path)

    return model
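(A note on the error above, in case it helps: the shape (0, 7, 512) suggests the backend is using the channels_last data format, so input_shape=(3, 224, 224) is read as a 3-pixel-high image whose spatial dimensions collapse to 0 after five pooling layers. A minimal sketch of the usual fix, assuming the Theano-style channels_first layout is what the weights expect:)

from keras import backend as K
from keras.models import Sequential
from keras.layers import ZeroPadding2D, Conv2D

# tell Keras the input tensor is (channels, height, width); this can
# also be set permanently in ~/.keras/keras.json
K.set_image_data_format('channels_first')

model = Sequential()
# with channels_first, (3, 224, 224) is read as 3 channels of 224x224,
# so the spatial dimensions no longer collapse before Flatten
model.add(ZeroPadding2D((1, 1), input_shape=(3, 224, 224)))
model.add(Conv2D(64, (3, 3), activation='relu'))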

@francescopelizza-omega

Hello there,

So when the training is concluded and I am happy enough... how can I then use the trained model to predict new, unknown pictures of cats/dogs?

Thanks


ghost commented Nov 4, 2018

Where can I find top_model_weights_path = 'fc_model.h5'? fc_model.h5 is not in my directory; where can I download it?

@Aaron4Fun

Can we use data augmentation before the fine-tuning?

@FlareJia

FlareJia commented Feb 1, 2019

> Where can I find top_model_weights_path = 'fc_model.h5'? fc_model.h5 is not in my directory; where can I download it?

fc_model.h5 is the file produced by classifier_from_little_data_script_2.py.
You can use 'bottleneck_fc_model.h5' instead,
like: top_model_weights_path = 'bottleneck_fc_model.h5'

@drfarmerdave

drfarmerdave commented Mar 15, 2019

@ghost
@micklexqg
@raaju-shiv
@Saumya7
To the others having an error similar to "Dimension 0 in both shapes must be equal, but are 25088 and 8192 for 'Assign_26'":

Make sure the image dimensions are the same for both classifier_from_little_data_script_2.py and classifier_from_little_data_script_3.py,
e.g.
img_width, img_height = 150, 150

@pchris24

Why do I get the same val_acc in every epoch?
Epoch 1/50
125/125 [==============================] - 14s 109ms/step - loss: 0.5215 - acc: 0.9285 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 2/50
125/125 [==============================] - 13s 101ms/step - loss: 0.5790 - acc: 0.9245 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 3/50
125/125 [==============================] - 13s 102ms/step - loss: 0.5965 - acc: 0.9265 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 4/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6562 - acc: 0.9135 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 5/50
125/125 [==============================] - 13s 102ms/step - loss: 0.5102 - acc: 0.9315 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 6/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6308 - acc: 0.9305 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 7/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6134 - acc: 0.9230 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 8/50
125/125 [==============================] - 12s 100ms/step - loss: 0.6208 - acc: 0.9190 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 9/50
125/125 [==============================] - 13s 100ms/step - loss: 0.5764 - acc: 0.9295 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 10/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6356 - acc: 0.9185 - val_loss: 1.0519 - val_acc: 0.8838

I am fine-tuning ResNet and my validation accuracy remains constant too. @JinwenJay what did you do to overcome this?

@jaysumona2019

For me, model.fit runs fine, but I am having a problem when I use the model to do prediction:
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input  # assuming VGG16 preprocessing

img = image.load_img(img_path, target_size=(224, 224))
img_data = image.img_to_array(img)
img_data = np.expand_dims(img_data, axis=0)
img_data = preprocess_input(img_data)

feature = model.predict(img_data, verbose=0)

I am getting the error: Error when checking input: expected sequential_3_input to have shape (7, 7, 512) but got array with shape (224, 224, 3)
Any idea how to fix this issue?
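(A likely cause, not confirmed by the original poster: the loaded model here is only the top classifier, which was trained on VGG16 bottleneck features of shape (7, 7, 512), so raw 224x224x3 images must first be run through the convolutional base. A minimal sketch, assuming model is that top classifier:)

from keras import applications

# run the image through the VGG16 convolutional base first to get
# bottleneck features of shape (1, 7, 7, 512), then classify those
conv_base = applications.VGG16(weights='imagenet', include_top=False,
                               input_shape=(224, 224, 3))
bottleneck = conv_base.predict(img_data)
feature = model.predict(bottleneck, verbose=0)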

@devjaynemorais

devjaynemorais commented Oct 18, 2019

Could someone help me solve the following problem?

Environment: Keras==1.1.0 Theano==1.0.2 numpy==1.15.1 scipy==1.3.0

I created a fine-tuned model and froze all layers except layer [2], because I want to get the activation values only from layer [2].

Network summary before freezing:

Layer (type)         Output Shape   Param #   Connected to
dense_1 (Dense)      (None, 512)    2097664   dense_input_1[0][0]
dropout_1 (Dropout)  (None, 512)    0         dense_1[0][0]
                                              dense_1[0][0]
dense_2 (Dense)      (None, 32)     16416     dropout_1[0][0]
                                              dropout_1[1][0]
dropout_2 (Dropout)  (None, 32)     0         dense_2[0][0]
                                              dense_2[1][0]
dense_3 (Dense)      (None, 1)      33        dropout_2[0][0]
                                              dropout_2[1][0]

Total params: 2114113

Freezing layers:

for layer in model.layers[0:]:
    layer.trainable = False
model.layers[2].trainable = True

Network summary after freezing:

Layer (type)         Output Shape   Param #   Connected to
dense_1 (Dense)      (None, 512)    0         dense_input_1[0][0]
dropout_1 (Dropout)  (None, 512)    0         dense_1[0][0]
dense_2 (Dense)      (None, 32)     16416     dropout_1[1][0]
dropout_2 (Dropout)  (None, 32)     0         dense_2[1][0]
dense_3 (Dense)      (None, 1)      0         dropout_2[1][0]

Total params: 16416

To print the output of layer [2]:

OutFunc = keras.backend.function([model2.input], [model2.layers[2].get_output_at(0)])
out_val = OutFunc([inputs])[0]
print(out_val)

This returns the following error:

MissingInputError                         Traceback (most recent call last)
<ipython-input> in <module>
1 #OutFunc = keras.backend.function([model2.input], [model2.layers[0].output])
----> 2 OutFunc = keras.backend.function([model2.input], [model2.layers[2].get_output_at(0)])
3
4
5 out_val = OutFunc([inputs])[0]

~/anaconda3/lib/python3.7/site-packages/keras/backend/theano_backend.py in function(inputs, outputs, updates, **kwargs)
725 return T.clip(x, min_value, max_value)
726
--> 727
728 def equal(x, y):
729 return T.eq(x, y)

~/anaconda3/lib/python3.7/site-packages/keras/backend/theano_backend.py in __init__(self, inputs, outputs, updates, **kwargs)
711
712 def pow(x, a):
--> 713 return T.pow(x, a)
714
715

~/anaconda3/lib/python3.7/site-packages/theano/compile/function.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
315 on_unused_input=on_unused_input,
316 profile=profile,
--> 317 output_keys=output_keys)
318 return fn

~/anaconda3/lib/python3.7/site-packages/theano/compile/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
484 accept_inplace=accept_inplace, name=name,
485 profile=profile, on_unused_input=on_unused_input,
--> 486 output_keys=output_keys)
487
488

~/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
1837 on_unused_input=on_unused_input,
1838 output_keys=output_keys,
-> 1839 name=name)
1840 with theano.change_flags(compute_test_value="off"):
1841 fn = m.create(defaults)

~/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py in __init__(self, inputs, outputs, mode, accept_inplace, function_builder, profile, on_unused_input, fgraph, output_keys, name)
1485 # OUTPUT VARIABLES)
1486 fgraph, additional_outputs = std_fgraph(inputs, outputs,
-> 1487 accept_inplace)
1488 fgraph.profile = profile
1489 else:

~/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py in std_fgraph(input_specs, output_specs, accept_inplace)
179
180 fgraph = gof.fg.FunctionGraph(orig_inputs, orig_outputs,
--> 181 update_mapping=update_mapping)
182
183 for node in fgraph.apply_nodes:

~/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py in __init__(self, inputs, outputs, features, clone, update_mapping)
173
174 for output in outputs:
--> 175 self.import_r(output, reason="init")
176 for i, output in enumerate(outputs):
177 output.clients.append(('output', i))

~/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py in __import_r__(self, variable, reason)
344 # Imports the owners of the variables
345 if variable.owner and variable.owner not in self.apply_nodes:
--> 346             self.__import__(variable.owner, reason=reason)
347 elif (variable.owner is None and
348 not isinstance(variable, graph.Constant) and

~/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py in __import__(self, apply_node, check, reason)
389 "for more information on this error."
390 % (node.inputs.index(r), str(node)))
--> 391 raise MissingInputError(error_msg, variable=r)
392
393 for node in new_nodes:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute InplaceDimShuffle{x,x}(keras_learning_phase), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Backtrace when that variable is created:

File "", line 219, in _call_with_frames_removed
File "/home/jayne/anaconda3/lib/python3.7/site-packages/keras/backend/init.py", line 61, in
from .theano_backend import *
File "", line 983, in _find_and_load
File "", line 967, in _find_and_load_unlocked
File "", line 677, in _load_unlocked
File "", line 728, in exec_module
File "", line 219, in _call_with_frames_removed
File "/home/jayne/anaconda3/lib/python3.7/site-packages/keras/backend/theano_backend.py", line 23, in
_LEARNING_PHASE = T.scalar(dtype='uint8', name='keras_learning_phase') # 0 = test, 1 = train
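(A likely cause, for anyone hitting this: with the Theano backend, a model that contains Dropout layers depends on the keras_learning_phase input, so a backend function built only on model2.input is missing an input. A minimal sketch of the usual workaround, assuming model2 and inputs as above:)

from keras import backend as K

# pass the learning-phase tensor as an extra input and feed it 0
# (test mode) so that Dropout is disabled while reading activations
OutFunc = K.function([model2.input, K.learning_phase()],
                     [model2.layers[2].get_output_at(0)])
out_val = OutFunc([inputs, 0])[0]
print(out_val)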

@cpoptic

cpoptic commented Oct 31, 2019

How to determine the optimal number of layers to freeze, as a function of:

  1. the similarity between the pretrained dataset and your target dataset, and
  2. the size of your target dataset.

Intuitively the more similar between the pretrained model and your dataset, the fewer layers you would need to set_trainable = False.

But it is not obvious exactly how many to set to False.

Here we're setting the first 15 layers (up to the last conv block) to non-trainable (weights will not be updated):

for layer in base_model.layers[:15]:
    layer.trainable = False

But I ask: why this particular cutoff? Why not a few more, or a few fewer? How is this number determined?
One would expect a more algorithmic way to determine how many layers to set as non-trainable (even, say, as a percentage of the total number of layers).
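There is no exact formula; the usual heuristic is that the smaller your dataset and the more similar it is to the pretraining data, the more layers you freeze, and the final count is validated empirically. A hypothetical helper (not from the gist) that expresses the percentage idea from the question:

def freeze_bottom_fraction(model, fraction=0.8):
    # freeze the bottom `fraction` of the model's layers and make the
    # rest trainable; returns the number of frozen layers
    n_freeze = int(len(model.layers) * fraction)
    for layer in model.layers[:n_freeze]:
        layer.trainable = False
    for layer in model.layers[n_freeze:]:
        layer.trainable = True
    return n_freeze

# e.g. freeze_bottom_fraction(base_model, 0.8) freezes ~80% of layers;
# remember to re-compile the model after changing trainable flags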

@allanchua101

Hi,

How do you load a single image from the drive and do a prediction on it using the model produced by this script?

Thanks in advance

@aisha24a

aisha24a commented Feb 24, 2020

Is there a place from which we can download fc_model.h5? I have a small dataset of EEG data; can I apply the same method to it?

Thanks in advance

@lokoprof09

Hello sir,
I have 60 images for three classes (20 images for each class); can you please help me modify your code to suit my data? I will be very grateful.
My email is [email protected]

Thanks

@savin333

savin333 commented May 17, 2020

Fine-tuned models' prediction code

I checked these snippets myself; they all worked fine.

  1. If someone wants to predict image classes in the same script where the model was trained, here is the code:
img_width, img_height = 224, 224
batch_size = 1
test_dir = 'data/test'  # path to your test images (placeholder)

datagen = ImageDataGenerator(rescale=1. / 255)

test_generator = datagen.flow_from_directory(
    test_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

test_generator.reset()
no_of_images = test_generator.samples  # total number of test images

pred = model.predict_generator(test_generator, steps=no_of_images // batch_size)
predicted_class_indices = np.argmax(pred, axis=1)
labels = train_generator.class_indices
labels = dict((v, k) for k, v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]
print(predicted_class_indices)
print(labels)
print(predictions)

This code is inspired by a Stack Overflow answer.

  2. If someone wants to predict image classes in a different script (separate from the training script file), here is the code:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import model_from_json
import numpy as np

# Just give the lines below your parameters
best_weights = 'path to .h5 weight file'
model_json = 'path to saved model json file'
test_dir = 'path to test images'

img_width, img_height = 224, 224
batch_size = 1

with open(model_json, 'r') as json_file:
    json_savedModel = json_file.read()

model = model_from_json(json_savedModel)

model.summary()

model.load_weights(best_weights)

datagen = ImageDataGenerator(rescale=1. / 255)

test_generator = datagen.flow_from_directory(
    test_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

test_generator.reset()
nb_img_samples = test_generator.samples  # number of testing images

pred = model.predict_generator(test_generator, steps=nb_img_samples // batch_size)
predicted_class_indices = np.argmax(pred, axis=1)
# if you have more classes, add them here in your training folder order
labels = {'cats': 0, 'dogs': 1}
labels = dict((v, k) for k, v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]
print(predicted_class_indices)
print(labels)
print(predictions)

@allanchua101 you only have to edit these directory paths according to your drive's file paths.

@lokoprof09

lokoprof09 commented May 18, 2020 via email

@savin333

@lokoprof09 you are welcome

@upcdz

upcdz commented Apr 15, 2021

Hi, I get the following error when fine-tuning with classifier_from_little_data_script_3.py. Does anyone have any idea? Please help.

Traceback (most recent call last):
File "classifier3.py", line 35, in <module>
    top_model.add(Dense(256, activation='relu'))
File "/home/zhang/ENTER/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
File "/home/zhang/ENTER/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 213, in add
    output_tensor = layer(self.outputs[0])
File "/home/zhang/ENTER/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 897, in __call__
    self._maybe_build(inputs)
File "/home/zhang/ENTER/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 2416, in _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
File "/home/zhang/ENTER/lib/python3.8/site-packages/tensorflow/python/keras/layers/core.py", line 1154, in build
    raise ValueError('The last dimension of the inputs to Dense '
ValueError: The last dimension of the inputs to Dense should be defined. Found None.

@upcdz

upcdz commented Apr 15, 2021

@fchollet I would really appreciate it if you could help me.

@EsraGuclu

@upcdz

I got the same problem; the following line causes the error:

flatten = Flatten(name='flatten')(vgg16_output)

I replaced this line with GlobalAveragePooling2D(), and then it worked:

flatten = GlobalAveragePooling2D()(vgg16_output)
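(A minimal sketch of that fix in context; the name vgg16_output is taken from the comment above. GlobalAveragePooling2D works even when the spatial dimensions are undefined, because it reduces them instead of flattening them:)

from keras import applications
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# with no input_shape given, the conv output is (None, None, None, 512);
# Flatten would fail here, but global pooling reduces the unknown
# spatial dims to a fixed 512-vector
base = applications.VGG16(weights='imagenet', include_top=False)
vgg16_output = base.output
flatten = GlobalAveragePooling2D()(vgg16_output)
preds = Dense(1, activation='sigmoid')(flatten)
model = Model(inputs=base.input, outputs=preds)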

@Gray-ly

Gray-ly commented Aug 21, 2021

ValueError: The last dimension of the inputs to Dense should be defined. Found None.

@ashk3301

ashk3301 commented Dec 1, 2022

Sorry, I got an error like this: The shape of the input to "Flatten" is not fully defined (got (None, None, 512)). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

I cannot find the weight file '../keras/examples/vgg16_weights.h5' to download. Thanks

@KennethYCK Hi! I am unable to understand how to get the input shape for my first layer. Can you help?
