@jeremyjordan
Last active March 3, 2022 08:46
Keras Callback for finding the optimal range of learning rates
import matplotlib.pyplot as plt
import keras.backend as K
from keras.callbacks import Callback


class LRFinder(Callback):
    '''
    A simple callback for finding the optimal learning rate range for your model + dataset.

    # Usage
        ```python
        lr_finder = LRFinder(min_lr=1e-5,
                             max_lr=1e-2,
                             steps_per_epoch=np.ceil(epoch_size/batch_size),
                             epochs=3)
        model.fit(X_train, Y_train, batch_size=batch_size, epochs=3, callbacks=[lr_finder])

        lr_finder.plot_loss()
        ```

    # Arguments
        min_lr: The lower bound of the learning rate range for the experiment.
        max_lr: The upper bound of the learning rate range for the experiment.
        steps_per_epoch: Number of mini-batches per epoch. Calculated as `np.ceil(epoch_size/batch_size)`,
            where `epoch_size` is the number of training samples.
        epochs: Number of epochs to run the experiment. Usually between 2 and 4 epochs is sufficient.

    # References
        Blog post: jeremyjordan.me/nn-learning-rate
        Original paper: https://arxiv.org/abs/1506.01186
    '''

    def __init__(self, min_lr=1e-5, max_lr=1e-2, steps_per_epoch=None, epochs=None):
        super().__init__()

        self.min_lr = min_lr
        self.max_lr = max_lr
        # Total number of batches over which the learning rate is swept.
        self.total_iterations = steps_per_epoch * epochs
        self.iteration = 0
        self.history = {}

    def clr(self):
        '''Calculate the learning rate by linear interpolation between min_lr and max_lr.'''
        x = self.iteration / self.total_iterations
        return self.min_lr + (self.max_lr-self.min_lr) * x

    def on_train_begin(self, logs=None):
        '''Initialize the learning rate to the minimum value at the start of training.'''
        logs = logs or {}
        K.set_value(self.model.optimizer.lr, self.min_lr)

    def on_batch_end(self, batch, logs=None):
        '''Record previous batch statistics and update the learning rate.'''
        logs = logs or {}
        self.iteration += 1

        self.history.setdefault('lr', []).append(K.get_value(self.model.optimizer.lr))
        self.history.setdefault('iterations', []).append(self.iteration)

        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

        K.set_value(self.model.optimizer.lr, self.clr())

    def plot_lr(self):
        '''Helper function to quickly inspect the learning rate schedule.'''
        plt.plot(self.history['iterations'], self.history['lr'])
        plt.yscale('log')
        plt.xlabel('Iteration')
        plt.ylabel('Learning rate')
        plt.show()

    def plot_loss(self):
        '''Helper function to quickly observe the learning rate experiment results.'''
        plt.plot(self.history['lr'], self.history['loss'])
        plt.xscale('log')
        plt.xlabel('Learning rate')
        plt.ylabel('Loss')
        plt.show()

@singhay

singhay commented Jul 8, 2018

I was halfway through writing this myself and was searching for the original paper when this popped up on Google.

Thank you, sir, for your contribution!

@singhay

singhay commented Jul 19, 2018

@jeremyjordan
The description of steps_per_epoch is wrong; it should be np.ceil(self.total_samples / float(self.batch_size)) rather than np.ceil(epoch_size/batch_size) (hint: think about when epoch is 1).
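
A minimal sketch of the calculation described above, assuming `total_samples` is the number of training examples. The helper name `steps_per_epoch` and the sizes below are hypothetical and chosen only for illustration.

```python
import math


def steps_per_epoch(total_samples: int, batch_size: int) -> int:
    """Number of mini-batches needed to see every sample once."""
    if total_samples <= 0 or batch_size <= 0:
        raise ValueError("total_samples and batch_size must be positive")
    # Round up so a final, smaller batch is still counted.
    return math.ceil(total_samples / batch_size)


# Hypothetical dataset and training settings.
total_samples = 50_000
batch_size = 128
epochs = 3

steps = steps_per_epoch(total_samples, batch_size)
# The callback sweeps the learning rate over this many batches in total.
total_iterations = steps * epochs
print(steps, total_iterations)
```

The resulting `steps` value is what would be passed as `steps_per_epoch` to `LRFinder`, with the same `batch_size` and `epochs` given to the training call.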

@drsxr

drsxr commented Aug 9, 2018

@jeremyjordan I am trying to use this code with Keras 2.1.3 and TF 1.8 as follows:
lr_finder = LRFinder(min_lr=1e-5, max_lr=1e-2, steps_per_epoch=np.ceil(epoch_size/batch_size), epochs=3)

and then calling model.fit_generator:
history = model.fit_generator(train_generator, steps_per_epoch=Training_case_number/Training_batch_size, epochs=epochs, validation_data=validation_generator, validation_steps=20, callbacks=[LRFinder])

I am getting this error: TypeError: set_model() missing 1 required positional argument: 'model'

Any suggestions?

@singhay

singhay commented Aug 12, 2018

@drsxr You are passing the class instead of the instance you created. Change callbacks=[LRFinder] to callbacks=[lr_finder].
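
The difference can be reproduced without Keras. The sketch below uses a hypothetical stand-in `Callback` base class and a hypothetical `attach` function that mimics a training loop handing the model to each callback; neither is the real Keras implementation.

```python
class Callback:
    """Stand-in base class (hypothetical, not the Keras one)."""

    def set_model(self, model):
        self.model = model


class LRFinder(Callback):
    """Trimmed-down callback that only stores its bounds."""

    def __init__(self, min_lr=1e-5, max_lr=1e-2):
        self.min_lr = min_lr
        self.max_lr = max_lr


def attach(callbacks, model):
    """Mimic a training loop giving the model to every callback."""
    for cb in callbacks:
        cb.set_model(model)


model = object()  # placeholder for a real model

# Passing the instance works: set_model receives (self, model).
lr_finder = LRFinder()
attach([lr_finder], model)

# Passing the class fails: the model is bound to `self`,
# so the `model` argument is missing.
try:
    attach([LRFinder], model)
except TypeError as err:
    message = str(err)
    print(message)
```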

@OmaymaS

OmaymaS commented Jan 22, 2019

So I assume epoch_size, i.e. the number of entries in X_train, has to be defined explicitly before calling LRFinder()?

@WittmannF

Here's my version: https://gist.github.com/WittmannF/c55ed82d27248d18799e2be324a79473

Three changes were made:

  • The number of iterations is inferred automatically as the number of batches (i.e., it always runs over exactly one epoch)
  • The learning rates are spaced evenly on a log scale (a geometric progression) using np.geomspace
  • Training stops automatically if current_loss > 10 x lowest_loss
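
A rough, dependency-free sketch of the last two ideas, not the code from the linked gist. `geometric_lrs` gives the same spacing that `np.geomspace` would. `run_range_test` and `toy_loss` are hypothetical names, and `toy_loss` is a made-up stand-in for the loss a real training batch would report.

```python
import math


def geometric_lrs(min_lr, max_lr, num_steps):
    """Learning rates evenly spaced on a log scale (a geometric progression)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ratio = (max_lr / min_lr) ** (1 / (num_steps - 1))
    return [min_lr * ratio ** i for i in range(num_steps)]


def run_range_test(lrs, loss_fn, stop_factor=10.0):
    """Record (lr, loss) pairs; stop once loss > stop_factor * lowest loss."""
    history = []
    lowest = math.inf
    for lr in lrs:
        loss = loss_fn(lr)
        history.append((lr, loss))
        lowest = min(lowest, loss)
        if loss > stop_factor * lowest:
            break  # the loss has diverged, no point in going further
    return history


def toy_loss(lr):
    """Made-up loss curve with its minimum near lr = 1e-3."""
    return (math.log10(lr) + 3) ** 2 + 0.1


lrs = geometric_lrs(1e-5, 1.0, 11)
history = run_range_test(lrs, toy_loss)
print(len(lrs), len(history))
```

In a callback, the loop body would run once per batch, with the loss read from `logs` and training halted through the model's stop flag instead of `break`.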

@jeremyjordan
Author

@WittmannF looks great! Thanks for sharing.