@niazangels
Last active October 4, 2019 06:34

Mistakes I've made in training deep neural networks

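1. Using a ReLU activation in the last layer for classification

nn.CrossEntropyLoss() already applies a log-softmax to its input, so the last layer should output raw, unactivated logits. A ReLU there clamps every negative score to zero before the loss sees it.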
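As a minimal sketch of the point above (the tensor shapes, labels, and variable names are made up for illustration), the snippet below checks that nn.CrossEntropyLoss() matches a log-softmax followed by a negative log-likelihood loss, and shows that a ReLU in front of it discards the negative scores:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # illustrative raw scores: batch of 4, 3 classes
labels = torch.tensor([0, 2, 1, 2])   # illustrative class indices

criterion = nn.CrossEntropyLoss()
loss_from_logits = criterion(logits, labels)

# Equivalent computation: log-softmax followed by negative log-likelihood.
manual_loss = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# A ReLU on the last layer clamps negative scores to zero before the loss sees them.
relu_logits = F.relu(logits)
loss_after_relu = criterion(relu_logits, labels)

print(loss_from_logits.item(), manual_loss.item(), loss_after_relu.item())
```

The first two printed values should agree; the third generally differs because the ReLU has thrown away information the loss would otherwise use.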

2. Stepping an optimizer whose params were initialized with a different model

This is especially easy to do in JupyterLab, where objects from earlier cells stay alive.

It is particularly weird because it won't raise any errors: the loss will appear stable, but the model will not learn anything.

   optimizer = optim.SGD(model1.parameters(), lr=0.01)  # Initialized with model1
   ...
   ...
   preds = model2(inputs)           # This is a different model, model2
   loss = criterion(preds, labels)
   loss.backward()                  # Gradients land on model2's parameters
   optimizer.step()                 # Nothing happens: model2 is never updated
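One way to catch this early is to check that the optimizer actually tracks the parameters of the model being trained. The sketch below is only an illustration: `owns_all_params` is a hypothetical helper, not a PyTorch API, and the layer sizes, data, and learning rate are arbitrary.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model1 = nn.Linear(4, 2)
model2 = nn.Linear(4, 2)
inputs = torch.randn(8, 4)              # illustrative data
labels = torch.randint(0, 2, (8,))
criterion = nn.CrossEntropyLoss()

def owns_all_params(optimizer, model):
    """Hypothetical helper: True if every parameter of `model` is tracked by `optimizer`."""
    tracked = {id(p) for group in optimizer.param_groups for p in group["params"]}
    return all(id(p) in tracked for p in model.parameters())

# The mistake: the optimizer was built from model1, but model2 is trained.
stale_optimizer = optim.SGD(model1.parameters(), lr=0.1)
before = model2.weight.detach().clone()
loss = criterion(model2(inputs), labels)
loss.backward()
stale_optimizer.step()                  # model1 has no gradients, model2 is not tracked
weights_unchanged = torch.equal(before, model2.weight)

# The fix: build the optimizer from the model that is actually trained.
optimizer = optim.SGD(model2.parameters(), lr=0.1)
optimizer.zero_grad()
loss = criterion(model2(inputs), labels)
loss.backward()
optimizer.step()
weights_changed = not torch.equal(before, model2.weight)

print(weights_unchanged, weights_changed)
```

Asserting `owns_all_params(optimizer, model)` right before the training loop would turn the silent failure into a loud one.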

3. model.train() before model.to(device)

Ideally you should move your model to the device before calling train(). This is easy to detect, since PyTorch will raise a warning when you try this. Another mistake, one that raises an error, is having your model on one device and your input on another.
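A minimal sketch of the ordering described above, assuming a toy nn.Linear model and random inputs chosen only for illustration. It falls back to the CPU when no GPU is available:

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)      # move the model to the device first
model.train()                           # then switch to training mode

inputs = torch.randn(8, 4).to(device)   # inputs must live on the same device as the model
outputs = model(inputs)

same_device = next(model.parameters()).device == inputs.device
print(same_device, outputs.shape)
```

Keeping a single `device` variable and calling `.to(device)` on both the model and every batch is one simple way to avoid the device-mismatch error.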
