nn.CrossEntropyLoss() automatically applies a (log-)softmax to your model's raw outputs, so you should pass it the logits directly and not add a softmax layer at the end of your network yourself.
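A minimal sketch of the correct usage (the tensor shapes and values here are only illustrative):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()    # combines log-softmax and negative log-likelihood internally
logits = torch.randn(4, 3)           # raw, unnormalized outputs for a batch of 4 over 3 classes
labels = torch.tensor([0, 2, 1, 0])  # ground-truth class indices
loss = criterion(logits, labels)     # pass logits directly; do not softmax them first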
This happens especially in Jupyter Lab, where re-running a cell can create a new model object while the optimizer still holds references to the old one's parameters. It is particularly insidious because it won't raise any errors: your loss will appear stable, and the model will simply not learn anything.
import torch.optim as optim

optimizer = optim.SGD(model1.parameters(), lr=0.01)  # Optimizer holds references to model1's parameters
...
...
preds = model2(inputs)  # Forward pass runs through a different model, model2
loss = criterion(preds, labels)
loss.backward()         # Gradients accumulate on model2's parameters only
optimizer.step()        # Nothing happens: model1's parameters have no gradients
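The fix is to re-create the optimizer whenever the model is re-created, so that it holds references to the parameters of the model you are actually training (reusing the hypothetical model2 and learning rate from above):

optimizer = optim.SGD(model2.parameters(), lr=0.01)  # re-bind the optimizer to the live model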
Ideally, you should move your model to the device before calling train(). This mistake is easy to detect, since PyTorch will raise a warning when you try it. Another mistake, one that raises an error outright, is having your model on one device and your input on another.
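A minimal sketch of the usual device-handling pattern (model, loader, and criterion are assumed to already exist; the names are only illustrative):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)            # move the model once, before the training loop
for inputs, labels in loader:
    inputs = inputs.to(device)      # every batch must live on the same device as the model
    labels = labels.to(device)
    preds = model(inputs)           # a device mismatch here raises a RuntimeError
    loss = criterion(preds, labels)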