Last active
April 9, 2019 12:49
Using a parallel model and a parallel criterion in Pytorch
from parallel import DataParallelModel, DataParallelCriterion

parallel_model = DataParallelModel(model)             # Encapsulate the model
parallel_loss = DataParallelCriterion(loss_function)  # Encapsulate the loss function

predictions = parallel_model(inputs)       # Parallel forward pass
                                           # "predictions" is a tuple of n_gpu tensors
loss = parallel_loss(predictions, labels)  # Compute loss function in parallel
loss.backward()                            # Backward pass
optimizer.step()                           # Optimizer step
predictions = parallel_model(inputs)       # Parallel forward pass with new parameters
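To make the shape of the data flow concrete, here is a framework-free toy sketch of the scatter/compute/reduce pattern that `DataParallelModel` and `DataParallelCriterion` implement. All helper names (`scatter`, `parallel_forward`, `parallel_criterion`) and the toy model/loss are hypothetical stand-ins, not the gist's actual torch code, which uses `torch.nn.parallel` primitives:

```python
def scatter(batch, n_gpu):
    """Split a batch (a list of samples) into n_gpu roughly equal chunks."""
    chunk = (len(batch) + n_gpu - 1) // n_gpu
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]

def parallel_forward(model_fn, inputs, n_gpu):
    """Run a model replica on each chunk; returns a tuple, one output per 'GPU'."""
    return tuple(model_fn(chunk) for chunk in scatter(inputs, n_gpu))

def parallel_criterion(loss_fn, predictions, labels, n_gpu):
    """Compute the loss on each per-'GPU' output, then reduce to one scalar,
    so only small scalar losses (not the large outputs) are combined."""
    label_chunks = scatter(labels, n_gpu)
    per_gpu_losses = [loss_fn(p, t) for p, t in zip(predictions, label_chunks)]
    return sum(per_gpu_losses) / len(per_gpu_losses)

# Toy "model" (multiply by 2) and "loss" (mean squared error).
model_fn = lambda xs: [2.0 * x for x in xs]
loss_fn = lambda ps, ts: sum((p - t) ** 2 for p, t in zip(ps, ts)) / len(ps)

inputs = [1.0, 2.0, 3.0, 4.0]
labels = [2.0, 4.0, 6.0, 8.0]

predictions = parallel_forward(model_fn, inputs, n_gpu=2)
print(len(predictions))  # 2 -- a tuple with one output chunk per "GPU"

loss = parallel_criterion(loss_fn, predictions, labels, n_gpu=2)
print(loss)              # 0.0 -- this toy model fits the labels exactly
```

This is why `predictions` is a tuple rather than a single tensor: the gather step is deliberately skipped so the full output never has to fit on one device.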
Thanks for the solution @thomwolf.
By the way, I got this error message:
TypeError: add(): argument 'other' (position 1) must be Tensor, not tuple
after `predictions = parallel_model(inputs)`.
I might have done something wrong; any suggestions on this?
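One likely cause of this error (an assumption, not a confirmed diagnosis): with `DataParallelModel` the forward pass returns a tuple of per-GPU output chunks, so any downstream code that treats `predictions` as a single tensor (for example, adding another tensor to it) fails. Concatenating the chunks back into one batch restores the usual shape. A framework-free sketch of that gather step (in real torch code you would use `torch.nn.parallel.gather` or `torch.cat` instead):

```python
# Stand-in for the tuple of per-GPU output chunks returned by the parallel model.
predictions = ([0.1, 0.2], [0.3, 0.4])

def gather(chunks):
    """Concatenate per-GPU chunks back into one flat batch."""
    merged = []
    for chunk in chunks:
        merged.extend(chunk)
    return merged

full_batch = gather(predictions)
print(full_batch)  # [0.1, 0.2, 0.3, 0.4]
```

Note that gathering defeats part of the memory saving, since the full output is rebuilt on one device; passing the tuple straight to `DataParallelCriterion` avoids this.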
loss.backward()
crashes on multiple GPUs; loss is then a tuple of losses. Wouldn't summing them up go against the idea of this optimization, since it would bring them back to one GPU?
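If the criterion does come back as a tuple of per-GPU scalar losses, averaging them before calling `backward()` is the usual workaround, and it does not undo the optimization: only the tiny scalar losses are moved to one device, not the large prediction tensors, which stay sharded. A toy sketch with plain floats standing in for scalar loss tensors (the values are made up for illustration):

```python
# Stand-ins for one scalar loss tensor per GPU.
per_gpu_losses = (1.0, 2.0, 3.0, 2.0)

# Reduce the per-GPU scalars to a single value; this is cheap because each
# element is a single number, unlike the full per-GPU prediction tensors.
total_loss = sum(per_gpu_losses) / len(per_gpu_losses)
print(total_loss)  # 2.0
# In PyTorch this would be followed by: total_loss.backward()
```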