@thomwolf
Last active April 9, 2019 12:49
Using a parallel model and a parallel criterion in PyTorch
from parallel import DataParallelModel, DataParallelCriterion

parallel_model = DataParallelModel(model)             # Encapsulate the model
parallel_loss = DataParallelCriterion(loss_function)  # Encapsulate the loss function

predictions = parallel_model(inputs)       # Parallel forward pass
# "predictions" is a tuple of n_gpu tensors
loss = parallel_loss(predictions, labels)  # Compute loss function in parallel
optimizer.zero_grad()                      # Clear gradients accumulated by the previous step
loss.backward()                            # Backward pass
optimizer.step()                           # Optimizer step
predictions = parallel_model(inputs)       # Parallel forward pass with new parameters
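The snippet above relies on the gist's `parallel` module. The idea behind `DataParallelCriterion` can be sketched GPU-free in plain Python: the predictions stay split per replica (here, plain lists of floats stand in for per-GPU tensors), the labels are scattered to match, the loss is computed per replica, and only the scalar losses are reduced. All names below are illustrative, not the gist's actual implementation.

```python
def scatter(batch, n_replicas):
    """Split a batch into n_replicas roughly equal chunks."""
    chunk = (len(batch) + n_replicas - 1) // n_replicas
    return [batch[i * chunk:(i + 1) * chunk] for i in range(n_replicas)]

def mse(pred_chunk, label_chunk):
    """Per-replica loss: mean squared error over one chunk."""
    return sum((p - t) ** 2 for p, t in zip(pred_chunk, label_chunk)) / len(pred_chunk)

def parallel_criterion(predictions, labels):
    """Compute the loss on each replica, then average the scalar results."""
    label_chunks = scatter(labels, len(predictions))
    per_replica = [mse(p, t) for p, t in zip(predictions, label_chunks)]
    return sum(per_replica) / len(per_replica)

# "predictions" stands in for the tuple of n_gpu outputs returned by the model.
predictions = scatter([0.5, 1.5, 2.5, 3.5], n_replicas=2)
loss = parallel_criterion(predictions, labels=[1.0, 2.0, 3.0, 4.0])
print(loss)  # 0.25
```

Only the per-replica scalar losses cross device boundaries, which is exactly what keeps the large prediction tensors from being gathered back onto one GPU.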
@dmenig
Copy link

dmenig commented Jan 24, 2019

loss.backward() crashes on multiple GPUs: loss is then a tuple of losses. Wouldn't summing them up go against the idea of this optimization, since it would bring them back to one GPU?

@puttkraidej
Copy link

Thanks for the solution @thomwolf

By the way, I got this error message
TypeError: add(): argument 'other' (position 1) must be Tensor, not tuple
after predictions = parallel_model(inputs)

I might have done something wrong; any suggestions on this?
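This error typically means the tuple of per-GPU predictions was passed to code that expects a single tensor. One hedged workaround, assuming the per-GPU outputs share all dimensions except the batch dimension (shapes below are made up for the example), is to concatenate them on one device with `torch.cat` — noting that this does move everything back to a single device, which forgoes the memory benefit of keeping predictions split:

```python
import torch

# Stand-ins for the outputs of two GPU replicas: batch of 4 each, 10 classes.
per_gpu_predictions = (torch.randn(4, 10), torch.randn(4, 10))

# Merge along the batch dimension to obtain one tensor of batch size 8.
merged = torch.cat(per_gpu_predictions, dim=0)
print(merged.shape)  # torch.Size([8, 10])
```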
