- β οΈπ’ Issue: training error
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6[1,mpirank:0,algo-1]<stderr>:,0,0] Assertion `t >= 0 && t < n_classes` failed.
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
...
[1,mpirank:1,algo-2]<stdout>: File "train.py", line 675, in <module>
[1,mpirank:1,algo-2]<stdout>: main(task)
[1,mpirank:1,algo-2]<stdout>: File "train.py", line 572, in main