Now available here: https://github.com/y0ast/pytorch-snippets/tree/main/fast_mnist
Unfortunately that does not give the correct behavior: you're not randomizing your batches at each epoch, which leads to significantly reduced performance.
Yes, this normalization is 0 mean, 1 std. For a VAE + MNIST you generally model your data as a multivariate Bernoulli, which requires the data to be between 0 and 1.
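For reference, a minimal sketch of the two preprocessing choices (my own illustration, not code from the gist; the dataset path is a placeholder):

import torch
from torchvision import datasets

# Load the raw MNIST tensors once (uint8, values in [0, 255]).
train_dataset = datasets.MNIST(root="./data", train=True, download=True)
data = train_dataset.data.float()

# Scale to [0, 1]: suitable as input to a Bernoulli likelihood (VAE).
data_unit = data / 255.0

# Additionally standardize to 0 mean, 1 std: common for classifiers,
# but the values are then no longer in [0, 1].
data_std = (data_unit - data_unit.mean()) / data_unit.std()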
That's true, the shuffling should then be done manually. I think this should work (applying the same permutation to the targets as well, so the labels stay aligned with the images):

perm = torch.randperm(train_dataset.data.shape[0])
train_dataset.data, train_dataset.targets = train_dataset.data[perm], train_dataset.targets[perm]

(assuming the first dimension of train_dataset.data is the number of samples)
That's brilliant, thanks! 😃
Anyway, I've noticed that it is possible to completely bypass the DataLoaders and work directly with the tensors, which increases performance even further.
For example, when the batch size equals the whole training set (i.e. full-batch training), GPU usage is near 100% (NVIDIA GeForce MX150); it decreases as the batch size decreases.
Regarding execution time: with a batch size of 64 and 100 epochs, the total runtime went from around 238 s to around 181 s.
(This is just an indication; I haven't performed a complete and rigorous test.)
To use the tensors directly:
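The snippet that followed isn't reproduced above, but the idea is roughly as follows (a sketch under my own assumptions; the model, optimizer, and hyperparameters are illustrative placeholders, not the original code):

import torch
from torch import nn
from torchvision import datasets

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load MNIST once, scale to [0, 1], and keep the full tensors on the GPU.
train_set = datasets.MNIST(root="./data", train=True, download=True)
data = train_set.data.float().div_(255.0).to(device)    # shape [60000, 28, 28]
targets = train_set.targets.to(device)                  # shape [60000]

# Placeholder model and optimizer for illustration.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128),
                      nn.ReLU(), nn.Linear(128, 10)).to(device)
optimizer = torch.optim.Adam(model.parameters())

batch_size = 64
for epoch in range(100):
    # Reshuffle every epoch; one permutation keeps images and labels aligned.
    perm = torch.randperm(data.shape[0], device=device)
    data, targets = data[perm], targets[perm]

    # Slice the resident tensors directly instead of going through a DataLoader.
    for i in range(0, data.shape[0], batch_size):
        x, y = data[i:i + batch_size], targets[i:i + batch_size]
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()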
Furthermore, I think it's worth noting that the normalization you perform (after the scaling) is not always beneficial; this may depend on the task. For example, I'm working with autoencoders, and if I don't comment out that line I get bad results (in terms of reconstruction error).
Thanks again for your gist, I hope this helps improve performance even further for small datasets like MNIST! 😃