razhangwei/efficient_pytorch.md

Created April 11, 2019 14:29


Efficient PyTorch for extremely large datasets #PyTorch


Some simple (though not the most efficient) solutions:

torchvision.datasets.ImageFolder/torchvision.datasets.DatasetFolder + data.DataLoader
lmdb (Lightning Memory-Mapped Database)
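
Both options above come down to a map-style dataset that reads one sample at a time from disk instead of loading everything into RAM. The sketch below illustrates that idea with Python's standard `mmap` module as a stand-in for lmdb; it is not the lmdb API. The record layout (`RECORD`), the helper `write_records`, and the class `MmapDataset` are hypothetical names chosen for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: 4 float32 features followed by 1 int32 label.
RECORD = struct.Struct("<4fi")


def write_records(path, samples):
    """Write (features, label) pairs to a flat binary file."""
    with open(path, "wb") as f:
        for features, label in samples:
            f.write(RECORD.pack(*features, label))


class MmapDataset:
    """Map-style dataset: only __len__ and __getitem__ are implemented."""

    def __init__(self, path):
        self.path = path
        self._file = None
        self._mm = None
        self._length = os.path.getsize(path) // RECORD.size

    def _ensure_open(self):
        # Open lazily, so a process that copies this object before first use
        # (e.g. a data-loading worker) creates its own mapping.
        if self._mm is None:
            self._file = open(self.path, "rb")
            self._mm = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        self._ensure_open()
        # Only the pages backing this record are read from disk.
        *features, label = RECORD.unpack_from(self._mm, index * RECORD.size)
        return features, label


# Small demo with values that are exactly representable as float32.
path = os.path.join(tempfile.mkdtemp(), "samples.bin")
write_records(path, [
    ([0.0, 0.5, 1.0, 1.5], 0),
    ([1.0, 1.5, 2.0, 2.5], 1),
    ([2.0, 2.5, 3.0, 3.5], 2),
])
ds = MmapDataset(path)
print(len(ds), ds[1])
```

Because the class follows the map-style protocol, an instance should be usable with `torch.utils.data.DataLoader` (for example with `batch_size` and `num_workers` set), though that wiring is left out here to keep the sketch dependency-free.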

Key technologies:

LAPACK for processing the data
MAGMA support: MAGMA is a collection of next-generation, GPU-accelerated linear algebra (LA) libraries designed and implemented by the team that developed LAPACK and ScaLAPACK.