mingfeima/pytorch_perf_optimization_cpu.md

Last active December 29, 2017 01:51


PyTorch Performance Optimization on CPU

PyTorch MKLDNN integration prototype design

MKLDNN conv integration
conv3d parallelization: vol2col, col2vol
LSTM optimization non-fused: tanh/sigmoid parallelization
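
For reference, here is a minimal pure-Python sketch of what a vol2col unfold does. It is not the PyTorch kernel: the function name `vol2col`, the nested-list layout `[C][D][H][W]`, and the restriction to no padding and no dilation are all assumptions made for illustration. Each output row depends only on the input volume, so the outer loops are a plausible place to split work across threads in a native implementation.

```python
def vol2col(vol, kernel, stride=1):
    """Unfold a [C][D][H][W] nested-list volume into a column matrix.

    Illustrative only: no padding, no dilation, same stride on all axes.
    Returns C*kD*kH*kW rows, each holding oD*oH*oW values.
    """
    kd, kh, kw = kernel
    C, D, H, W = len(vol), len(vol[0]), len(vol[0][0]), len(vol[0][0][0])
    # Output extent along each spatial axis.
    od = (D - kd) // stride + 1
    oh = (H - kh) // stride + 1
    ow = (W - kw) // stride + 1
    cols = []
    # One row per (channel, kernel offset); rows are independent of each other.
    for c in range(C):
        for z in range(kd):
            for y in range(kh):
                for x in range(kw):
                    row = [
                        vol[c][z + i * stride][y + j * stride][x + k * stride]
                        for i in range(od)
                        for j in range(oh)
                        for k in range(ow)
                    ]
                    cols.append(row)
    return cols


# A single-channel 2x2x2 volume holding the values 0..7.
volume = [[[[0, 1], [2, 3]], [[4, 5], [6, 7]]]]
columns = vol2col(volume, kernel=(1, 2, 2))
```

With the column matrix in hand, the convolution reduces to a matrix multiply against the flattened weights; col2vol is the reverse scatter-add used for the gradient.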

Create MKLDNN conda channel
MKLDNN tensor type

create lib/THMKL?
create ATen backend?
move mkldnn_conv_xxx from torch/csrc to ATen
MKLDNN global primitive cache
reorder definition
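
One way to picture the global primitive cache item is a map from an operation's configuration to an already-built primitive, so that repeated calls with the same shapes skip the setup cost. The sketch below is only a toy model in Python: `PrimitiveCache`, `conv_key`, and the `factory` callback are hypothetical names, and the choice of key fields is an assumption, not the design this outline settled on.

```python
class PrimitiveCache:
    """Toy cache: build a primitive once per distinct key, then reuse it."""

    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get_or_create(self, key, factory):
        # `factory` is only called on a miss, which is where the
        # expensive primitive creation would happen.
        try:
            prim = self._cache[key]
            self.hits += 1
        except KeyError:
            prim = self._cache[key] = factory()
            self.misses += 1
        return prim


def conv_key(input_shape, weight_shape, stride, padding):
    """Hashable key from the fields assumed to determine a conv primitive."""
    return ("conv_fwd", tuple(input_shape), tuple(weight_shape),
            tuple(stride), tuple(padding))


cache = PrimitiveCache()
key = conv_key([1, 3, 224, 224], [64, 3, 7, 7], [2, 2], [3, 3])
first = cache.get_or_create(key, lambda: object())   # miss: builds it
second = cache.get_or_create(key, lambda: object())  # hit: reuses it
```

A real cache would also need to decide on thread safety and eviction, which this sketch leaves out.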

RNN fused kernel on Xeon
