- PyTorch MKLDNN integration prototype design
- MKLDNN conv integration
- conv3d parallelization: vol2col, col2vol
- Non-fused LSTM optimization: tanh/sigmoid parallelization
- Create MKLDNN conda channel
- MKLDNN tensor type
- create lib/THMKL?
- create ATen backend?
- move mkldnn_conv_xxx from torch/csrc to ATen
- MKLDNN global primitive cache
- reorder definition
- RNN fused kernel on Xeon
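
Rough sketch for the global primitive cache item, to make the idea concrete. This is not the MKLDNN or PyTorch API: `PrimitiveCache`, `conv_key`, and `GLOBAL_CACHE` are hypothetical names, and the LRU eviction policy and the choice of key fields are assumptions. The point is only that a primitive is built once per op configuration and reused on later calls with the same shapes and parameters.

```python
import threading
from collections import OrderedDict


class PrimitiveCache:
    """Hypothetical LRU cache mapping an op configuration to a built primitive."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._entries = OrderedDict()
        self._lock = threading.Lock()  # cache is process-global, so guard access
        self.hits = 0
        self.misses = 0

    def get_or_create(self, key, factory):
        """Return the cached primitive for `key`, building it with `factory` on a miss."""
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)  # mark as most recently used
                self.hits += 1
                return self._entries[key]
            self.misses += 1
            primitive = factory()
            self._entries[key] = primitive
            if len(self._entries) > self.capacity:
                self._entries.popitem(last=False)  # evict least recently used
            return primitive


def conv_key(input_shape, weight_shape, stride, padding, dilation, groups):
    """Build a hashable key from the fields assumed to determine a conv primitive."""
    return (
        "conv",
        tuple(input_shape),
        tuple(weight_shape),
        tuple(stride),
        tuple(padding),
        tuple(dilation),
        groups,
    )


# Single shared instance; a tiny capacity is used here only to show eviction.
GLOBAL_CACHE = PrimitiveCache(capacity=2)
```

Usage would look like `GLOBAL_CACHE.get_or_create(conv_key(...), build_fn)`, where `build_fn` stands in for whatever constructs the real primitive. Whether layout (reorder) information also belongs in the key is left open.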