Created April 23, 2018 14:28
Gist: soumith/e06ce27f286ada07cd37bf2827200931
https://github.com/pytorch/pytorch/pull/3526 Reuse intermediate results over multiple backwards grad_inputs
https://github.com/pytorch/pytorch/pull/3509 Optimizer: optimize transposes in variety of circumstances
https://github.com/pytorch/pytorch/pull/3409 Add Tensor Core ops to RNNs for Volta
https://github.com/pytorch/pytorch/pull/3370 Follow up #3211 (sparse broadcast_coalesced, reduce_add_coalesced)
https://github.com/pytorch/pytorch/pull/3336 Prevent numerical issues with poisson_nll_loss when log_input=False
https://github.com/pytorch/pytorch/pull/2764 [Done]parallelize elementwise operation with openmp
https://github.com/pytorch/pytorch/pull/6110 Fix bilinear performance regression
https://github.com/pytorch/pytorch/pull/6078 Exp, log, sin, cos vectorized
https://github.com/pytorch/pytorch/pull/6062 Enable MKLDNN convolution forward and backward
https://github.com/pytorch/pytorch/pull/6026 Speed up sum over a dimension
https://github.com/pytorch/pytorch/pull/5913 Optimize unique sorting by using std::vector+sort instead of std::set
https://github.com/pytorch/pytorch/pull/5782 fused GLU backward
https://github.com/pytorch/pytorch/pull/5747 Save self.numel() for backward computation instead of self
https://github.com/pytorch/pytorch/pull/5722 Add optimization to norm for common norms
https://github.com/pytorch/pytorch/pull/5710 improve occupancy for cuda rngs
https://github.com/pytorch/pytorch/pull/5680 implement TripletMarginLoss as a native function
https://github.com/pytorch/pytorch/pull/5646 implement CosineEmbeddingLoss as a native function and add reduce arg
https://github.com/pytorch/pytorch/pull/5640 Revert "implement CosineEmbeddingLoss as a native function and add reduce arg"
https://github.com/pytorch/pytorch/pull/5447 implement CosineEmbeddingLoss as a native function and add reduce arg
https://github.com/pytorch/pytorch/pull/5433 speed up CPU EmbeddingBag (indexSelectAdd op)
https://github.com/pytorch/pytorch/pull/5346 Implement MarginRankingLoss as native function and add reduce=True arg to it
https://github.com/pytorch/pytorch/pull/5279 Speed-up nn.Linear for the 3d input case
https://github.com/pytorch/pytorch/pull/5080 Implement hinge_embedding_loss as a native function.
https://github.com/pytorch/pytorch/pull/5064 DDP: 10% of NCCL backend perf improvements with mixed-prec support
https://github.com/pytorch/pytorch/pull/5054 Use fast integer division algorithm to avoid division ops inside kernels.
https://github.com/pytorch/pytorch/pull/5010 add AVX2 implementation for sigmoid function
https://github.com/pytorch/pytorch/pull/4924 add reduce=True argument to MultiLabelMarginLoss
https://github.com/pytorch/pytorch/pull/4870 Slightly improve DistributedDataParallel (single-GPU binding) multi-process distributed training performance
https://github.com/pytorch/pytorch/pull/4824 parallelize vol2col and col2vol of Conv3D with CPU backend
https://github.com/pytorch/pytorch/pull/4803 More efficient squeeze() backward in edge case
https://github.com/pytorch/pytorch/pull/4705 adds reduce argument to BCEWithLogitsLoss interface
https://github.com/pytorch/pytorch/pull/4312 Vectorize normal_
https://github.com/pytorch/pytorch/pull/4231 Add reduce arg to BCELoss
https://github.com/pytorch/pytorch/pull/4183 Allowing usage of GPU Direct within PyTorch for the Broadcast operation
https://github.com/pytorch/pytorch/pull/4174 Rearrange dimensions for pointwise operations for better performance.
https://github.com/pytorch/pytorch/pull/4094 Implement pin_memory() as a NativeFunction