- FaceNet (Google)
- They use a triplet loss with the goal of keeping the L2 intra-class distances low and inter-class distances high
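- The triplet loss has the form below (notation: f is the embedding, a/p/n are the anchor, positive and negative samples, α is the margin, and []₊ is the hinge at zero):

```latex
L = \sum_{i=1}^{N} \Big[ \, \lVert f(x_i^a) - f(x_i^p) \rVert_2^2 \;-\; \lVert f(x_i^a) - f(x_i^n) \rVert_2^2 \;+\; \alpha \, \Big]_{+}
```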
- DeepID (Hong Kong University)
- They use verification and identification signals to train the network. After each convolutional layer there is an identity layer connected to the supervisory signals, so that each layer is supervised directly (on top of the normal backprop)
- DeepFace (Facebook)
- Convolutional layers, followed by locally connected layers, followed by fully connected layers
- Quantized Convolutional Neural Networks for Mobile Devices CVPR 2016
- Deep SimNets CVPR 2016
- Quantized Convolutional Neural Networks for Mobile Devices (2016)
- Quantization is applied to both the convolutional and the fully connected layers. The advantage of this method is that it accelerates the runtime of the convolutional layers, which is very important given that those are the most computationally expensive layers in CNNs. The disadvantage is that it leads to a small loss in accuracy. A rough sketch of the general lookup-table idea is shown below.
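- A minimal sketch of the lookup-table trick behind this kind of quantization (my own simplification, not the paper's exact product-quantization algorithm; `QuantizedLayer` and its fields are hypothetical names). Weight sub-vectors are replaced by indices into a small codebook, and inner products with the input are served from a table computed once per input:

```cpp
#include <vector>
#include <cstddef>

struct QuantizedLayer {
    std::size_t num_subvectors;                 // how many sub-vectors each weight vector is split into
    std::size_t subvector_dim;                  // dimensionality of each sub-vector
    std::vector<std::vector<float>> codebook;   // codebook[k] is one codeword of length subvector_dim
    std::vector<std::vector<int>> weight_codes; // weight_codes[o][s] = codeword index for output o, sub-vector s

    // Compute all outputs for one input vector of length num_subvectors * subvector_dim.
    std::vector<float> forward(const std::vector<float>& input) const {
        // 1) Precompute dot products between every input sub-vector and every codeword:
        //    lut[s][k] = <input sub-vector s, codeword k>
        std::vector<std::vector<float>> lut(num_subvectors,
                                            std::vector<float>(codebook.size(), 0.0f));
        for (std::size_t s = 0; s < num_subvectors; ++s)
            for (std::size_t k = 0; k < codebook.size(); ++k)
                for (std::size_t d = 0; d < subvector_dim; ++d)
                    lut[s][k] += input[s * subvector_dim + d] * codebook[k][d];

        // 2) Each output is now a sum of table lookups instead of full dot products.
        std::vector<float> output(weight_codes.size(), 0.0f);
        for (std::size_t o = 0; o < weight_codes.size(); ++o)
            for (std::size_t s = 0; s < num_subvectors; ++s)
                output[o] += lut[s][weight_codes[o][s]];
        return output;
    }
};

int main() {
    QuantizedLayer layer;
    layer.num_subvectors = 2;
    layer.subvector_dim = 2;
    layer.codebook = {{1.0f, 0.0f}, {0.0f, 1.0f}};  // two toy codewords
    layer.weight_codes = {{0, 1}, {1, 1}};          // two outputs, one code per sub-vector
    std::vector<float> out = layer.forward({0.5f, 2.0f, 3.0f, 4.0f});
    // out[0] = 0.5 + 4.0 = 4.5, out[1] = 2.0 + 4.0 = 6.0
}
```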
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size (2016)
- Use 1x1 convolutions to reduce the number of feature maps. I.e. if we have 10 traditional convolution filters this will generate 10 feature maps; if we then apply N 1x1 convolutions (where N < 10), the result is only N feature maps, so the following larger filters see fewer input channels and need far fewer parameters (see the rough count below).
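- A rough parameter count illustrating the saving (my own illustration, not numbers from the paper), for a 3x3 convolution over C_in input maps producing C_out maps, with and without squeezing the input down to N maps first:

```latex
\text{direct: } 3 \cdot 3 \cdot C_{in} \cdot C_{out}
\qquad
\text{squeezed: } \underbrace{1 \cdot 1 \cdot C_{in} \cdot N}_{\text{squeeze}} \;+\; \underbrace{3 \cdot 3 \cdot N \cdot C_{out}}_{\text{expand}}, \quad N < C_{in}
```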
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Deep Residual Learning for Image Recognition (2015)
- Uses identity shortcut connections that skip one or more layers and merge back by adding their input to the output of the last skipped layer. The point of such networks is to be able to train deeper networks without the well-known vanishing gradient problem. They show that residual networks are easier to optimize and can achieve better accuracy as depth increases. The same architecture is used with success for classification, feature extraction, object detection and segmentation tasks.
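- The building block from the paper, where x is the block input and F is the residual mapping learned by the skipped layers (an identity shortcut adds x back to the output):

```latex
y = F(x, \{W_i\}) + x
```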
- Sergey Ioffe, Christian Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015)
- An approach to reduce the internal covariate shift by fixing the distribution of each layer's inputs, thus allowing for much faster learning without vanishing/exploding gradients
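- The transform applied per mini-batch, as in the paper: activations are normalized with the batch mean μ_B and variance σ_B², then rescaled with learned parameters γ and β:

```latex
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta
```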
- Coursera Machine Learning course by Andrew Ng
- Linear regression
- Logistic regression
- Neural networks (basics)
- Machine learning tips (how to apply in real situations) and example application
- SVMs
- Unsupervised learning
- Anomaly detection
- Large scale learning
- WIDER FACE: A Face Detection Benchmark CVPR 2016
- shared_ptr
- Raw pointer can be co-owned by several shared pointers and a reference count is kept. Memory is released when the reference count reaches 0.
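- A minimal example of shared ownership and the reference count:

```cpp
#include <memory>
#include <iostream>

int main() {
    // Two shared_ptrs co-own the same int; the control block keeps the count.
    std::shared_ptr<int> a = std::make_shared<int>(42);
    std::shared_ptr<int> b = a;             // copy: reference count goes to 2
    std::cout << a.use_count() << "\n";     // prints 2

    b.reset();                              // count drops back to 1
    std::cout << a.use_count() << "\n";     // prints 1
}   // count reaches 0 here and the int is released
```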
- unique_ptr
- Raw pointer can be owned by only one unique pointer. It cannot be copied, only moved, so no reference count is needed; memory is released when the pointer goes out of scope.
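- A minimal example of exclusive ownership and transfer by move:

```cpp
#include <memory>
#include <utility>

int main() {
    std::unique_ptr<int> p = std::make_unique<int>(7);
    // std::unique_ptr<int> q = p;          // does not compile: copying is forbidden
    std::unique_ptr<int> q = std::move(p);  // ownership is transferred instead
    // p is now empty (nullptr); q owns the int
}   // q goes out of scope here and the int is released
```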
- weak_ptr
- Does not grant access to the pointed-to data; it is a view-only pointer which can be used to query the status of the pointed-to data (whether it still exists or not) and to create a shared pointer from it.
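- A minimal example of observing data without owning it:

```cpp
#include <memory>
#include <iostream>

int main() {
    std::weak_ptr<int> w;
    {
        std::shared_ptr<int> s = std::make_shared<int>(5);
        w = s;                                          // observes s without owning it
        std::cout << w.expired() << "\n";               // 0: the data still exists
        if (std::shared_ptr<int> locked = w.lock())     // create a shared_ptr to access the data
            std::cout << *locked << "\n";               // prints 5
    }                                                   // last shared_ptr gone, data released
    std::cout << w.expired() << "\n";                   // 1: the data no longer exists
}
```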