Name: Issam H. Laradji
Email: [email protected]
Github: IssamLaradji
Time zone: UTC+03:00
Blog: http://easymachinelearning.blogspot.com/
GSoC Blog RSS feed:
University: King Fahd University of Petroleum & Minerals
Major: Computer Science
Current Year: Fourth semester, Master's degree
Expected Graduation Date: May 15, 2014
Degree: Master of Science
Extending Neural Networks Module for Scikit-learn
The project has three main objectives:
- To implement Extreme Learning Machines (ELM), Sequential ELM, and Regularized ELM
- To implement Sparse Auto-encoders
- To extend the Multi-layer Perceptron module to support more than one hidden layer
Below I will explain each contribution in detail:
Extreme Learning Machines (ELM): ELM is a powerful predictor with high generalization performance. It takes the form of a single-hidden-layer feedforward network whose input weights are generated at random; the output-layer weights are then obtained as a least-squares solution, which makes training very fast. The implementation will be based on the work of Huang et al. [1].
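As a rough illustration, the basic ELM fit can be sketched in a few lines of NumPy. The function names, the tanh activation, and the default parameters below are placeholders of my own, not the final scikit-learn API:

import numpy as np

def elm_fit(X, T, n_hidden=50, random_state=0):
    rng = np.random.RandomState(random_state)
    # Input weights and biases are drawn at random and never updated.
    W = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    # Hidden-layer activations for the whole training set.
    H = np.tanh(np.dot(X, W) + b)
    # Output weights are the least-squares solution of H * beta = T.
    beta = np.linalg.lstsq(H, T)[0]
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.dot(np.tanh(np.dot(X, W) + b), beta)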
Sequential ELM: One drawback of ELM is that it must process the matrix representing the whole dataset at once, which causes problems on computers with little memory. Sequential ELM counteracts this by training on the dataset in relatively small batches: it updates the output weights exactly as each new batch arrives, using a recursive least-squares scheme. The implementation will be based on the work of Huang et al. [2].
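A sketch of the per-batch update is given below, assuming the hidden activations H_batch of the new batch have already been computed and that beta and P were obtained from an initial batch solved as in basic ELM; the function name is illustrative only:

import numpy as np

def oselm_partial_fit(H_batch, T_batch, beta, P):
    # `beta` holds the current output weights and `P` plays the role of the
    # inverse of the accumulated H^T H over all batches seen so far.
    K = np.eye(H_batch.shape[0]) + np.dot(np.dot(H_batch, P), H_batch.T)
    P = P - np.dot(np.dot(P, H_batch.T), np.linalg.solve(K, np.dot(H_batch, P)))
    beta = beta + np.dot(np.dot(P, H_batch.T), T_batch - np.dot(H_batch, beta))
    return beta, P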
Regularized ELM: In ELM, the least-squares problem for the unknown output weights is over-determined, as the number of samples far outnumbers the number of unknown variables; the algorithm can therefore overfit the training set. Regularized ELM counteracts this overfitting problem by constraining the solution to be small, similarly to the regularization term in the Multi-layer Perceptron. The implementation will be based on the work of Huang et al. [3].
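Under the formulation in [3], the regularized solution amounts to a ridge-like linear solve; the sketch below is only illustrative, and the parameter C would map to a user-facing regularization setting:

import numpy as np

def regularized_elm_solve(H, T, C=1.0):
    # Penalizing the norm of beta keeps the least-squares solution small,
    # analogous to the L2 penalty used in the Multi-layer Perceptron.
    A = np.dot(H.T, H) + np.eye(H.shape[1]) / C
    return np.linalg.solve(A, np.dot(H.T, T))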
Sparse Auto-encoders (SAE): Feature extraction has long been in the spotlight of machine learning research. Commonly used for image recognition, SAE extracts useful structural information from the input samples: it learns to reconstruct its inputs while constraining the extracted hidden features to be sparse. Its objective function is that of a standard Multi-layer Perceptron plus an additional term, the Kullback–Leibler divergence [4], which imposes the sparsity constraint. Besides extracting new features, it can also provide good initial weights for networks trained with backpropagation. The implementation will be based on Andrew Ng's notes [5].
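The sparsity term from [5] can be computed as in the sketch below; rho is the target mean activation, and the function name is only a placeholder:

import numpy as np

def kl_sparsity_penalty(hidden_activations, rho=0.05):
    # rho_hat[j] is the mean activation of hidden unit j over the batch; the
    # penalty grows as it drifts away from the small target value rho.
    rho_hat = np.mean(hidden_activations, axis=0)
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return np.sum(kl)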
Greedy Layer-Wise Training of Deep Networks: This would allow the Multi-layer Perceptron module to support more than one hidden layer. In addition, it would use scikit-learn's pipelining facilities so that the network's weights can be initialized from either Sparse Auto-encoders or Restricted Boltzmann Machines. My main reference for the implementation is the UFLDL tutorial [6].
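For context, existing scikit-learn building blocks already compose in this layer-wise fashion: the sketch below stacks two BernoulliRBMs and a logistic regression in a Pipeline, with arbitrary layer sizes and learning rates. The proposed work would add Sparse Auto-encoders as an alternative pre-training step and use the pre-trained weights to initialize a Multi-layer Perceptron rather than a plain linear classifier.

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

digits = load_digits()
X, y = digits.data / 16.0, digits.target  # scale features to [0, 1] for the RBMs

# Each RBM learns features greedily from the output of the previous layer;
# a supervised model is trained on top of the final representation.
model = Pipeline([
    ('rbm1', BernoulliRBM(n_components=256, learning_rate=0.05, random_state=0)),
    ('rbm2', BernoulliRBM(n_components=64, learning_rate=0.05, random_state=0)),
    ('logistic', LogisticRegression()),
])
model.fit(X, y)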
Week 1, 2 (May 19 - June 1)
Goal: Implement and revise Extreme Learning Machines.
Week 3, 4 (June 2 - June 15)
Goal: Implement and revise Sequential Extreme Learning Machines.
Week 5, 6 (June 16 - June 29)
Goal: Implement and revise Regularized Extreme Learning Machines.
Note - Pass midterm evaluation on June 27
Week 7, 8, 9 (June 30 - July 20)
Goal: Implement and revise Sparse Auto-encoders.
Week 10, 11, 12 (July 21 - August 10)
Goal: Implement and revise Greedy Layer-Wise Training of Deep Networks.
Week 13 - Wrap-up
- scikit-learn/scikit-learn#2120 - (not merged) - Multi-layer Perceptron. About to be merged; it is waiting for a few changes and a final review.
- scikit-learn/scikit-learn#2099 - (not merged) - Sparse Auto-encoders.
- scikit-learn/scikit-learn#2680 - (not merged) - Gaussian Restricted Boltzmann Machines.
[1] Extreme Learning Machines (Huang et al.) - http://www.di.unito.it/~cancelli/retineu11_12/ELM-NC-2006.pdf
[2] Online Sequential ELM (Huang et al.) - http://www.ntu.edu.sg/home/egbhuang/pdf/OS-ELM-TNN.pdf
[3] Regularized ELM (Huang et al.) - http://www.ntu.edu.sg/home/egbhuang/pdf/ELM-Unified-Learning.pdf
[4] Kullback–Leibler divergence (Wikipedia) - http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
[5] Sparse Autoencoder lecture notes (Andrew Ng, CS294A) - http://www.stanford.edu/class/cs294a/sparseAutoencoder.pdf
[6] UFLDL tutorial, Implement deep networks for digit classification - http://ufldl.stanford.edu/wiki/index.php/Exercise:_Implement_deep_networks_for_digit_classification