#Regularization and variable selection via the elastic net
- Regularization and variable selection method.
- Produces sparse representations.
- Exhibits a grouping effect: strongly correlated predictors tend to enter or leave the model together.
- Particularly useful when the number of predictors (p) is much larger than the number of observations (n); the sketch below uses such a setting.
- Proposes the LARS-EN algorithm to compute the entire elastic net regularization path.
- Link to the paper.
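A minimal sketch of elastic net regression using scikit-learn's `ElasticNet` (which fits by coordinate descent rather than the paper's LARS-EN, so this shows the estimator, not the path algorithm):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Toy p >> n setting: 20 observations, 100 predictors, only 5 of which matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))
true_coef = np.zeros(100)
true_coef[:5] = 2.0
y = X @ true_coef + 0.1 * rng.normal(size=20)

# l1_ratio mixes the lasso (L1) and ridge (L2) penalties.
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print((model.coef_ != 0).sum())  # sparse fit: only a few non-zero coefficients
```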
#Efficient Estimation of Word Representations in Vector Space
- Introduces techniques (the CBOW and skip-gram models) to learn word vectors from large text datasets.
- The learned vectors can be used to find similar words (semantically, syntactically, etc.); see the sketch below.
- Link to the paper.
- Link to the open-source implementation.
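A minimal sketch using the gensim library (an assumption; the linked implementation is the original C word2vec tool) to train skip-gram vectors and query similar words:

```python
from gensim.models import Word2Vec

# Tiny corpus purely for illustration; the technique needs large text datasets.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "cat", "sits", "on", "the", "mat"],
]

# sg=1 selects the skip-gram model; vector_size is the embedding dimension.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Nearest neighbours in the vector space approximate "similar words".
print(model.wv.most_similar("king", topn=3))
```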
#Large Scale Distributed Deep Networks
- In machine learning, accuracy tends to increase with the number of training examples and the number of model parameters.
- For large models and datasets, training becomes slow even on a GPU (in part due to increased CPU-GPU data transfer).
- Solution: distributed training and inference with the DistBelief framework; its asynchronous Downpour SGD is sketched below.
- Link to the paper.
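A toy, single-process sketch of the Downpour-SGD idea (asynchronous workers fetching parameters from, and pushing gradients to, a shared parameter server); names like `ParameterServer` are illustrative, not DistBelief's API:

```python
import threading
import numpy as np

class ParameterServer:
    """Holds the shared parameters; workers fetch them and push gradients."""
    def __init__(self, dim, lr=0.01):
        self.w, self.lr = np.zeros(dim), lr
        self.lock = threading.Lock()

    def fetch(self):
        with self.lock:
            return self.w.copy()

    def push(self, grad):
        with self.lock:
            self.w -= self.lr * grad

def worker(ps, X, y, steps=200):
    for _ in range(steps):
        w = ps.fetch()                         # possibly stale parameters
        grad = 2 * X.T @ (X @ w - y) / len(y)  # least-squares gradient on shard
        ps.push(grad)                          # asynchronous update

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = X @ np.arange(10.0)                        # true weights are 0..9
ps = ParameterServer(dim=10)
shards = np.array_split(np.arange(400), 4)     # one data shard per worker
threads = [threading.Thread(target=worker, args=(ps, X[s], y[s]))
           for s in shards]
for t in threads: t.start()
for t in threads: t.join()
print(np.round(ps.fetch(), 2))                 # close to [0, 1, ..., 9]
```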
#Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
- The GraphLab abstraction exposes an asynchronous, dynamic, graph-parallel computation model in the shared-memory setting.
- This paper extends the abstraction to the distributed setting.
- Link to the paper.
- Graph-Structured Computation
- Sometimes the computation requires modeling dependencies between the data points; a toy vertex-update sketch follows below.
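A toy, single-machine sketch of the graph-parallel idea: an update function run per vertex over its local neighbourhood, shown here for PageRank (this mimics the style of a GraphLab update function, not its actual API):

```python
# Directed graph as adjacency lists: vertex -> out-neighbours.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
in_nbrs = {v: [u for u in graph if v in graph[u]] for v in graph}
rank = {v: 1.0 / len(graph) for v in graph}

def update(v):
    """Per-vertex update: recompute v's rank from its in-neighbours' values."""
    rank[v] = 0.15 / len(graph) + 0.85 * sum(
        rank[u] / len(graph[u]) for u in in_nbrs[v])

# A real engine schedules these updates asynchronously and dynamically;
# here we simply sweep over all vertices a few times.
for _ in range(20):
    for v in graph:
        update(v)
print({v: round(r, 3) for v, r in rank.items()})
```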
#Learning to Execute
- Evaluates whether LSTMs can express and learn to execute short, simple programs (linear time, constant memory) in the sequence-to-sequence framework; an example input-output pair is sketched below.
- Link to the paper.
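A sketch of the kind of character-level (program, output) pair involved; the generator below is hypothetical, not the paper's data pipeline:

```python
import random

def make_example(max_val=99):
    """One (program, output) pair; both sides are read character-by-character."""
    a, b = random.randint(0, max_val), random.randint(0, max_val)
    program = f"print({a}+{b})"  # source program fed to the encoder
    target = str(a + b)          # program output to be predicted by the decoder
    return program, target

print(make_example())            # e.g. ('print(17+42)', '59')
```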
#Memory Networks
- Memory Networks combine inference components with a long-term memory component.
- Used in the context of Question Answering (QA), with the memory component acting as a (dynamic) knowledge base.
- Link to the paper.
#End-To-End Memory Networks
- A neural network with a recurrent attention model over a (possibly large) external memory.
- A continuous form of the Memory Network, but trained end-to-end, so it can be applied to more domains; a single memory hop is sketched below.
- An extension of RNNSearch that can perform multiple hops (computational steps) over the memory per output symbol.
- Link to the paper.
- Link to the implementation.
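A minimal numpy sketch of one soft-attention hop over the memory, the core operation of the model (the shapes and random embeddings are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_mem = 20, 5                 # embedding size, number of memory slots

m = rng.normal(size=(n_mem, d))  # input memory embeddings (matched to query)
c = rng.normal(size=(n_mem, d))  # output memory embeddings (read out)
u = rng.normal(size=d)           # embedded query

p = softmax(m @ u)               # soft attention weights over memory slots
o = p @ c                        # weighted sum of output embeddings
u_next = u + o                   # state fed to the next hop (or final softmax)
```

Stacking several such hops before a final prediction layer yields the multiple computational steps per symbol mentioned above.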
#Curriculum Learning
- Curriculum learning: when training machine learning models, start with easier subtasks and gradually increase the difficulty level; a toy schedule is sketched below.
- The motivation comes from the observation that humans and animals seem to learn better when trained with a curriculum-like strategy.
- Link to the paper.
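A toy sketch of a curriculum schedule; the difficulty score and the staged schedule are illustrative assumptions, not the paper's specific experiments:

```python
def curriculum_pools(examples, difficulty, n_stages=4):
    """Yield training pools that grow from the easiest examples to all of them."""
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, n_stages + 1):
        cutoff = len(ordered) * stage // n_stages
        yield ordered[:cutoff]      # train on progressively harder pools

# Example: treat sentence length as a crude proxy for difficulty.
data = ["a b", "a b c d e", "a", "a b c", "a b c d"]
for pool in curriculum_pools(data, difficulty=len):
    print(pool)                     # each stage adds harder examples
```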
#Visualizing Data using t-SNE
- A method to visualize high-dimensional data points in 2/3-dimensional space.
- Data visualization techniques like Chernoff faces and graph approaches only provide a representation, not an interpretation.
- Dimensionality reduction techniques often fail to retain both the local and the global structure of the data simultaneously. For example, PCA and MDS are linear techniques and fail on data lying on a non-linear manifold.
- The t-SNE approach converts the data into a matrix of pairwise similarities and visualizes this matrix; a usage sketch follows below.
- Based on SNE (Stochastic Neighbor Embedding).
- Link to the paper.
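A minimal usage sketch via scikit-learn's `TSNE` (an independent implementation, not the authors' own code):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))  # 200 points in 50 dimensions

# Embed into 2-D; perplexity balances local vs. global neighbourhood size.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)                # (200, 2), ready for a scatter plot
```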
#Visualizing Large-scale and High-dimensional Data
- LargeVis: a technique to visualize large-scale and high-dimensional data in a low-dimensional space.
- The problem relates to both the information visualization and the machine learning (and data mining) domains.
- The method first constructs an approximate K-nearest-neighbour graph over the data and then lays this graph out in the low-dimensional space; the first stage is sketched below.
- Link to the paper.
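A sketch of the K-nearest-neighbour-graph stage using scikit-learn's exact `NearestNeighbors` (an assumption standing in for the paper's approximate search via random-projection trees):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))  # high-dimensional input points

# Stage one of the pipeline: a K-nearest-neighbour graph over the data.
nn = NearestNeighbors(n_neighbors=10).fit(X)
dist, idx = nn.kneighbors(X)     # idx[i] lists the 10 nearest points to i
print(idx.shape)                 # (1000, 10)
```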