A Code-First Introduction to Natural Language Processing

2

Singular Value Decomposition (SVD)

We would expect the words that appear most frequently in one topic to appear less frequently in the other; otherwise those words would not be a good choice for separating the two topics. Therefore, we expect the topics to be orthogonal.

The SVD algorithm factorizes a matrix into one matrix with orthogonal columns and one with orthogonal rows (along with a diagonal matrix, which contains the relative importance of each factor).
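The factorization described above can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd`; the small random matrix `A` is a hypothetical stand-in for a document-by-word count matrix and is not taken from the course material.

```python
import numpy as np

# Hypothetical toy matrix standing in for a document-by-word matrix.
rng = np.random.default_rng(0)
A = rng.random((6, 4))

# Reduced SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# U has orthonormal columns, Vt has orthonormal rows.
cols_orthonormal = np.allclose(U.T @ U, np.eye(U.shape[1]))
rows_orthonormal = np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))

# s holds the singular values (the diagonal matrix), largest first,
# and the three factors multiply back to the original matrix.
reconstructs = np.allclose(U @ np.diag(s) @ Vt, A)

print(U.shape, s.shape, Vt.shape)
print(cols_orthonormal, rows_orthonormal, reconstructs)
```

The singular values in `s` play the role of the "relative importance" of each factor: a larger value means that factor accounts for more of the matrix.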

Non-negative Matrix Factorization (NMF)

Rather than constraining our factors to be orthogonal, another idea would be to constrain them to be non-negative. Non-negative factors are often easier to interpret.
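To make the non-negativity constraint concrete, here is a small sketch of NMF using multiplicative update rules in plain NumPy. This is one common way to fit NMF, chosen here for illustration only; it is not necessarily the solver used in the course. The function name `nmf_multiplicative` and the toy matrix `V` are hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate non-negative V (m x n) as W (m x k) @ H (k x n).

    Uses multiplicative updates for the Frobenius-norm objective.
    Starting from non-negative W and H, each update multiplies by a
    non-negative ratio, so the factors stay non-negative throughout.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: two blocks of co-occurring "words".
V = np.array([
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 1.0],
    [0.0, 0.0, 1.0, 4.0],
])

W, H = nmf_multiplicative(V, k=2)
W_short, H_short = nmf_multiplicative(V, k=2, n_iter=5)

err_long = np.linalg.norm(V - W @ H)
err_short = np.linalg.norm(V - W_short @ H_short)
print(err_short, err_long)
```

Unlike SVD, the result depends on the random initialisation, and the factors are not unique; different seeds can give different but similarly good decompositions.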

Truncated SVD
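As a rough illustration of what truncation means, the sketch below keeps only the `k` largest singular values and their vectors, giving a rank-`k` approximation. The matrix `A` and the value of `k` are hypothetical. Note that this sketch computes the full SVD first and then discards factors, which is the simplest way to show the idea but not the efficient one; randomized methods aim to compute only the leading factors directly.

```python
import numpy as np

# Hypothetical toy matrix and target rank.
rng = np.random.default_rng(0)
A = rng.random((8, 5))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values and matching vectors.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The Frobenius-norm error of the truncation equals the root of the
# sum of squares of the discarded singular values.
err = np.linalg.norm(A - A_k)
expected = np.sqrt(np.sum(s[k:] ** 2))

print(err, expected)
```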

Tools

fbpca

4

Sparse Matrix Representation

Visualizing sparse matrix structure

Sparse matrix storage formats

coordinate-wise and compressed sparse row (CSR) examples
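The two storage formats can be sketched by hand on a small matrix. The example below builds the coordinate (COO) arrays and the CSR arrays using plain NumPy so the layout is visible; the matrix `dense` and the helper `csr_row` are hypothetical. In practice, `scipy.sparse` provides `coo_matrix` and `csr_matrix` classes for this.

```python
import numpy as np

# Hypothetical small matrix with mostly zero entries.
dense = np.array([
    [1, 0, 0, 2],
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [4, 0, 5, 6],
])

# Coordinate-wise (COO): one (row, col, value) triple per non-zero.
# np.nonzero returns indices in row-major order.
rows, cols = np.nonzero(dense)
vals = dense[rows, cols]

# Compressed sparse row (CSR): keep the column indices and values,
# but replace the per-entry row array with row pointers.
# Row i owns the slice indptr[i]:indptr[i + 1] of indices and data.
counts = np.bincount(rows, minlength=dense.shape[0])
indptr = np.concatenate(([0], np.cumsum(counts)))
indices = cols
data = vals

def csr_row(i):
    """Return the non-zeros of row i as a {column: value} dict."""
    start, end = indptr[i], indptr[i + 1]
    return dict(zip(indices[start:end].tolist(), data[start:end].tolist()))

print(list(zip(rows.tolist(), cols.tolist(), vals.tolist())))
print(indptr.tolist(), indices.tolist(), data.tolist())
print(csr_row(3))
```

COO stores three arrays of length equal to the number of non-zeros, while CSR stores two such arrays plus a pointer array of length (number of rows + 1), which makes slicing out a whole row cheap.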

vahbuna/fastai-nlp.md