# Fast sparse, exact KNN Manhattan Distance

Given a 1xN (row) vector and a MxN matrix, the code below calculates the
manhattan distance between the row vector and each row in the matrix. It uses
Numba for significant speadup. 

Currenlty I'm using this to calculate distances between a 1 x 60K  row vector
and a 64k x 60x matrix (with about 500k non-empty values) in ~ 10ms on a single processor


## Example usage: K nearest neightbors

```py
import numpy as np
import sparse_manhattan_distance from sparse_manhattan_distance

def sparse_manhattan_KNN(mat,vec,N,sorted=True):
    """
    @param mat: A sparse M x N matrix 
    @param vec: A sparse N vector to be compared to each row of `mat`
    @param N: The number of nearest neighbors to 
    @param sorted: If `True`, sort the row indexes before returing 

    @return a vector of row indices of the nearest N rows in `mat` to `vec`
    """

    # cant return more observations than data...
    N = min(N, mat.data.shape[0])  

    distances = sparse_manhattan_distance(mat, vec)

    # note that argpartition and then sorting the top N indexes is *much*
    # faster than sorting the full distance vector and then choosing the top N
    nearest_N_rows = np.argpartition(distances, -N)[-N:]
    if sorted:
        return nearest_N_rows[np.argsort(-distances[nearest_N_rows])]

    return  nearest_N_rows
```