@bkj
Last active December 6, 2017 21:39
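#!/usr/bin/env python
"""
bow2adjlist.py
Convert a sparse matrix (e.g. a term-document matrix) to two compact,
zero-padded matrices: one of 1-based column indices, one of the matching values.
Useful for feeding into BOW models.
"""
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def bow2adjlist(X, maxcols=None):
    x = coo_matrix(X)
    # Number of stored entries in each non-empty row.
    _, counts = np.unique(x.row, return_counts=True)
    # Position of each entry within its row: 0, 1, ..., count - 1.
    # This assumes x.row is grouped by row, as it is for CSR or dense input.
    pos = np.hstack([np.arange(c) for c in counts])
    # Column indices are shifted by 1 so that 0 can serve as padding.
    adjlist = csr_matrix((x.col + 1, (x.row, pos)))
    datlist = csr_matrix((x.data, (x.row, pos)))
    if maxcols is not None:
        # Keep at most maxcols entries per row.
        adjlist, datlist = adjlist[:, :maxcols], datlist[:, :maxcols]
    return adjlist, datlist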
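
A minimal usage sketch, assuming a small made-up count matrix (the values in `X` are hypothetical and not from the original gist). The function is repeated here, with the `csr_matrix` import included, only so the example runs on its own. It shows how each row's nonzero entries are packed to the left, with column indices stored 1-based so that 0 marks padding.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def bow2adjlist(X, maxcols=None):
    # Copy of bow2adjlist, repeated so this sketch is self-contained.
    x = coo_matrix(X)
    _, counts = np.unique(x.row, return_counts=True)
    pos = np.hstack([np.arange(c) for c in counts])
    adjlist = csr_matrix((x.col + 1, (x.row, pos)))
    datlist = csr_matrix((x.data, (x.row, pos)))
    if maxcols is not None:
        adjlist, datlist = adjlist[:, :maxcols], datlist[:, :maxcols]
    return adjlist, datlist


# Hypothetical 3-document x 4-term count matrix, made up for illustration.
X = np.array([
    [2, 0, 1, 0],
    [0, 3, 0, 0],
    [1, 1, 0, 4],
])

adjlist, datlist = bow2adjlist(X)
adj = adjlist.toarray()  # 1-based term indices, 0 = padding
dat = datlist.toarray()  # counts aligned with adj

print(adj)
# [[1 3 0]
#  [2 0 0]
#  [1 2 4]]
print(dat)
# [[2 1 0]
#  [3 0 0]
#  [1 1 4]]

# Truncate every row to at most two entries.
adj2, dat2 = bow2adjlist(X, maxcols=2)
adj2, dat2 = adj2.toarray(), dat2.toarray()
```

Both results come back as `csr_matrix` objects, so `.toarray()` is needed before passing them to a model that expects dense input. With `maxcols=2`, the third row loses its last entry (term 4, count 4).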