@dipanjanS
Created January 28, 2018 07:32
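import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# norm_corpus is assumed to be a list of normalized document strings defined earlier
# set ngram_range=(1, 2) to get unigrams as well as bigrams; (2, 2) yields bigrams only
bv = CountVectorizer(ngram_range=(2, 2))
bv_matrix = bv.fit_transform(norm_corpus)
bv_matrix = bv_matrix.toarray()  # densify the sparse matrix for display
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
vocab = bv.get_feature_names_out()
pd.DataFrame(bv_matrix, columns=vocab)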
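
To make concrete what a bigram count matrix contains, here is a minimal pure-Python sketch. It is not how `CountVectorizer` is implemented: the helper `bigram_counts` and the two-sentence `toy_corpus` are hypothetical, and it assumes simple lowercasing plus whitespace tokenization, whereas `CountVectorizer`'s default tokenizer also drops punctuation and single-character tokens.

```python
from collections import Counter


def bigram_counts(docs):
    """Return (vocab, matrix): sorted bigram vocabulary and per-document counts."""
    per_doc = []
    for doc in docs:
        tokens = doc.lower().split()  # assumption: whitespace tokenization
        # pair each token with its successor to form the document's bigrams
        per_doc.append(Counter(" ".join(pair) for pair in zip(tokens, tokens[1:])))
    # vocabulary is the sorted union of all bigrams seen in any document
    vocab = sorted(set().union(*per_doc))
    # one row per document, one column per bigram in the vocabulary
    matrix = [[counts[term] for term in vocab] for counts in per_doc]
    return vocab, matrix


# hypothetical toy corpus, used only for illustration
toy_corpus = [
    "the sky is blue",
    "the sky is blue and the sky is beautiful",
]
vocab, matrix = bigram_counts(toy_corpus)
print(vocab)
for row in matrix:
    print(row)
```

Each row corresponds to a document and each column to one bigram, which is the same layout as the `DataFrame` built from `bv_matrix` above.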