Created January 28, 2018 07:32
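# ngram_range=(2,2) extracts bigrams only;
# set ngram_range=(1,2) to get unigrams as well as bigrams
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# norm_corpus is assumed to be a list of normalized document strings defined elsewhere
bv = CountVectorizer(ngram_range=(2,2))
bv_matrix = bv.fit_transform(norm_corpus)
bv_matrix = bv_matrix.toarray()
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
vocab = bv.get_feature_names_out()
pd.DataFrame(bv_matrix, columns=vocab)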
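
To make the contents of the bigram matrix concrete, here is a minimal standard-library sketch of the same counting idea. It is not how `CountVectorizer` is implemented: the helper `bigram_counts` and the two-document `toy_corpus` are hypothetical, and tokenization is simplified to lowercasing and splitting on whitespace, which differs from the vectorizer's default token pattern.

```python
from collections import Counter

def bigram_counts(docs):
    """Return (vocab, matrix) of bigram counts, one row per document."""
    # Simplified tokenization (an assumption): lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # Join each pair of adjacent tokens into a single bigram string.
    doc_bigrams = [
        [" ".join(pair) for pair in zip(tokens, tokens[1:])]
        for tokens in tokenized
    ]
    # Sort the vocabulary so the column order is deterministic.
    vocab = sorted({bg for bigrams in doc_bigrams for bg in bigrams})
    matrix = []
    for bigrams in doc_bigrams:
        counts = Counter(bigrams)
        # Counter returns 0 for bigrams absent from this document.
        matrix.append([counts[bg] for bg in vocab])
    return vocab, matrix

# Hypothetical toy corpus, used only for illustration.
toy_corpus = ["sky blue beautiful", "love blue beautiful sky"]
vocab, matrix = bigram_counts(toy_corpus)
print(vocab)
print(matrix)
```

Each row corresponds to one document and each column to one bigram in `vocab`, which is the same layout the `pd.DataFrame(bv_matrix, columns=vocab)` call displays.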