How to vectorize sentences using pandas and sklearn's CountVectorizer
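import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Each string in the corpus becomes one row of the document-term matrix.
corpus = [
    'This is a sentence',
    'Another sentence is here',
    'Wait for another sentence',
    'The sentence is coming',
    'The sentence has come',
]

# Learn the vocabulary and count term occurrences; the result is a SciPy sparse matrix.
vectorizer = CountVectorizer()
x = vectorizer.fit_transform(corpus)

# Densify with toarray() and label the columns with the learned terms.
# get_feature_names_out() requires scikit-learn 1.0 or later.
print(pd.DataFrame(x.toarray(), columns=vectorizer.get_feature_names_out()).to_string())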
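
To make the counting step less of a black box, here is a rough pure-Python sketch of what the vectorizer computes for this corpus. It assumes CountVectorizer's default settings (lowercasing, tokens of two or more word characters, alphabetically sorted vocabulary) and is an illustration only, not the library's implementation. The names `TOKEN`, `tokenize`, `vocabulary`, and `matrix` are made up for this example.

```python
import re
from collections import Counter

corpus = [
    'This is a sentence',
    'Another sentence is here',
    'Wait for another sentence',
    'The sentence is coming',
    'The sentence has come',
]

# Assumption: approximate the default tokenizer, which keeps tokens of
# two or more word characters (so the single-letter "a" is dropped).
TOKEN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return TOKEN.findall(text.lower())


# Vocabulary: every distinct token across the corpus, sorted alphabetically.
vocabulary = sorted({tok for doc in corpus for tok in tokenize(doc)})

# One row per sentence, one column per vocabulary term, each cell a count.
matrix = []
for doc in corpus:
    counts = Counter(tokenize(doc))
    matrix.append([counts[term] for term in vocabulary])

print(vocabulary)
for row in matrix:
    print(row)
```

Every row has a 1 in the `sentence` column because that word appears once in each sentence, while `a` gets no column at all under the assumed two-character minimum.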