@jonathanoheix
Created December 18, 2018 09:50
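# add tf-idf columns
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# keep only terms that appear in at least 10 reviews
tfidf = TfidfVectorizer(min_df = 10)
tfidf_result = tfidf.fit_transform(reviews_df["review_clean"]).toarray()
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
tfidf_df = pd.DataFrame(tfidf_result, columns = tfidf.get_feature_names_out())
# prefix each term so the tf-idf columns are easy to tell apart from the existing ones
tfidf_df.columns = ["word_" + str(x) for x in tfidf_df.columns]
# reuse the original index so concat lines the rows up one-to-one
tfidf_df.index = reviews_df.index
reviews_df = pd.concat([reviews_df, tfidf_df], axis=1)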