@erap129
Created February 19, 2022 12:25
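import pandas as pd
from nltk.stem.snowball import SnowballStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
# Stem each word of every plot; movies_df is assumed to already exist with a 'movie_plot' text column.
stemmer = SnowballStemmer('english')
movies_df['movie_plot'] = movies_df['movie_plot'].apply(
    lambda plot: ' '.join(stemmer.stem(word) for word in plot.split())
)
# Fit TF-IDF on the stemmed plots, dropping English stop words.
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(movies_df['movie_plot'])
# Dense DataFrame: one row per movie, one column per term.
tfidf_df = pd.DataFrame(
    tfidf_matrix.toarray(),
    columns=tfidf.get_feature_names_out(),  # get_feature_names() was removed in scikit-learn 1.2
)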
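
To make the numbers in `tfidf_df` easier to interpret, here is a minimal pure-Python sketch of the weighting that `TfidfVectorizer` applies with its default settings: raw term counts, a smoothed IDF of `ln((1 + n) / (1 + df)) + 1`, then L2 normalisation per document. The function name `tfidf_rows` and the sample plots are hypothetical, and the sketch splits on whitespace only, so it does not reproduce scikit-learn's tokenizer or its stop-word filtering.

```python
import math
from collections import Counter


def tfidf_rows(docs):
    """Toy TF-IDF: term count * smoothed IDF, L2-normalised per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, matching the formula scikit-learn uses by default.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        weights = [counts[tok] * idf[tok] for tok in vocab]
        # Scale each document vector to unit length (guard against empty docs).
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        rows.append([w / norm for w in weights])
    return vocab, rows


# Hypothetical, already-stemmed plots used only for illustration.
plots = ["robot love girl", "girl fight robot army"]
vocab, rows = tfidf_rows(plots)
```

In this sketch each row has unit length, and a term that occurs in fewer documents ("love") receives a larger weight than one that occurs in every document ("girl").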