@andrea-dagostino
Last active October 3, 2022 20:58
text_sim_tfidf
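import string
import nltk  # word_tokenize needs NLTK's "punkt" tokenizer data to be installed
# Map every ASCII punctuation code point to None so str.translate() deletes it.
remove_punctuation_map = dict((ord(char), None) for char in string.punctuation)
def preprocess(text):
    return nltk.word_tokenize(text.lower().translate(remove_punctuation_map))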
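To see what the punctuation map does on its own, here is a minimal sketch that uses only the standard library. The helper name `preprocess_simple` is hypothetical, and `str.split()` stands in for `nltk.word_tokenize` as an assumption, so the tokens may differ from what NLTK would return on more complex input.

```python
import string

# Same table as in the snippet: each punctuation code point maps to None,
# which tells str.translate() to drop that character.
remove_punctuation_map = {ord(char): None for char in string.punctuation}


def preprocess_simple(text):
    # Hypothetical stand-in: whitespace splitting replaces nltk.word_tokenize.
    return text.lower().translate(remove_punctuation_map).split()


tokens = preprocess_simple("Hello, World! It's a test.")
print(tokens)  # ['hello', 'world', 'its', 'a', 'test']
```

Note that deleting the apostrophe turns "It's" into "its" rather than splitting it into two tokens, which may or may not be what a downstream TF-IDF step wants.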