@sai-teja-ponugoti
Created June 9, 2020 20:20
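import spacy
from nltk.tokenize import word_tokenize

# Load spaCy's small English model
# (install once with: python -m spacy download en_core_web_sm)
en_model = spacy.load('en_core_web_sm')
# Get the default stop word set of the spaCy English model (entries are lowercase)
stopwords = en_model.Defaults.stop_words

sample_text = "Oh man, this is pretty cool. We will do more such things."
# word_tokenize needs NLTK's Punkt tokenizer data,
# e.g. nltk.download('punkt') (newer NLTK releases use 'punkt_tab')
text_tokens = word_tokenize(sample_text)
# Lowercase each token for the lookup so capitalized stop words such as "We" are removed too
tokens_without_sw = [word for word in text_tokens if word.lower() not in stopwords]

print(text_tokens)
print(tokens_without_sw)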
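
Stop word sets such as spaCy's store their entries in lowercase, so a case-sensitive membership test lets capitalized tokens like "We" through. The sketch below illustrates the case-insensitive version of the filter using only the standard library. The `STOP_WORDS` set and the `remove_stop_words` helper are hypothetical stand-ins for illustration, not spaCy's list or NLTK's tokenizer, and the regex tokenizer simply drops punctuation.

```python
import re

# Hypothetical, tiny stop word set for illustration only;
# spaCy's real default set is much larger.
STOP_WORDS = {"this", "is", "we", "will", "do", "more", "such"}


def remove_stop_words(text, stop_words):
    """Return the word tokens of `text` whose lowercase form is not in `stop_words`."""
    # Simple regex tokenizer (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text)
    # Compare the lowercase form so "We" matches the stop word "we".
    return [token for token in tokens if token.lower() not in stop_words]


sample_text = "Oh man, this is pretty cool. We will do more such things."
filtered = remove_stop_words(sample_text, STOP_WORDS)
print(filtered)
```

The original casing of the kept tokens is preserved; only the lookup is lowercased.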