@ereli
Last active January 16, 2018 01:00
Python script that removes English stopwords from text and prints the remaining words, one per line
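#!/usr/bin/env python3
import fileinput
from sys import stdout

from nltk.corpus import stopwords

# One-time setup: uncomment to download the stopword corpus.
# import nltk
# nltk.download('stopwords')

# A set gives fast membership tests; NLTK's English stopwords are all lowercase.
stopword = set(stopwords.words('english'))

# Read from the files named on the command line, or from stdin if none are given.
for line in fileinput.input():
    # Compare in lowercase so capitalized stopwords such as "The" are dropped too.
    inputtext = [w for w in line.split() if w.lower() not in stopword]
    for w in inputtext:
        stdout.write(w + '\n')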
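
The filtering step in the script above can also be sketched as a standalone function, which makes it easier to test without downloading the NLTK corpus. This is only an illustration: `remove_stopwords` and `SAMPLE_STOPWORDS` are hypothetical names, the tiny stopword set is a stand-in for `stopwords.words('english')`, and stripping surrounding punctuation is an extra step that the script above does not perform.

```python
import string

# Hypothetical stand-in for stopwords.words('english'); not the real NLTK list.
SAMPLE_STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def remove_stopwords(text, stopword_set=SAMPLE_STOPWORDS):
    """Return the words of `text` that are not in `stopword_set`."""
    kept = []
    for raw in text.split():
        # Strip leading/trailing punctuation so "garden." is treated as "garden".
        word = raw.strip(string.punctuation)
        # Skip tokens that were pure punctuation, then compare in lowercase.
        if word and word.lower() not in stopword_set:
            kept.append(word)
    return kept


if __name__ == "__main__":
    for w in remove_stopwords("The quick brown fox is in the garden."):
        print(w)
```

To use the real list, one could pass `set(stopwords.words('english'))` as the second argument, assuming the corpus has already been downloaded.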