@AnasAlmasri
Created February 13, 2019 01:15
building the vocabulary
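import nltk

def buildVocabulary(preprocessedTrainingData):
    # Collect every token from all (words, sentiment) pairs into one flat list
    all_words = []
    for (words, sentiment) in preprocessedTrainingData:
        all_words.extend(words)
    # FreqDist counts occurrences; its keys are the distinct words
    wordlist = nltk.FreqDist(all_words)
    word_features = wordlist.keys()
    return word_features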
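
A minimal usage sketch of the same idea, offered as an illustration rather than as part of the gist. It assumes the training data is a list of (token list, label) pairs, as the loop in buildVocabulary suggests. To keep it runnable without NLTK installed, it swaps in `collections.Counter` (the class `nltk.FreqDist` derives from) in place of `FreqDist`. The function name `build_vocabulary_sketch` and the sample data are hypothetical.

```python
from collections import Counter

def build_vocabulary_sketch(preprocessed_training_data):
    # Hypothetical stand-in for buildVocabulary: Counter replaces nltk.FreqDist here
    all_words = []
    for words, _sentiment in preprocessed_training_data:
        all_words.extend(words)
    # Counter keeps keys in first-seen order, so the result is the distinct words
    return list(Counter(all_words).keys())

# Made-up sample data: (token list, label) pairs
sample_data = [
    (["good", "phone", "love"], "positive"),
    (["bad", "phone", "slow"], "negative"),
]

vocabulary = build_vocabulary_sketch(sample_data)
print(vocabulary)  # ['good', 'phone', 'love', 'bad', 'slow']
```

Each distinct word appears once in the result, even though "phone" occurs in both samples.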