@language-engineering
Created September 17, 2012 09:50
from sussex_nltk.corpus_readers import WSJCorpusReader  # import the corpus reader

wsjcr = WSJCorpusReader()  # create a new WSJ corpus reader

# Get a sample of tokens from the corpus; replace 12345 with your own
# 5-digit candidate number.
tokens = wsjcr.sample_words(12345)

for token in tokens:  # iterate over the tokens
    print(token)  # print each token