@denten
Created May 5, 2014 21:50
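import nltk

# Requires NLTK 2.x: simplify_tags was removed in NLTK 3.
wsj = nltk.corpus.treebank.tagged_words(simplify_tags=True)
# Index words by tag, then collect every word seen with the 'VN' tag.
cdf = nltk.ConditionalFreqDist((tag, word) for (word, tag) in wsj)
wordlist = set(cdf['VN'])  # a set gives fast membership tests
for ndx, (word, tag) in enumerate(wsj):
    if word in wordlist and tag == 'IN':
        # Show the preceding token, the match, and its position.
        print wsj[max(ndx - 1, 0):ndx + 1], ndx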