@neilkod
Created October 21, 2010 13:20
concordance for a few terms in zoolander. just for fun
#!/usr/bin/env python3
import string

import nltk

# read the whole file into one string; the with block closes the handle
with open('zoolander.txt', 'r') as infile:
    f = infile.read()

# favorite way to strip punctuation, found on stackoverflow
# http://stackoverflow.com/questions/265960
# it helped the concordance results a little bit, but this might
# not be the best approach.
# concordance is sensitive to punctuation in tokens, and i don't want
# punctuation in my sample output.
# i'd like feedback on this
# (python 3 spelling: the third argument to str.maketrans lists the
# characters to delete)
f = f.translate(str.maketrans('', '', string.punctuation))

# create an nltk text object from the whitespace-separated tokens
foo = nltk.Text(f.split())

# terms to run concordance against. tried a few funny
# terms from the movie
terms = ['mugatu', 'hot', 'good', 'model',
         'stupid', 'freak', 'coal', 'work', 'derek',
         'hansel', 'read', 'kill', 'underwear']

# loop through the terms, print header row, the output, and
# a blank line
for term in terms:
    print("concordance for %s....." % term)
    foo.concordance(term)
    print()
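
Since the comments ask for feedback on the punctuation step, here is one possible alternative, offered as a rough sketch rather than the right answer. Deleting every character in string.punctuation also removes apostrophes, so "don't" becomes "dont". Pulling word tokens out with a regular expression avoids that. The names WORD_RE and tokenize and the sample string are made up for illustration and are not part of the gist, and the pattern assumes plain ASCII text with straight apostrophes (Python 3).

```python
import re

# Hypothetical helper, not from the original gist: match runs of ASCII
# letters/digits, allowing apostrophes only between such runs.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def tokenize(text):
    """Return word tokens, dropping punctuation but keeping inner apostrophes."""
    return WORD_RE.findall(text)


# Invented sample string, only to show the behavior.
sample = "Derek, what's this? A model... really-really good!"
print(tokenize(sample))
# ['Derek', "what's", 'this', 'A', 'model', 'really', 'really', 'good']
```

The resulting list could be passed to nltk.Text in place of f.split(), which would make the translate step unnecessary. Note that hyphenated words are split in two here, which may or may not be what you want for concordance output.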