@language-engineering
Created September 26, 2012 11:08
# Assumes my_text (a token list) and my_freqdist (e.g. an nltk FreqDist) already exist.
def lexical_diversity(text):
    # Adding 0.0 ensures floating-point division, in case you haven't
    # executed: from __future__ import division
    return len(text) / (len(set(text)) + 0.0)

def hapax_count(freqdist):
    return len(freqdist.hapaxes())  # hapaxes: tokens that occur exactly once

def vocabulary_size(freqdist):
    return len(freqdist)  # number of distinct tokens

print("Lexical diversity: %s" % lexical_diversity(my_text))
print("Number of hapaxes: %s" % hapax_count(my_freqdist))
print("Vocabulary size : %s" % vocabulary_size(my_freqdist))
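The snippet above relies on `my_text` and `my_freqdist` being defined elsewhere. As a rough, self-contained sketch of how the three measures behave, the example below uses a made-up token list and a hypothetical `TinyFreqDist` class (a `collections.Counter` subclass with a `hapaxes()` method) as a stand-in for whatever frequency distribution the original code was run with. Neither the sample sentence nor `TinyFreqDist` comes from the gist.

```python
from collections import Counter


class TinyFreqDist(Counter):
    """Hypothetical stand-in for a frequency distribution with hapaxes()."""

    def hapaxes(self):
        # Tokens that occur exactly once in the counted text.
        return [token for token, count in self.items() if count == 1]


def lexical_diversity(text):
    # Tokens per distinct token; float() forces true division on Python 2.
    return len(text) / float(len(set(text)))


def hapax_count(freqdist):
    return len(freqdist.hapaxes())


def vocabulary_size(freqdist):
    return len(freqdist)


# Made-up sample data, for illustration only.
my_text = "the cat sat on the mat and the dog sat too".split()
my_freqdist = TinyFreqDist(my_text)

print("Lexical diversity: %s" % lexical_diversity(my_text))      # 1.375
print("Number of hapaxes: %s" % hapax_count(my_freqdist))        # 6
print("Vocabulary size : %s" % vocabulary_size(my_freqdist))     # 8
```

The sample has 11 tokens and 8 distinct tokens, so the diversity figure is 11 / 8; "the" and "sat" repeat, which leaves 6 hapaxes.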