@language-engineering
Created October 3, 2012 10:54
from numpy import average
from sussex_nltk.corpus_readers import ReutersCorpusReader
rcr = ReutersCorpusReader()
sample_size = 1000 #The number of sentences in a sample
#Randomly sample 1000 sentences, and build a list of the lengths of each sentence
sentence_lengths = [len(sentence) for sentence in rcr.sample_sents(sample_size)]
#Calculate and print the average sentence length
print("Average sentence length: %s" % average(sentence_lengths))
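
The same calculation can be tried without the `sussex_nltk` package. The sketch below is only an illustration: the `sentences` list and the `average_sentence_length` helper are hypothetical stand-ins, not part of `sussex_nltk`. It assumes each sentence is a list of tokens and that sampling is done without replacement, which may differ from how `sample_sents` behaves.

```python
import random

# Hypothetical stand-in corpus: each sentence is a list of tokens.
sentences = [
    ["Shares", "rose", "sharply", "."],
    ["The", "bank", "cut", "rates", "on", "Tuesday", "."],
    ["Profits", "fell", "."],
    ["Oil", "prices", "were", "steady", "."],
]


def average_sentence_length(sents, sample_size, seed=None):
    """Mean token count over a random sample of a non-empty list of sentences."""
    rng = random.Random(seed)
    # Sample without replacement, capped at the number of sentences available.
    sample = rng.sample(sents, min(sample_size, len(sents)))
    lengths = [len(sentence) for sentence in sample]
    return sum(lengths) / float(len(lengths))


print("Average sentence length: %s" % average_sentence_length(sentences, 4))
```

With a sample size equal to the corpus size, every sentence is included, so the result is the plain mean of the token counts. A smaller sample gives an estimate that varies from run to run unless `seed` is fixed.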