#A Collection of NLP notes
##N-grams
###Calculating unigram probabilities:
P( wi ) = count ( wi ) / count ( total number of words )
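In English: the probability of word wi is the number of times wi appears in the corpus divided by the total number of words in the corpus.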
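
The unigram formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `unigram_probabilities`, the toy corpus, and whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter


def unigram_probabilities(tokens):
    """Return a dict mapping each word to count(word) / total number of words."""
    counts = Counter(tokens)          # count(wi) for every distinct word
    total = sum(counts.values())      # total number of words in the corpus
    return {word: count / total for word, count in counts.items()}


# Hypothetical toy corpus, split on whitespace (an assumption for illustration).
corpus = "the cat sat on the mat".split()
probs = unigram_probabilities(corpus)

# "the" occurs 2 times out of 6 tokens, so P(the) = 2 / 6.
print(probs["the"])
```

Because every token is counted exactly once, the probabilities over the whole vocabulary sum to 1. A real corpus would usually need lowercasing and proper tokenization before counting.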