@espeed
Created July 27, 2015 03:24
Python NLTK Trainer sklearn.MultinomialNB example
$ python
Python 2.7.10 (default, Jul 5 2015, 14:15:43)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.__version__
'0.14.1'
>>> import numpy
>>> numpy.__version__
'1.9.2'
>>> import sklearn
>>> sklearn.__version__
'0.16.1'
>>> import nltk
>>> nltk.__version__
'3.0.4'
>>> import argparse
>>> argparse.__version__
'1.1'
$ python train_classifier.py --instances files --fraction 0.75 --no-pickle --min_score 2 --ngrams 1 2 3 --show-most-informative 10 movie_reviews --classifier sklearn.MultinomialNB
loading movie_reviews
2 labels: [u'neg', u'pos']
calculating word scores
using bag of words from known set feature extraction
71903 words meet min_score and/or max_feats
1500 training feats, 500 testing feats
training sklearn.MultinomialNB with {'alpha': 1.0}
using dtype bool
training sklearn.MultinomialNB classifier
accuracy: 0.788000
neg precision: 0.918605
neg recall: 0.632000
neg f-measure: 0.748815
pos precision: 0.719512
pos recall: 0.944000
pos f-measure: 0.816609
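
The metrics above can be cross-checked against each other. The sketch below recomputes each f-measure as the harmonic mean of the reported precision and recall, then rebuilds the confusion counts. It assumes the 500 test instances split evenly into 250 neg and 250 pos files; the log does not state this, though it is consistent with `--fraction 0.75` on a balanced corpus. The names `f_measure`, `reported`, and `n_per_class` are illustrative and are not part of nltk-trainer.

```python
# Sanity check of the reported metrics (a sketch, not nltk-trainer code).

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

# Values copied from the training output above.
reported = {
    "neg": {"precision": 0.918605, "recall": 0.632000, "f": 0.748815},
    "pos": {"precision": 0.719512, "recall": 0.944000, "f": 0.816609},
}

for label, m in sorted(reported.items()):
    f = f_measure(m["precision"], m["recall"])
    print("%s f-measure: %.6f (reported %.6f)" % (label, f, m["f"]))

# ASSUMPTION: 250 test files per label (500 testing feats, balanced split).
n_per_class = 250
neg_correct = int(round(reported["neg"]["recall"] * n_per_class))
pos_correct = int(round(reported["pos"]["recall"] * n_per_class))
neg_as_pos = n_per_class - neg_correct  # neg files labelled pos
pos_as_neg = n_per_class - pos_correct  # pos files labelled neg

accuracy = (neg_correct + pos_correct) / float(2 * n_per_class)
neg_precision = neg_correct / float(neg_correct + pos_as_neg)
pos_precision = pos_correct / float(pos_correct + neg_as_pos)

print("accuracy: %.6f" % accuracy)
print("neg precision: %.6f" % neg_precision)
print("pos precision: %.6f" % pos_precision)
```

Under that assumption the rebuilt counts reproduce the reported accuracy and both precision values to six decimals, which suggests the classifier leans toward predicting pos: many neg reviews are labelled pos, while few pos reviews are labelled neg.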