@Steboss
Created March 9, 2018 11:38
nltk Bayes Classifier with real comments
#Training part
#NaiveBayesClassifier comes from nltk; pos and neg are assumed to be defined earlier
from nltk.classify import NaiveBayesClassifier
training = pos[:len(pos)//2] + neg[:len(neg)//2]
#the slicing just selects the first half of each
#data set for training (above)
#and the second half of each for testing (below)
test = pos[len(pos)//2:] + neg[len(neg)//2:]
counter = 0
test_dataset = []
with open("basic_positive.csv","r") as reader:
    for line in reader:
        test_dataset.append(line)
        counter+=1
        if counter==25:
            break
print(len(test_dataset))
counter = 0
with open("basic_negative.csv","r") as reader:
    for line in reader:
        test_dataset.append(line)
        counter+=1
        if counter==25:
            break
print(len(test_dataset))
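#test_dataset now holds the first 25 lines from each file
#for now, stick to the basic Naive Bayes classifier
classifier = NaiveBayesClassifier.train(training)
print("Most informative features...")
#show_most_informative_features() prints its table itself and returns None,
#so it is not wrapped in print()
classifier.show_most_informative_features()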
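
The half-and-half split and the 25-line reads above can also be written as two small helpers. This is a minimal sketch, not the gist author's code: `split_half` and `read_first_lines` are hypothetical names, the toy `pos`/`neg` lists only stand in for the real labelled feature sets (which the gist does not show), and `io.StringIO` replaces the CSV files so the sketch runs without them. Unlike the loop above, the reader helper strips the trailing newline from each line.

```python
import io
from itertools import islice


def split_half(items):
    """Return (first_half, second_half) of a list, split at len(items) // 2."""
    middle = len(items) // 2
    return items[:middle], items[middle:]


def read_first_lines(handle, limit=25):
    """Return at most `limit` lines from an open text file, newlines stripped."""
    return [line.rstrip("\n") for line in islice(handle, limit)]


# Toy stand-ins for the labelled feature sets (assumed shape: (features, label)).
pos = [({"word": w}, "positive") for w in ("good", "great", "nice", "fine")]
neg = [({"word": w}, "negative") for w in ("bad", "awful", "poor", "dull")]

# First half of each class for training, second half for testing.
pos_train, pos_test = split_half(pos)
neg_train, neg_test = split_half(neg)
training = pos_train + neg_train
test = pos_test + neg_test

# io.StringIO stands in for open("basic_positive.csv", "r").
sample = io.StringIO("first comment\nsecond comment\nthird comment\n")
first_two = read_first_lines(sample, limit=2)
```

With real files, `read_first_lines(reader)` inside a `with open(...) as reader:` block would replace the manual `counter` bookkeeping.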