Created
February 28, 2019 12:26
Sentiment Analysis with Scikit-Learn
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
text = [
    'Promptly answer to my request.',
    'great turnaround time!',
    'Fast response and great service!',
    'As usual you took the request and completed in as few steps as possible and well done!',
    'Very quick turnaround time and great communication. Thank you!',
    'Really quick turnaround. Thanks!!',
    'reduce the turn around time and increase/upgrade the interaction ability between the client and the pronto team during the building of the website.',
    'Better turn around time for changes and updates. Better copywriting talents for web content rewrite requests.',
    'Get back to us faster',
    'Be more careful when fixing our site to not break the other coding on other pages',
    'All of the information for this listing is already on the website that Pronto built. It is disappointing that this has taken so many tries to get right.',
    'slow communication regarding SEO questions, emailing SEO team direct has been slower that going through my account rep- vague and incomplete answers wasting time',
    'The time difference and language barrier are my main complaints. Otherwise our website looks great.',
    'Process is way too slow. Working with another country seems to really slow things down alot.',
]
label = [
    'positive',
    'positive',
    'positive',
    'positive',
    'positive',
    'positive',
    'neutral',
    'neutral',
    'neutral',
    'neutral',
    'negative',
    'negative',
    'negative',
    'negative',
]
df = pd.DataFrame(data={
    'text': text,
    'label': label
})
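# bag-of-words counts, ignoring common English stop words
vectorizer = CountVectorizer(stop_words='english')
X_train = df.text
y_train = df.label
# learn the vocabulary from the training text and build the document-term matrix
X_train_vectorized = vectorizer.fit_transform(X_train)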
# import
from sklearn import naive_bayes
# instantiate
clf = naive_bayes.MultinomialNB()
# fit
clf.fit(X_train_vectorized, y_train)
# unseen examples; intended labels, in order: positive, neutral, negative
X_test = [
    'Fast service! Thank you',
    'wish it could have been resolved sooner as it was simple',
    'Cannot get my website where I want because repeated requests are misunderstood, requests are simply not honored.',
]
# reuse the fitted vocabulary; words not seen during training are ignored
X_test_vectorized = vectorizer.transform(X_test)
# predict
print(clf.predict(X_test_vectorized))
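The script above only shows the predicted label. As a minimal sketch of how one might also inspect the per-class probabilities, the block below repeats the same vectorize, fit, predict steps on a small subset of the comments. The shortened training list and the names `train_text`, `train_label`, `test_text`, and `probabilities` are assumptions made for illustration, not part of the original gist.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Illustrative subset of the training comments (assumption: shortened for brevity).
train_text = [
    'great turnaround time!',
    'Fast response and great service!',
    'Get back to us faster',
    'Better turn around time for changes and updates.',
    'Process is way too slow.',
    'slow communication regarding SEO questions',
]
train_label = ['positive', 'positive', 'neutral', 'neutral', 'negative', 'negative']

# Same steps as the script: vectorize, instantiate, fit.
vectorizer = CountVectorizer(stop_words='english')
X_train_vectorized = vectorizer.fit_transform(train_text)
clf = MultinomialNB()
clf.fit(X_train_vectorized, train_label)

# Transform new text with the already-fitted vocabulary, then predict.
test_text = ['Fast service! Thank you', 'Process is slow']
X_test_vectorized = vectorizer.transform(test_text)
predictions = clf.predict(X_test_vectorized)
probabilities = clf.predict_proba(X_test_vectorized)

# clf.classes_ gives the column order of predict_proba (sorted label names).
for sentence, predicted, row in zip(test_text, predictions, probabilities):
    scores = {name: round(float(p), 3) for name, p in zip(clf.classes_, row)}
    print(f'{sentence!r} -> {predicted} {scores}')
```

With so few training examples the probabilities are only a rough indication; they mainly help spot inputs where the classifier is close to undecided.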