@rohithteja
Created August 23, 2021 13:21
Sentiment Analysis - Vectorization
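from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Vectorization: bag-of-words counts; lowercase=False keeps the original casing.
# df is assumed to be a pandas DataFrame loaded earlier, with a "text" column.
cv = CountVectorizer(lowercase=False)
text_vector = cv.fit_transform(df.text.values)
x = text_vector
y = df.iloc[:, -1].values  # labels are read from the last column of df

# Train/validation/test split: 80% train, then the held-out 20% is halved
# into 10% validation and 10% test, stratified by label at each step.
x_train, xtest, y_train, ytest = train_test_split(x, y, stratify=y,
                                                  test_size=0.20, random_state=42)
x_val, x_test, y_val, y_test = train_test_split(xtest, ytest, stratify=ytest,
                                                test_size=0.5, random_state=42)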
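
The snippet above fits the vectorizer on every row before splitting, so the vocabulary also reflects words that appear only in the validation and test rows. The following is a minimal sketch of one possible variation, not the gist's method: it splits the raw text first and fits `CountVectorizer` on the training portion only. The toy `texts` and `labels` lists are hypothetical stand-ins for `df.text` and the label column, and the names `t_train`, `t_hold`, `t_val`, `t_test`, and `y_hold` are introduced here for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical toy corpus standing in for df.text and the label column.
texts = [f"good movie number {i}" for i in range(10)] + \
        [f"Bad movie number {i}" for i in range(10)]
labels = [1] * 10 + [0] * 10

# Split the raw text first: 80% train, 20% held out, stratified by label.
t_train, t_hold, y_train, y_hold = train_test_split(
    texts, labels, stratify=labels, test_size=0.20, random_state=42)

# Halve the held-out part into validation and test (10% each overall).
t_val, t_test, y_val, y_test = train_test_split(
    t_hold, y_hold, stratify=y_hold, test_size=0.5, random_state=42)

# Fit the vocabulary on training text only, then reuse it for the other splits.
cv = CountVectorizer(lowercase=False)
x_train = cv.fit_transform(t_train)
x_val = cv.transform(t_val)
x_test = cv.transform(t_test)

# Row counts per split; all three matrices share the same number of columns.
print(x_train.shape, x_val.shape, x_test.shape)
```

Words that occur only in the validation or test text are ignored by `transform`, which is usually closer to how the model would see unseen data.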