@jonathanoheix
Created December 18, 2018 09:53
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# feature selection: keep every column except the label and the raw/cleaned review text
# (reviews_df is assumed to be the DataFrame built in the earlier steps)
label = "is_bad_review"
ignore_cols = [label, "review", "review_clean"]
features = [c for c in reviews_df.columns if c not in ignore_cols]

# split the data into train (80%) and test (20%); random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    reviews_df[features], reviews_df[label], test_size=0.20, random_state=42
)
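
To see what the two steps above do without the full dataset, here is a minimal stand-alone sketch using only the standard library. The column names other than `is_bad_review`, `review`, and `review_clean` are hypothetical, and `shuffled_split` is an illustrative helper that mimics the idea of a seeded, shuffled hold-out split; it is not scikit-learn's implementation and will not reproduce the same row assignment as `train_test_split`.

```python
import math
import random

# Hypothetical column names standing in for reviews_df.columns.
columns = ["review", "review_clean", "is_bad_review", "nb_chars", "nb_words", "compound"]

# Same filtering logic as the snippet: drop the label and the text columns.
label = "is_bad_review"
ignore_cols = [label, "review", "review_clean"]
features = [c for c in columns if c not in ignore_cols]


def shuffled_split(n_rows, test_size=0.20, seed=42):
    """Return (train_idx, test_idx) for a seeded, shuffled hold-out split.

    Illustrative helper only: it shows the idea behind a random
    train/test split, not how scikit-learn implements it.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)      # fixed seed => same shuffle every run
    n_test = math.ceil(n_rows * test_size)    # number of rows held out for testing
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = shuffled_split(n_rows=10)
print(features)
print(len(train_idx), len(test_idx))
```

Every row index lands in exactly one of the two sets, and re-running with the same seed gives the same partition, which is the property `random_state = 42` is there to provide in the original call.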