@karamanbk
Last active March 31, 2020 01:23
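import pandas as pd
from sklearn.model_selection import train_test_split

# tx_cluster is assumed to be the customer-level DataFrame built in an earlier step
# convert categorical columns to numerical (one-hot encoding)
tx_class = pd.get_dummies(tx_cluster)

# calculate and show correlations with the LTV cluster label
corr_matrix = tx_class.corr()
print(corr_matrix['LTVCluster'].sort_values(ascending=False))

# create X and y: X is the feature set, y is the label (LTVCluster)
X = tx_class.drop(['LTVCluster', 'm6_Revenue'], axis=1)
y = tx_class['LTVCluster']

# split into training and test sets (5% held out for testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05, random_state=56)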
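
The snippet above depends on a `tx_cluster` DataFrame that is not defined in this gist. To try the same steps in isolation, here is a minimal sketch on synthetic data. The `demo` frame, its column values, and the quantile-based `LTVCluster` label are all hypothetical stand-ins, not the author's data or clustering method.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for tx_cluster (the real frame is built elsewhere).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Recency": rng.integers(0, 365, size=n),
    "Frequency": rng.integers(1, 50, size=n),
    "Segment": rng.choice(["Low-Value", "Mid-Value", "High-Value"], size=n),
    "m6_Revenue": rng.gamma(2.0, 100.0, size=n),
})
# Hypothetical label: three quantile bins of revenue standing in for LTVCluster.
demo["LTVCluster"] = pd.qcut(demo["m6_Revenue"], q=3, labels=False)

# One-hot encode the categorical column; numeric columns pass through unchanged.
demo_class = pd.get_dummies(demo)

# Correlation of every column with the label, strongest positive first.
corr = demo_class.corr()["LTVCluster"].sort_values(ascending=False)
print(corr)

# Drop the label and the revenue column it was derived from, then split.
X = demo_class.drop(["LTVCluster", "m6_Revenue"], axis=1)
y = demo_class["LTVCluster"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.05, random_state=56
)
print(X_train.shape, X_test.shape)
```

With `test_size=0.05`, only a small slice of rows ends up in the test set, so any score measured on it will be noisy. A larger holdout may be worth trying if the dataset is small.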