Skip to content

Instantly share code, notes, and snippets.

@ChristopherDaigle
Created May 31, 2020 19:58
Show Gist options
  • Save ChristopherDaigle/4185f9f4ac8b975e671c9d3950dbdbab to your computer and use it in GitHub Desktop.
Save ChristopherDaigle/4185f9f4ac8b975e671c9d3950dbdbab to your computer and use it in GitHub Desktop.
predict_revenue
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
X = df_1.drop(['revenue', 'above_ave_rev_yr', 'original_language', 'original_title', 'overview', 'release_date', 'status', 'tagline', 'title'], axis=1)
y = df_1['revenue']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
lm_model = LinearRegression(normalize=True)
r_model = Ridge(normalize=True)
lm_model.fit(X_train, y_train)
r_model.fit(X_train, y_train)
print('Linear Regression Train R2: {}'.format(lm_model.score(X_train, y_train)))
print('Ridge Train R2: {}'.format(r_model.score(X_train, y_train)))
lm_preds = lm_model.predict(X_test)
r_preds = r_model.predict(X_test)
print('Linear Regression Test R2: {}'.format(lm_model.score(X_test, y_test)))
print('Ridge Test R2: {}'.format(r_model.score(X_test, y_test)))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment