@matmoody
Created May 16, 2016 19:37
import numpy as np
import statsmodels.formula.api as smf
import pandas as pd
# Set seed for reproducible results
np.random.seed(414)
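# Generate toy data
# 1000 evenly spaced X values over the interval [0, 15]
X = np.linspace(0, 15, 1000)
# Sinusoid plus a linear trend (1 + X) with Gaussian noise (std 0.2)
y = 3 * np.sin(X) + np.random.normal(1 + X, .2, 1000)
# First 700 points for training, remaining 300 for testing
train_X, train_y = X[:700], y[:700]
test_X, test_y = X[700:], y[700:]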
train_df = pd.DataFrame({'X': train_X, 'y': train_y})
test_df = pd.DataFrame({'X': test_X, 'y': test_y})
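# linear fit
poly_1 = smf.ols(formula='y ~ 1 + X', data=train_df).fit()
# quadratic fit (stored separately so the linear fit is not overwritten)
poly_2 = smf.ols(formula='y ~ 1 + X + I(X**2)', data=train_df).fit()
# print() with parentheses works on both Python 2 and Python 3
print(poly_1.summary())
print(poly_2.summary())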
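The script above builds a train/test split but only fits on the training portion. A minimal sketch of how the two fits could be compared on both portions follows. It is an illustration, not part of the original gist: it swaps in `np.polyfit` (an ordinary least-squares polynomial fit) for `smf.ols` so that it depends on NumPy alone, and the `mse` helper and `errors` dictionary are hypothetical names.

```python
import numpy as np

# Same toy data and split as the script above
np.random.seed(414)
X = np.linspace(0, 15, 1000)
y = 3 * np.sin(X) + np.random.normal(1 + X, .2, 1000)
train_X, train_y = X[:700], y[:700]
test_X, test_y = X[700:], y[700:]


def mse(coeffs, x, target):
    """Mean squared error of a polynomial (np.polyfit coefficient order)."""
    return float(np.mean((np.polyval(coeffs, x) - target) ** 2))


# Hypothetical container: polynomial degree -> (train MSE, test MSE)
errors = {}
for degree in (1, 2):
    # Least-squares polynomial fit on the training portion only
    coeffs = np.polyfit(train_X, train_y, degree)
    errors[degree] = (mse(coeffs, train_X, train_y),
                      mse(coeffs, test_X, test_y))
    print('degree %d: train MSE = %.3f, test MSE = %.3f'
          % ((degree,) + errors[degree]))
```

Because the test points all lie to the right of the training range, the test error here measures extrapolation, so it can differ substantially from the training error.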