@watermouth
Last active February 10, 2018 06:12
Scikit Learn

Useful sources

API basics

  • Methods: fit, transform, predict
  • IO: numpy.ndarray whose first axis indexes the samples, e.g. shape (number_of_samples, feature_dim).
    Some estimators expect y as a 1D array, e.g. SGDRegressor.
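The shape conventions above can be sketched as follows. This is a minimal illustration on synthetic data; the array sizes, the true coefficients, and the SGDRegressor settings are assumptions chosen for the example, not values from the original notes.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)

# X follows the (number_of_samples, feature_dim) convention.
X = rng.normal(size=(100, 3))

# A target built as a column vector has shape (100, 1).
y_col = X @ np.array([[1.0], [2.0], [-1.0]])

# SGDRegressor expects a 1D y, so flatten the column vector with ravel().
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0)
sgd_reg.fit(X, y_col.ravel())

# predict returns one value per sample as a 1D array.
y_pred = sgd_reg.predict(X)
```

Passing `y_col` directly would typically still fit but raise a warning about a column-vector y, so flattening explicitly keeps the intent clear.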

Preprocessing

Feature engineering

  • Polynomial features
# m = 100
# X.shape == (m, 1)
from sklearn.preprocessing import PolynomialFeatures
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
all(X_poly[:,1] == X_poly[:,0]**2) # True

Scaling

  • StandardScaler
from sklearn.preprocessing import StandardScaler
std_scaler = StandardScaler()
  • MinMaxScaler
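A short sketch of how both scalers are typically applied. The toy matrix is an assumption made up for illustration; `fit_transform` learns the per-column statistics and applies them in one step.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy data (assumed for illustration): 4 samples, 2 features on different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])

# StandardScaler: each column ends up with zero mean and unit variance.
std_scaler = StandardScaler()
X_std = std_scaler.fit_transform(X)

# MinMaxScaler: each column is mapped to the default range [0, 1].
minmax_scaler = MinMaxScaler()
X_minmax = minmax_scaler.fit_transform(X)
```

In practice the scaler is usually fitted on the training set only and then applied to the test set with `transform`, so that test statistics do not leak into preprocessing.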

Models

Linear model

  • LinearRegression
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
  • Ridge
from sklearn.linear_model import Ridge
ridge_reg = Ridge(alpha=1.0)  # alpha sets the L2 regularization strength; 1.0 is the library default
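To show the two estimators side by side, here is a minimal sketch on synthetic data. The generating line y = 4 + 3x, the noise level, and alpha=1.0 are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)

# Synthetic data (assumed): y = 4 + 3x plus Gaussian noise.
X = rng.uniform(-3, 3, size=(100, 1))
y = 4.0 + 3.0 * X[:, 0] + rng.normal(scale=0.5, size=100)

# Ordinary least squares.
lin_reg = LinearRegression().fit(X, y)

# Ridge adds an L2 penalty on the coefficients, shrinking them toward zero.
ridge_reg = Ridge(alpha=1.0).fit(X, y)

print(lin_reg.intercept_, lin_reg.coef_)
print(ridge_reg.intercept_, ridge_reg.coef_)
```

With this much data and a small alpha, the two fits should be nearly identical; the penalty matters more with few samples or many correlated features.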

Model evaluation

Model selection

  • train_test_split
from sklearn.model_selection import train_test_split
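A minimal usage sketch for train_test_split. The toy arrays, the 80/20 split, and the random_state value are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (assumed): 10 samples with 2 features each, and one label per sample.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the samples; random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

X and y are shuffled together, so each row of `X_train` stays aligned with its label in `y_train`.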

Metrics

  • mean_squared_error
from sklearn.metrics import mean_squared_error
mean_squared_error(y_true, y_pred)
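A small worked example of the call above, using made-up values so the result can be checked by hand: the squared errors are 0.25, 0.25, 0 and 1, whose mean is 0.375.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up targets and predictions (assumed for illustration).
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_true, y_pred)  # mean of (y_true - y_pred) ** 2

# RMSE is in the same units as y, which is often easier to interpret.
rmse = np.sqrt(mse)
```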