@madagra
Last active April 26, 2020 10:59
Create lag features for time series analysis
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from statsmodels.tsa.stattools import pacf


def create_lag_features(y):
    """Build standardized lag features from a pandas Series y."""
    scaler = StandardScaler()

    # partial autocorrelation of y for lags 0..48
    partial = pd.Series(data=pacf(y, nlags=48))
    # keep only the lags with |PACF| >= 0.2
    lags = list(partial[np.abs(partial) >= 0.2].index)

    # lag 0 is the time series itself (PACF = 1), so drop it
    lags.remove(0)

    df = pd.DataFrame()
    for l in lags:
        # the first l rows of each lag column are NaN
        df[f"lag_{l}"] = y.shift(l)

    features = pd.DataFrame(scaler.fit_transform(df[df.columns]),
                            columns=df.columns)
    features.index = y.index

    return features
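
The lag columns built with y.shift(l) start with l missing values, because the first l observations have no history. The sketch below is not part of the gist: it uses a made-up daily series and a hypothetical helper, build_lag_frame, to show one possible way of dropping those rows and keeping the target aligned with the remaining features before fitting a model. It relies only on pandas and numpy and skips the PACF selection and scaling steps.

```python
import numpy as np
import pandas as pd


def build_lag_frame(y, lags):
    """Hypothetical helper: one shifted column per lag, indexed like y."""
    return pd.DataFrame({f"lag_{l}": y.shift(l) for l in lags}, index=y.index)


# Synthetic series 0.0, 1.0, ..., 9.0, made up for illustration only
idx = pd.date_range("2020-01-01", periods=10, freq="D")
y = pd.Series(np.arange(10, dtype=float), index=idx)

# Lags chosen by hand here; in the gist they come from the PACF threshold
lagged = build_lag_frame(y, [1, 3])

# Rows without a full history contain NaN: drop them and
# restrict the target to the same timestamps
clean = lagged.dropna()
target = y.loc[clean.index]

print(clean.head())
print(target.head())
```

With lags 1 and 3, the first three rows are dropped, so both clean and target start on the fourth timestamp.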