import pandas as pd

# Let's see how the correlation coefficient evolves as the shift increases
# and record the successive values in a DataFrame.
# `df` is assumed to be defined earlier, with a "y" column and the feature columns below.
shift_corr_results = pd.DataFrame(columns=["x1_shifted", "x2_shifted", "x3_shifted"], dtype=float)

for feature in shift_corr_results.columns:
    # We test shifts from 0 to 49 here, but the range should be adapted to each use case
    for shift_value in range(0, 50):
        # Correlation between y and the feature delayed by shift_value rows
        tmp_corr_value = df["y"].corr(df[feature].shift(shift_value))
        # Record it in the results DataFrame
        shift_corr_results.loc[shift_value, feature] = tmp_corr_value
    # After each feature analysis, we identify the shift that maximizes the correlation
    print("Best shift of", feature, "at", shift_corr_results[feature].idxmax(), "with", shift_corr_results[feature].max())
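
The loop above relies on `Series.shift` and `Series.corr`. As a rough illustration of what each iteration computes, here is a minimal standard-library sketch: shifting a feature by `k` rows pairs `y[t]` with `x[t - k]`, and the first `k` rows, which have no partner, are dropped (pandas ignores those NaN pairs in the same way). The helpers `pearson` and `best_shift` and the synthetic signal are hypothetical, introduced only for this example; they are not part of the original gist.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def best_shift(x, y, max_shift):
    """Return (shift, correlation) maximizing corr(y[t], x[t - shift])."""
    results = {}
    for k in range(max_shift):
        # Pair y[t] with x[t - k]; the first k values of y have no partner
        results[k] = pearson(x[:len(x) - k], y[k:])
    best = max(results, key=results.get)
    return best, results[best]

# Hypothetical data: y repeats x with a delay of 7 steps
true_lag = 7
x = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(200)]
y = [0.0] * true_lag + x[:-true_lag]

shift, corr = best_shift(x, y, 50)
print("Best shift at", shift, "with", corr)
```

On this synthetic signal the scan should recover the delay that was injected, since `y` is an exact delayed copy of `x`. With real, noisy data the peak is usually flatter, so it is worth looking at the whole correlation curve rather than only its maximum.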