@WillKoehrsen
Last active September 21, 2018 15:44
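import numpy as np

# `data` is assumed to be an existing pandas DataFrame of numeric features
threshold = 0.95
# Create correlation matrix
corr_matrix = data.corr()
# Select upper triangle of correlation matrix (k=1 excludes the diagonal);
# the builtin `bool` replaces `np.bool`, which was removed in NumPy 1.24
upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))
# Find feature columns whose absolute correlation with an earlier column exceeds the threshold
to_drop = [column for column in upper.columns if any(abs(upper[column]) > threshold)]
# Drop the highly correlated columns
data = data.drop(columns=to_drop)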
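
The snippet above depends on a pre-existing `data` DataFrame, so it cannot be run on its own. As a minimal sketch, the same steps can be wrapped in a function and tried on a small made-up frame. The function name `drop_collinear`, the toy column names, and the toy values are all hypothetical and are not part of the original gist.

```python
import numpy as np
import pandas as pd


def drop_collinear(data: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Return a copy of `data` without columns that are highly correlated
    with an earlier column (hypothetical helper, same logic as the gist)."""
    # Absolute pairwise correlations between columns
    corr_matrix = data.corr().abs()
    # Boolean mask that keeps only entries strictly above the diagonal
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    upper = corr_matrix.where(mask)
    # A column is dropped if it correlates above the threshold with any earlier column
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return data.drop(columns=to_drop)


# Toy data (made up for illustration): `a_scaled` is a linear function of `a`,
# while `b` is only weakly correlated with `a`.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy = pd.DataFrame({
    "a": a,
    "a_scaled": 2.0 * a + 1.0,
    "b": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
})

reduced = drop_collinear(toy)
print(list(reduced.columns))
```

Because only the upper triangle is inspected, the first column of each correlated pair is kept and the later one is dropped, so the result depends on column order.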