Skip to content

Instantly share code, notes, and snippets.

@satra
Created October 16, 2014 15:40
Show Gist options
  • Save satra/aa3d19a12b74e9ab7941 to your computer and use it in GitHub Desktop.
Save satra/aa3d19a12b74e9ab7941 to your computer and use it in GitHub Desktop.
Distance Correlation in Python
from scipy.spatial.distance import pdist, squareform
import numpy as np
from numbapro import jit, float32
def distcorr(X, Y):
""" Compute the distance correlation function
>>> a = [1,2,3,4,5]
>>> b = np.array([1,2,9,4,4])
>>> distcorr(a, b)
0.762676242417
"""
X = np.atleast_1d(X)
Y = np.atleast_1d(Y)
if np.prod(X.shape) == len(X):
X = X[:, None]
if np.prod(Y.shape) == len(Y):
Y = Y[:, None]
X = np.atleast_2d(X)
Y = np.atleast_2d(Y)
n = X.shape[0]
if Y.shape[0] != X.shape[0]:
raise ValueError('Number of samples must match')
a = squareform(pdist(X))
b = squareform(pdist(Y))
A = a - a.mean(axis=0)[None, :] - a.mean(axis=1)[:, None] + a.mean()
B = b - b.mean(axis=0)[None, :] - b.mean(axis=1)[:, None] + b.mean()
dcov2_xy = (A * B).sum()/float(n * n)
dcov2_xx = (A * A).sum()/float(n * n)
dcov2_yy = (B * B).sum()/float(n * n)
dcor = np.sqrt(dcov2_xy)/np.sqrt(np.sqrt(dcov2_xx) * np.sqrt(dcov2_yy))
return dcor
@mycarta
Copy link

mycarta commented Sep 24, 2019

@seanlaw and @tfiers :

yes, this is the same distance correlation.

Small reproducible example:

from sklearn.datasets import load_iris
import pandas as pd
import dcor
# https://dcor.readthedocs.io/en/latest/energycomparison.html

iris = load_iris()
iris_df = pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                     columns= iris['feature_names'] + ['target'])

print ("dcor distance correlation = {:.3f}".format(dcor.distance_correlation(iris_df['sepal length (cm)'], 
                                                                        iris_df['petal length (cm)'])))
print ("distcorr distance correlation = {:.3f}".format(distcorr(iris_df['sepal length (cm)'], iris_df['petal length (cm)'])))

returns:

dcor distance correlation = 0.859
distcorr distance correlation = 0.859

@JSVJ
Copy link

JSVJ commented Dec 8, 2019

Can anyone help me how to plot this like a matrix in pandas?

Is it possible to have a distance correlation matrix similar to a correlation matrix?

@JazGit1
Copy link

JazGit1 commented Nov 22, 2023

Thanks a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment