import causallearn_5a3219e as causallearn
# causallearn before Jun 16, 2022, where local_score_BIC is calculated using inner product
# download at https://github.com/py-why/causal-learn/tree/5a3219efa60f6d60387363358f6a97401d7546ce
from causallearn.utils.GraphUtils import GraphUtils
from causallearn.search.ScoreBased.GES import ges
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import io
# Simulate a v-structure X0 -> X2 <- X1 where every exogenous noise
# term is Gaussian with mean 0 (i.e. the data are centered).
n_samples = 5000
X0 = np.random.normal(0.0, 1.0, size=n_samples)
X1 = np.random.normal(0.0, 1.0, size=n_samples)
X2 = X0 + X1 + np.random.normal(0.0, 1.0, size=n_samples)
# Samples as rows, variables (X0, X1, X2) as columns.
data = np.column_stack([X0, X1, X2])
# Run GES with the (inner-product based) BIC score on the centered data.
Record = ges(data, score_func='local_score_BIC')
# Render the estimated graph: pydot graph -> PNG bytes -> matplotlib image.
dot_graph = GraphUtils.to_pydot(Record['G'])
png_bytes = dot_graph.create_png(f="png")
png_buffer = io.BytesIO(png_bytes)
graph_image = mpimg.imread(png_buffer, format='png')
plt.axis('off')
plt.imshow(graph_image)
plt.show()
The GES result is correct with centered data:
However, if the data are not zero-meaned, the result may be incorrect. For example:
# data = the above zero-meaned data
# A constant shift changes no conditional (in)dependence, so the learned
# graph should be identical — but with this score it is not.
Record = ges(data+1, score_func='local_score_BIC') # add a constant 1 shift
# Same v-structure, but every exogenous noise term now has mean 1
# instead of 0, so the observed data are no longer centered.
X0 = np.random.normal(1.0, 1.0, size=5000)
X1 = np.random.normal(1.0, 1.0, size=5000)
X2 = X0 + X1 + np.random.normal(1.0, 1.0, size=5000)
# Centered X0/X1 again, but the noise on X2 has a large mean (50).
# GES also misbehaves here — possibly an overflow / loss of precision
# in the inner-product computation (not verified).
X0 = np.random.normal(0.0, 1.0, size=5000)
X1 = np.random.normal(0.0, 1.0, size=5000)
X2 = X0 + X1 + np.random.normal(50.0, 1.0, size=5000)
import causallearn_80128fc as causallearn
# causallearn as of Feb 25 2023 (today), where np.cov is misused as np.corrcoef
# download at https://github.com/py-why/causal-learn/tree/80128fcdb089a229f210adb7f939660e2091241e
# Build a random linear-Gaussian SEM and draw samples from it.
from itertools import combinations  # was used below but never imported (NameError)

np.random.seed(0)  # fixed seed so the experiment is reproducible
nodenum = 6        # number of variables
sparsity = 0.7     # probability of DROPPING a candidate edge
# Keep each candidate edge (u, v), u < v, with probability 1 - sparsity.
edgeslist = np.array([(u, v) for u, v in combinations(range(nodenum), 2) if np.random.uniform() > sparsity])
adjmat = np.zeros((nodenum, nodenum))
if len(edgeslist):  # guard: an empty array cannot be fancy-indexed with [:, 1]
    # adjmat[v, u] = 1 encodes the edge u -> v (row = child, column = parent).
    adjmat[edgeslist[:, 1], edgeslist[:, 0]] = 1
# Random edge weights in [0.5, 2.5], each with a fair-coin sign flip.
adjmat = adjmat * np.random.uniform(0.5, 2.5, (nodenum, nodenum))
adjmat[np.random.uniform(size=(nodenum, nodenum)) > 0.5] *= -1
# X = adjmat @ X + E  =>  X = (I - adjmat)^{-1} @ E, the "mixing" matrix.
mixing = np.linalg.pinv(np.eye(nodenum) - adjmat)
variances = np.random.uniform(0.5, 2.5, nodenum)  # independent noise variances
E = np.random.multivariate_normal(np.zeros(nodenum), np.diag(variances), 5000)
data = (mixing @ E.T).T  # 5000 samples x nodenum variables
# Score every valid (target, parent-set) pair under both score variants:
# xs collects the covariance-based BIC, ys the correlation-based BIC.
xs, ys = [], []
for target in range(nodenum):
    for parent_set in powerset(range(nodenum)):
        if target in parent_set:
            continue  # a node cannot be its own parent
        xs.append(local_score_BIC(data, target, list(parent_set), usecoef=False))
        ys.append(local_score_BIC(data, target, list(parent_set), usecoef=True))
# For every pair of scored configurations, check whether the two score
# variants agree on which one is larger (i.e. the induced partial order).
still_correct_partial_order = [
    (xs[a] > xs[b]) == (ys[a] > ys[b])
    for a in range(len(xs))
    for b in range(a + 1, len(xs))
]
print("partial order still correct ratio:", np.mean(still_correct_partial_order))
# Scatter the two score variants against each other; the dashed red
# identity line marks where they would coincide exactly.
plt.figure(figsize=(7, 5))
plt.scatter(xs, ys)
lo, hi = min(xs), max(xs)
plt.plot([lo, hi], [lo, hi], color="red", linestyle="--")
plt.xlabel("BIC scores using covariance matrix")
plt.ylabel("BIC scores using correlation coef matrix")
plt.show()
> partial order still correct ratio: 0.6011125654450262
We can see that the partial ordering of the scores is also different. However, it seems that whether the BIC scores are computed from correlation coefficients or from the covariance matrix, the GES outputs are always the same. I haven't yet worked out why.