Skip to content

Instantly share code, notes, and snippets.

@numpde
Last active March 22, 2023 15:06
Show Gist options
  • Save numpde/16f3a22e352dc43dc01614b50b74645b to your computer and use it in GitHub Desktop.
Save numpde/16f3a22e352dc43dc01614b50b74645b to your computer and use it in GitHub Desktop.
Compute the Betti numbers of a graph
def betti(G, C=None, verbose=False):
# G is a networkx graph
# C is networkx.find_cliques(G)
# RA, 2017-11-03, CC-BY-4.0
# Ref:
# A. Zomorodian, Computational topology (Notes), 2009
# http://www.ams.org/meetings/short-courses/zomorodian-notes.pdf
import itertools
import numpy as np
import networkx as nx
from scipy.sparse import lil_matrix
def DIAGNOSTIC(*params):
if verbose: print(*params)
#
# 1. Prepare maximal cliques
#
# If not provided, compute maximal cliques
if (C is None): C = nx.find_cliques(G)
# Sort each clique, make sure it's a tuple
C = [tuple(sorted(c)) for c in C]
DIAGNOSTIC("Number of maximal cliques: {} ({}M)".format(len(C), round(len(C) / 1e6)))
#
# 2. Enumerate all simplices
#
# S[k] will hold all k-simplices
# S[k][s] is the ID of simplex s
S = []
for k in range(0, max(len(s) for s in C)):
# Get all (k+1)-cliques, i.e. k-simplices, from max cliques mc
Sk = set(c for mc in C for c in itertools.combinations(mc, k + 1))
# Check that each simplex is in increasing order
assert (all((list(s) == sorted(s)) for s in Sk))
# Assign an ID to each simplex, in lexicographic order
S.append(dict(zip(sorted(Sk), range(0, len(Sk)))))
for (k, Sk) in enumerate(S):
DIAGNOSTIC("Number of {}-simplices: {}".format(k, len(Sk)))
# Euler characteristic
ec = sum(((-1) ** k * len(S[k])) for k in range(0, len(S)))
DIAGNOSTIC("Euler characteristic:", ec)
#
# 3. Construct the boundary operator
#
# D[k] is the boundary operator
# from the k complex
# to the k-1 complex
D = [None for _ in S]
# D[0] is the zero matrix
D[0] = lil_matrix((1, G.number_of_nodes()))
# Construct D[1], D[2], ...
for k in range(1, len(S)):
D[k] = lil_matrix((len(S[k - 1]), len(S[k])))
SIGN = np.asmatrix([(-1) ** i for i in range(0, k + 1)]).transpose()
for (ks, j) in S[k].items():
# Indices of all (k-1)-subsimplices s of the k-simplex ks
I = [S[k - 1][s] for s in sorted(itertools.combinations(ks, k))]
D[k][I, j] = SIGN
for (k, d) in enumerate(D):
DIAGNOSTIC("D[{}] has shape {}".format(k, d.shape))
# Check that D[k-1] * D[k] is zero
assert (all((0 == np.dot(D[k - 1], D[k]).count_nonzero()) for k in range(1, len(D))))
#
# 4. Compute rank and dimker of the boundary operators
#
# Rank and dimker
rk = [np.linalg.matrix_rank(d.todense()) for d in D]
ns = [(d.shape[1] - rk[n]) for (n, d) in enumerate(D)]
DIAGNOSTIC("rk:", rk)
DIAGNOSTIC("ns:", ns)
#
# 5. Infer the Betti numbers
#
# Betti numbers
# B[0] is the number of connected components
B = [(n - r) for (n, r) in zip(ns[:-1], rk[1:])]
DIAGNOSTIC("B:", ns)
ec_alt = sum(((-1) ** k * B[k]) for k in range(0, len(B)))
DIAGNOSTIC("Euler characteristic (from Betti numbers):", ec_alt)
# Check: Euler-Poincare formula
assert (ec == ec_alt)
return B
def test1():
import networkx as nx
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (0, 2), (0, 3), (1, 3), (2, 3)])
G.add_edges_from([(4, 5), (5, 6), (6, 7), (7, 4)])
# Diagnostic
print("Number of nodes: {}, edges: {}".format(G.number_of_nodes(), G.number_of_edges()))
print("Betti numbers:", betti(G, verbose=True))
def test2(n):
import networkx as nx
G = nx.cycle_graph(n)
print(list(G.edges))
print("Betti numbers:", betti(G, verbose=True))
if __name__ == '__main__':
test1()
# this is OK
test2(3)
# this fails, probably due to wrong
# computation of the Euler characteristic
test2(4)
@numpde
Copy link
Author

numpde commented Jan 2, 2021

thx for providing this code! When running it,I get the following error in statement
' D[k][I, j] = SIGN':
"cannot reshape array of size 4 into shape (2,)"
what is the best fix?

OK: changing ' D[k][I, j] = SIGN' ---> ' D[k][I, j] = SIGN.squeeze()' seemed to do the trick

I made a clean install with

python --version
Python 3.8.5

requirements.txt
decorator==4.4.2
networkx==2.5
numpy==1.19.4
scipy==1.5.4

and the code runs as is with the following output:

Number of nodes: 8, edges: 10
Number of maximal cliques: 5 (0M)
Number of 0-simplices: 8
Number of 1-simplices: 10
Number of 2-simplices: 4
Number of 3-simplices: 1
Euler characteristic: 1
D[0] has shape (1, 8)
D[1] has shape (8, 10)
D[2] has shape (10, 4)
D[3] has shape (4, 1)
rk: [0, 6, 3, 1]
ns: [8, 4, 1, 0]
Betti numbers: [2, 1, 0]

@alexlangnau
Copy link

Hello again:
I tried to use your code on the circle:
circle
and found that the Euler-characteristic and Poincare-Euler formula don't match. Perhaps it is my versioning again.
I changed the following code:
B = [(n - r) for (n, r) in zip(ns[:-1], rk[1:])]
to
l=rk[1:]
l.append(0)
B=[(ns[k]-l[k]) for k in range(len(ns))]
and it seems to work.

Here is the code for producing the circle:

g=nx.DiGraph()
n=10
g.nodes()
nodes=list(g.nodes())
for i in range(n-1):

g.add_edges_from([(i,i+1)])
g.add_edges_from([(i+1,i)])

g.add_edges_from([(n-1,0)])
g.add_edges_from([(0,n-1)])
g=g.to_undirected()
list(g.edges)
nx.draw(g,with_labels=True)

@numpde
Copy link
Author

numpde commented Jan 3, 2021

Hello again:
I tried to use your code on the circle: [..]
and found that the Euler-characteristic and Poincare-Euler formula don't match. Perhaps it is my versioning again.
I changed the following code: [..]

Thanks for pointing this out.
I tried the following:

import networkx as nx
G = nx.cycle_graph(3)
print(list(G.edges))
print("Betti numbers:", betti(G, verbose=True))

G = nx.cycle_graph(4)
print(list(G.edges))
print("Betti numbers:", betti(G, verbose=True))

The Euler characteristic in the first case is computed as one.
In the second it is zero.
This does not make sense.
However, when computed from the Betti numbers it is one in both cases.

For now, I conclude it is the first computation (of the EC) that is incorrect. I am fairly confident that the computation of the Betti number is correct because I cross-checked them with an independent implementation.

I'll be able to look into this in a few days.

@alexlangnau
Copy link

Thx for your feedback!

I believe that both answers could be correct! It depends on how one interprets the drawing of the circle above:
If one sees it as a 1-dimensional object, i.e. a line where the ends meet, then the Euler characteristic must be zero, since the number of vertices minus the number of edges is zero in this case (note: number of faces=0 in this case). Hence

                                                Euler-characteristic = V-E+F = 11-11+0=0.

However if one interprets the pic like a membrane that is a two-dimensional surface with boundaries being the 1-dimensional circle from above, then the Euler characteristic comes out 1, i.e.

                                                Euler-characteristic = V-E+F = 11-11+1=1.

Ultimately it boils down to the question of what the caller of your function intends when invoking your code. The user can do so as
you nicely provide the possibility to specify the graph along with its cliques C as input. It is only when the Cliquets are not specified in the input, that your code is making an assumption (via the maximal cliquet) but it should do so consistently

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment