Created July 15, 2016 15:37
Load graph from pandas dataframe as adjacency matrix (from https://www.bountysource.com/issues/31619133-load-graph-from-pandas-dataframe-as-adjacency-matrix)
| import networkx as nx | |
| import numpy as np | |
| # Assume that we have a dataframe df which is a square adjacency matrix of 0's and 1's | |
| def find_edges(df): | |
|     """Finds the edges in the square adjacency matrix using | |
|     vectorized operations. Returns an iterator of | |
|     (row_name, column_name) pairs, one per edge.""" | |
|     values = df.values  # Adjacency matrix of 0's and 1's | |
|     n_rows, n_columns = values.shape | |
|     indices = np.arange(n_rows * n_columns) | |
|     values = values.flatten() | |
|     _indices = indices[values == 1]  # A value of 1 means that the edge exists | |
|     # Create two arrays `rows` and `columns` such that for an edge i, | |
|     # (rows[i], columns[i]) is its coordinate in the df | |
|     rows = _indices // n_columns  # Integer division; `/` would give floats in Python 3 | |
|     columns = _indices % n_columns | |
|     # Convert the coordinates to actual names | |
|     row_names = df.index[rows] | |
|     column_names = df.columns[columns] | |
|     return zip(row_names, column_names)  # Already lazy in Python 3; itertools.izip exists only in Python 2 | |
| G = nx.DiGraph() | |
| G.add_nodes_from(df.index.tolist()) | |
| edges = find_edges(df) | |
| G.add_edges_from(edges)  # Not vectorized: networkx adds the edges one at a time in a Python loop | |
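The index arithmetic in `find_edges` can also be written with `np.nonzero`, which returns the row and column positions of the matching cells directly. The following is a minimal sketch of that variant, not part of the original gist: the function name `find_edges_nonzero` and the three-node example matrix are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def find_edges_nonzero(df):
    """Return a list of (row_name, column_name) pairs for every cell equal to 1."""
    # np.nonzero yields one array of row positions and one of column positions,
    # in row-major order, for the cells where the condition holds.
    rows, columns = np.nonzero(df.values == 1)
    # Map the integer positions back to the dataframe's labels.
    return list(zip(df.index[rows], df.columns[columns]))


# Hypothetical 3-node adjacency matrix: a -> b, b -> c, c -> a
nodes = ["a", "b", "c"]
df = pd.DataFrame(
    [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]],
    index=nodes,
    columns=nodes,
)

edges = find_edges_nonzero(df)
print(edges)
```

The resulting list can be passed to `G.add_edges_from` in the same way as the output of `find_edges`. Returning a list rather than an iterator is a choice made here so the edges can be inspected more than once.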