Exploring Reddit Community Structure: Bridges, Gateways and Highways : NOTES

My highlighted notes for the following paper:

Sawicki, J., & Ganzha, M. (2024). Exploring Reddit Community Structure: Bridges, Gateways and Highways. Electronics, 13(10), 1935.

Sawicki and Ganzha analyze Reddit’s information structure using text embeddings derived from the DistilBERT model, applying graph and cosine similarity measures. They explore the concepts of gateways and bridges, finding significant overlap between the two, and introduce a new construct -- the highway -- defined as a path traversed by many of the shortest paths connecting communities. This addition extends prior notions by identifying a set of nodes along a path rather than a single key node.
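To make the highway idea concrete, here is a minimal sketch, not the paper's pipeline. It assumes a toy set of 2-D "embeddings" (the names `sub_a` to `sub_e`, the vectors, and the 0.9 threshold are all hypothetical), links two nodes when their cosine similarity clears the threshold, and then counts how many pairwise shortest paths cross each edge. Heavily used edges are the candidates for a highway in the sense described above.

```python
import math
from collections import deque
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_graph(vectors, threshold):
    """Undirected graph: connect two nodes if their similarity >= threshold."""
    graph = {name: set() for name in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path(graph, src, dst):
    """One shortest path from src to dst via BFS, or None if unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(graph[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return None
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def edge_traffic(graph):
    """Count how many pairwise shortest paths traverse each edge."""
    counts = {}
    for a, b in combinations(sorted(graph), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for u, v in zip(path, path[1:]):
            edge = tuple(sorted((u, v)))
            counts[edge] = counts.get(edge, 0) + 1
    return counts


# Hypothetical toy data: unit vectors spaced 20 degrees apart, so only
# neighbouring vectors are similar enough to be linked (a chain of 5 nodes).
vectors = {
    name: (math.cos(math.radians(deg)), math.sin(math.radians(deg)))
    for name, deg in [("sub_a", 0), ("sub_b", 20), ("sub_c", 40),
                      ("sub_d", 60), ("sub_e", 80)]
}
graph = similarity_graph(vectors, threshold=0.9)
traffic = edge_traffic(graph)
for edge, count in sorted(traffic.items(), key=lambda kv: -kv[1]):
    print(edge, count)
```

In this toy chain the two middle edges carry the most shortest paths, so a highway would run through the centre of the chain. Only one shortest path per pair is counted here, which is a simplification of edge betweenness.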

Their work was an interesting read. The communities, gateways, and bridges mostly make sense (except r/formuladank). In Table 1, in the first Size=15 entry, most subreddits are also SE Asia/India related. I agree that gateways and bridges seem to be mostly similar -- but I didn't understand Figures 2 and 3.

Questions and observations:

  1. In Figure 1, isn't the bridge supposed to be a node? In the figure, it appears to be an edge/highway.
  2. How was a subreddit summarized into a single embedding vector?

Also, it seems gateways and bridges are both high PageRank nodes, while highways are edges/paths with high betweenness centrality.
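As a rough illustration of what "high PageRank node" means here, the sketch below runs plain power-iteration PageRank on a small undirected graph. The graph, the node names, and the damping value of 0.85 are assumptions made for the example; nothing here is taken from the paper's data or code.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank for an undirected graph with no isolated nodes.

    graph maps each node to the set of its neighbours.
    """
    n = len(graph)
    ranks = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # Each neighbour shares its current rank equally among its edges.
            incoming = sum(ranks[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) / n + damping * incoming
        ranks = updated
    return ranks


# Hypothetical toy graph: two small groups joined only through "hub".
toy_graph = {
    "hub": {"a1", "a2", "b1", "b2"},
    "a1": {"hub", "a2"},
    "a2": {"hub", "a1"},
    "b1": {"hub", "b2"},
    "b2": {"hub", "b1"},
}
ranks = pagerank(toy_graph)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.4f}")
```

In this toy case the connecting node also has the highest PageRank, which matches the observation above, but that depends on the graph: a connector with few links can have low PageRank while still sitting on many shortest paths.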
