@aseemk
Last active March 21, 2022 13:54
Neo4j Cypher query to get a "normalized" or "weighted" follower count in a social graph.

This is a Neo4j 1.9 (pre-2.0) query:

START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / COUNT(other) AS weighted
WITH user, COUNT(follower) AS numFollowers, SUM(weighted) AS totalWeighted
RETURN user, numFollowers,
  ROUND(totalWeighted * 100) / 100.0 AS totalWeighted,
  ROUND(totalWeighted * 100 / numFollowers) / 100.0 AS avgFollowerWeight

The idea is to give more weight to followers who follow fewer other people. So for each follower, their "weighted" contribution is 1 / numFollowing. To prevent division by zero, the original user is included in the "following", so numFollowing is always at least 1. The result is that these "normalized" follower counts are never greater than users' "simple" follower counts -- but they still make sense relative to each other. Nice!
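To make the arithmetic concrete, here is a minimal Python sketch of the same weighting on a toy graph. The user names and the `follows` dictionary are made up for illustration, and this is not how the Cypher query runs internally; it only mirrors the 1 / numFollowing rule described above.

```python
from collections import defaultdict

# Hypothetical toy graph: each key follows every user in its set.
follows = {
    "alice": {"carol"},                        # follows 1 user  -> weight 1.0
    "bob": {"carol", "dave"},                  # follows 2 users -> weight 0.5
    "erin": {"carol", "dave", "alice", "bob"}, # follows 4 users -> weight 0.25
}

def follower_stats(follows):
    """Return {user: (num_followers, total_weighted)}."""
    stats = defaultdict(lambda: [0, 0.0])
    for follower, followed in follows.items():
        if not followed:
            continue  # follows no one, so contributes to no one
        # Each follower contributes 1 / (number of users they follow).
        # The set includes the user being scored, so it is never empty here.
        weight = 1.0 / len(followed)
        for user in followed:
            stats[user][0] += 1
            stats[user][1] += weight
    return {user: (n, total) for user, (n, total) in stats.items()}

stats = follower_stats(follows)
for user, (n, total) in sorted(stats.items()):
    print(user, n, round(total, 2), round(total / n, 2))
```

In this toy graph, carol has 3 followers but a weighted count of 1.0 + 0.5 + 0.25 = 1.75, since only alice follows her exclusively.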

One missing feature of the current query is that it ignores users with no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. I haven't figured out how to solve this yet.
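The "add 1" problem is easy to see on a toy graph. The following Python sketch only illustrates the issue and does not solve it; the names and the `weighted_plus_one` helper are hypothetical, and the no-follower branch is an assumption about how the optional match would behave (one row with no follower, so numFollowing is 0).

```python
# Hypothetical toy graph: each key follows every user in its set.
follows = {
    "alice": {"carol"},        # alice follows only carol
    "bob": {"carol", "dave"},  # bob follows carol and dave
}
users = ["alice", "bob", "carol", "dave"]

def weighted_plus_one(user):
    """Weighted follower count when 1 is added to every numFollowing."""
    followers = [f for f, followed in follows.items() if user in followed]
    if not followers:
        # Assumed optional-match behaviour: one row with no follower,
        # numFollowing is 0, so the weight is 1 / (0 + 1) = 1.
        return 1.0
    return sum(1.0 / (len(follows[f]) + 1) for f in followers)

scores = {u: weighted_plus_one(u) for u in users}
for user in users:
    print(user, round(scores[user], 2))
```

Here carol has two followers, yet her score of 1/2 + 1/3 comes out lower than the 1.0 given to alice, who has none.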

Results from running this against The Thingdom's database:

Top 10 users by their "simple" follower count:

==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 317      | "James"        | "Darpinian"   | 19           | 2.1           | 0.11              |
==> | 443      | "Ben"          | "Vanik"       | 19           | 2.1           | 0.11              |
==> | 125      | "Shelley"      | "Gu"          | 19           | 2.4           | 0.13              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> +----------------------------------------------------------------------------------------------+

Top 10 users by their "weighted" follower count:

==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 200      | "Frida"        | "Kumar"       | 8            | 3.23          | 0.4               |
==> | 3451     | "Anh Huy"      | "Truong"      | 7            | 2.68          | 0.38              |
==> | 11855    | "Andrew"       | "Dorne"       | 8            | 2.46          | 0.31              |
==> +----------------------------------------------------------------------------------------------+