This is a Neo4j 1.9 (pre-2.0) query:
START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / COUNT(other) AS weighted
WITH user, COUNT(follower) AS numFollowers, SUM(weighted) AS totalWeighted
RETURN user, numFollowers,
ROUND(totalWeighted * 100) / 100.0 AS totalWeighted,
ROUND(totalWeighted * 100 / numFollowers) / 100.0 AS avgFollowerWeight
The idea is to give more weight to followers who follow fewer other people. So for each follower, their "weighted" contribution is 1 / numFollowing. To prevent division by zero, the original user is included in the "following" count, so numFollowing is always at least 1. The result is that these "normalized" follower counts are never greater than users' "simple" follower counts, but they still make sense relative to each other. Nice!
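To make the arithmetic concrete, here is a minimal sketch of the same weighting outside the database. It is written in Python rather than Cypher so it can be checked by hand. The follow pairs and the names `FOLLOWS`, `STATS`, and `weighted_follower_counts` are made up for illustration and are not part of the original query or data.

```python
from collections import defaultdict

# Hypothetical (follower, followed) pairs; the names are made up for illustration.
FOLLOWS = [
    ("alice", "dave"),
    ("bob", "dave"), ("bob", "erin"),
    ("carol", "dave"), ("carol", "erin"), ("carol", "frank"), ("carol", "alice"),
]


def weighted_follower_counts(follows):
    """Return {user: (numFollowers, totalWeighted, avgFollowerWeight)}, unrounded."""
    following = defaultdict(set)  # follower -> everyone they follow
    followers = defaultdict(set)  # user -> everyone following them
    for follower, followed in follows:
        following[follower].add(followed)
        followers[followed].add(follower)

    stats = {}
    for user, fans in followers.items():
        # following[f] always contains `user` itself, so the divisor is >= 1.
        total = sum(1.0 / len(following[f]) for f in fans)
        stats[user] = (len(fans), total, total / len(fans))
    return stats


STATS = weighted_follower_counts(FOLLOWS)
# Print users from highest to lowest weighted follower count.
for user, (num, total, avg) in sorted(STATS.items(), key=lambda kv: -kv[1][1]):
    print(f"{user}: numFollowers={num} totalWeighted={total:.2f} avgFollowerWeight={avg:.2f}")
```

In this toy data, dave has three followers who follow 1, 2, and 4 people respectively, so his weighted count is 1 + 1/2 + 1/4 = 1.75 against a simple count of 3.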
One missing feature of the current query is that it ignores users with no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but that gives users with no followers a weighted follower count of 1, while users with actual followers can end up with a lower weighted follower count. I haven't figured out how to solve this yet.
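The "+1" problem is easy to reproduce with plain numbers. The Python sketch below models it under one assumption: that an optional match yields a single row with a count of 0 for a user with no followers. It also shows one possible direction, letting a missing follower contribute 0 instead of 1. The function names are hypothetical, and this is not a fix the original query settles on; how best to express it in Cypher 1.9 is left open.

```python
def plus_one_total(following_counts):
    """'+1' variant: every row contributes 1 / (numFollowing + 1).

    following_counts holds numFollowing for each follower. An empty list
    models a user with no followers, for which an optional match would
    still produce one row with a count of 0 (assumed behavior).
    """
    rows = following_counts if following_counts else [0]
    return sum(1.0 / (n + 1) for n in rows)


def guarded_total(following_counts):
    """Hypothetical alternative: rows without a real follower contribute 0."""
    return sum(1.0 / n for n in following_counts if n > 0)


lonely = []        # user with no followers
popular = [3, 3]   # two followers, each following 3 people

print(plus_one_total(lonely), plus_one_total(popular))  # the lonely user ranks higher
print(guarded_total(lonely), guarded_total(popular))    # the lonely user gets 0
```

With the guard, a user with no followers scores 0 and every real follower still contributes 1 / numFollowing, so the ordering among users with followers is unchanged.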
Run against The Thingdom's database...
Top 10 users by their "simple" follower count:
==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 317      | "James"        | "Darpinian"   | 19           | 2.1           | 0.11              |
==> | 443      | "Ben"          | "Vanik"       | 19           | 2.1           | 0.11              |
==> | 125      | "Shelley"      | "Gu"          | 19           | 2.4           | 0.13              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> +----------------------------------------------------------------------------------------------+
Top 10 users by their "weighted" follower count:
==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 200      | "Frida"        | "Kumar"       | 8            | 3.23          | 0.4               |
==> | 3451     | "Anh Huy"      | "Truong"      | 7            | 2.68          | 0.38              |
==> | 11855    | "Andrew"       | "Dorne"       | 8            | 2.46          | 0.31              |
==> +----------------------------------------------------------------------------------------------+