@rvanbruggen
Last active September 3, 2017 23:36
Dolphin Social Network

Everyone loves Dolphins

Don’t tell me that you don’t like dolphins. Everyone likes them. And if you don’t, you should.

So if I told you that there was something interesting about the Bottlenose Dolphins that live in a place like this:

[Image: Doubtful Sound]

then what would you say? Interesting? Yes, indeed.

This gist is a little experiment based on this academic paper:

Many complex networks, including human societies, the Internet, the World Wide Web and power grids, have surprising properties that allow vertices (individuals, nodes, Web pages, etc.) to be in close contact and information to be transferred quickly between them. Nothing is known of the emerging properties of animal societies, but it would be expected that similar trends would emerge from the topology of animal social networks. Despite its small size (64 individuals), the `Doubtful Sound community of bottlenose dolphins has the same characteristics`. The connectivity of individuals follows a complex distribution that has a `scale-free power-law distribution` for large k. In addition, the ability for two individuals to be in contact is unaffected by the random removal of individuals. The removal of individuals with many links to others does affect the length of the ‘information’ path between two individuals, but, unlike other scale-free networks, it does not fragment the cohesion of the social network. These self-organizing phenomena allow the network to remain united, even in the case of catastrophic death events.

It’s a really interesting, and very small, dataset. So let’s load it in. The original file was found over here. I manipulated it a bit, and loaded it into Neo4j using this Google Spreadsheet. That’s all I needed to generate the dataset. As it is quite small, we can easily load it into this gist.

The Doubtful Sound Dolphins

First we have to load the data into the graph. Using the spreadsheet above I created the Cypher statements, and there we have it:

//create the dolphins
create (_0:`DOLPHIN` {`betscore`:0, `id`:0, `name`:"Beak"})
create (_1:`DOLPHIN` {`betscore`:0, `id`:1, `name`:"Beescratch"})
create (_2:`DOLPHIN` {`betscore`:0, `id`:2, `name`:"Bumper"})
create (_3:`DOLPHIN` {`betscore`:0, `id`:3, `name`:"CCL"})
create (_4:`DOLPHIN` {`betscore`:0, `id`:4, `name`:"Cross"})
create (_5:`DOLPHIN` {`betscore`:0, `id`:5, `name`:"DN16"})
create (_6:`DOLPHIN` {`betscore`:0, `id`:6, `name`:"DN21"})
create (_7:`DOLPHIN` {`betscore`:0, `id`:7, `name`:"DN63"})
create (_8:`DOLPHIN` {`betscore`:0, `id`:8, `name`:"Double"})
create (_9:`DOLPHIN` {`betscore`:0, `id`:9, `name`:"Feather"})
create (_10:`DOLPHIN` {`betscore`:0, `id`:10, `name`:"Fish"})
create (_11:`DOLPHIN` {`betscore`:0, `id`:11, `name`:"Five"})
create (_12:`DOLPHIN` {`betscore`:0, `id`:12, `name`:"Fork"})
create (_13:`DOLPHIN` {`betscore`:0, `id`:13, `name`:"Gallatin"})
create (_14:`DOLPHIN` {`betscore`:0, `id`:14, `name`:"Grin"})
create (_15:`DOLPHIN` {`betscore`:0, `id`:15, `name`:"Haecksel"})
create (_16:`DOLPHIN` {`betscore`:0, `id`:16, `name`:"Hook"})
create (_17:`DOLPHIN` {`betscore`:0, `id`:17, `name`:"Jet"})
create (_18:`DOLPHIN` {`betscore`:0, `id`:18, `name`:"Jonah"})
create (_19:`DOLPHIN` {`betscore`:0, `id`:19, `name`:"Knit"})
create (_20:`DOLPHIN` {`betscore`:0, `id`:20, `name`:"Kringel"})
create (_21:`DOLPHIN` {`betscore`:0, `id`:21, `name`:"MN105"})
create (_22:`DOLPHIN` {`betscore`:0, `id`:22, `name`:"MN23"})
create (_23:`DOLPHIN` {`betscore`:0, `id`:23, `name`:"MN60"})
create (_24:`DOLPHIN` {`betscore`:0, `id`:24, `name`:"MN83"})
create (_25:`DOLPHIN` {`betscore`:0, `id`:25, `name`:"Mus"})
create (_26:`DOLPHIN` {`betscore`:0, `id`:26, `name`:"Notch"})
create (_27:`DOLPHIN` {`betscore`:0, `id`:27, `name`:"Number1"})
create (_28:`DOLPHIN` {`betscore`:0, `id`:28, `name`:"Oscar"})
create (_29:`DOLPHIN` {`betscore`:0, `id`:29, `name`:"Patchback"})
create (_30:`DOLPHIN` {`betscore`:0, `id`:30, `name`:"PL"})
create (_31:`DOLPHIN` {`betscore`:0, `id`:31, `name`:"Quasi"})
create (_32:`DOLPHIN` {`betscore`:0, `id`:32, `name`:"Ripplefluke"})
create (_33:`DOLPHIN` {`betscore`:0, `id`:33, `name`:"Scabs"})
create (_34:`DOLPHIN` {`betscore`:0, `id`:34, `name`:"Shmuddel"})
create (_35:`DOLPHIN` {`betscore`:0, `id`:35, `name`:"SMN5"})
create (_36:`DOLPHIN` {`betscore`:0, `id`:36, `name`:"SN100"})
create (_37:`DOLPHIN` {`betscore`:0, `id`:37, `name`:"SN4"})
create (_38:`DOLPHIN` {`betscore`:0, `id`:38, `name`:"SN63"})
create (_39:`DOLPHIN` {`betscore`:0, `id`:39, `name`:"SN89"})
create (_40:`DOLPHIN` {`betscore`:0, `id`:40, `name`:"SN9"})
create (_41:`DOLPHIN` {`betscore`:0, `id`:41, `name`:"SN90"})
create (_42:`DOLPHIN` {`betscore`:0, `id`:42, `name`:"SN96"})
create (_43:`DOLPHIN` {`betscore`:0, `id`:43, `name`:"Stripes"})
create (_44:`DOLPHIN` {`betscore`:0, `id`:44, `name`:"Thumper"})
create (_45:`DOLPHIN` {`betscore`:0, `id`:45, `name`:"Topless"})
create (_46:`DOLPHIN` {`betscore`:0, `id`:46, `name`:"TR120"})
create (_47:`DOLPHIN` {`betscore`:0, `id`:47, `name`:"TR77"})
create (_48:`DOLPHIN` {`betscore`:0, `id`:48, `name`:"TR82"})
create (_49:`DOLPHIN` {`betscore`:0, `id`:49, `name`:"TR88"})
create (_50:`DOLPHIN` {`betscore`:0, `id`:50, `name`:"TR99"})
create (_51:`DOLPHIN` {`betscore`:0, `id`:51, `name`:"Trigger"})
create (_52:`DOLPHIN` {`betscore`:0, `id`:52, `name`:"TSN103"})
create (_53:`DOLPHIN` {`betscore`:0, `id`:53, `name`:"TSN83"})
create (_54:`DOLPHIN` {`betscore`:0, `id`:54, `name`:"Upbang"})
create (_55:`DOLPHIN` {`betscore`:0, `id`:55, `name`:"Vau"})
create (_56:`DOLPHIN` {`betscore`:0, `id`:56, `name`:"Wave"})
create (_57:`DOLPHIN` {`betscore`:0, `id`:57, `name`:"Web"})
create (_58:`DOLPHIN` {`betscore`:0, `id`:58, `name`:"Whitetip"})
create (_59:`DOLPHIN` {`betscore`:0, `id`:59, `name`:"Zap"})
create (_60:`DOLPHIN` {`betscore`:0, `id`:60, `name`:"Zig"})
create (_61:`DOLPHIN` {`betscore`:0, `id`:61, `name`:"Zipfel"})
//create the interactions between the dolphins
create (_8)-[:`INTERACTS_WITH`]->(_3)
create (_9)-[:`INTERACTS_WITH`]->(_6)
create (_9)-[:`INTERACTS_WITH`]->(_5)
create (_10)-[:`INTERACTS_WITH`]->(_2)
create (_10)-[:`INTERACTS_WITH`]->(_0)
create (_13)-[:`INTERACTS_WITH`]->(_9)
create (_13)-[:`INTERACTS_WITH`]->(_6)
create (_13)-[:`INTERACTS_WITH`]->(_5)
create (_14)-[:`INTERACTS_WITH`]->(_3)
create (_14)-[:`INTERACTS_WITH`]->(_0)
create (_15)-[:`INTERACTS_WITH`]->(_0)
create (_16)-[:`INTERACTS_WITH`]->(_14)
create (_17)-[:`INTERACTS_WITH`]->(_13)
create (_17)-[:`INTERACTS_WITH`]->(_9)
create (_17)-[:`INTERACTS_WITH`]->(_6)
create (_17)-[:`INTERACTS_WITH`]->(_1)
create (_18)-[:`INTERACTS_WITH`]->(_15)
create (_19)-[:`INTERACTS_WITH`]->(_7)
create (_19)-[:`INTERACTS_WITH`]->(_1)
create (_20)-[:`INTERACTS_WITH`]->(_18)
create (_20)-[:`INTERACTS_WITH`]->(_16)
create (_20)-[:`INTERACTS_WITH`]->(_8)
create (_21)-[:`INTERACTS_WITH`]->(_18)
create (_22)-[:`INTERACTS_WITH`]->(_17)
create (_24)-[:`INTERACTS_WITH`]->(_18)
create (_24)-[:`INTERACTS_WITH`]->(_15)
create (_24)-[:`INTERACTS_WITH`]->(_14)
create (_25)-[:`INTERACTS_WITH`]->(_17)
create (_26)-[:`INTERACTS_WITH`]->(_25)
create (_26)-[:`INTERACTS_WITH`]->(_1)
create (_27)-[:`INTERACTS_WITH`]->(_26)
create (_27)-[:`INTERACTS_WITH`]->(_25)
create (_27)-[:`INTERACTS_WITH`]->(_17)
create (_27)-[:`INTERACTS_WITH`]->(_7)
create (_27)-[:`INTERACTS_WITH`]->(_1)
create (_28)-[:`INTERACTS_WITH`]->(_20)
create (_28)-[:`INTERACTS_WITH`]->(_8)
create (_28)-[:`INTERACTS_WITH`]->(_1)
create (_29)-[:`INTERACTS_WITH`]->(_24)
create (_29)-[:`INTERACTS_WITH`]->(_21)
create (_29)-[:`INTERACTS_WITH`]->(_18)
create (_29)-[:`INTERACTS_WITH`]->(_10)
create (_30)-[:`INTERACTS_WITH`]->(_28)
create (_30)-[:`INTERACTS_WITH`]->(_19)
create (_30)-[:`INTERACTS_WITH`]->(_7)
create (_31)-[:`INTERACTS_WITH`]->(_17)
create (_32)-[:`INTERACTS_WITH`]->(_13)
create (_32)-[:`INTERACTS_WITH`]->(_9)
create (_33)-[:`INTERACTS_WITH`]->(_21)
create (_33)-[:`INTERACTS_WITH`]->(_16)
create (_33)-[:`INTERACTS_WITH`]->(_14)
create (_33)-[:`INTERACTS_WITH`]->(_12)
create (_34)-[:`INTERACTS_WITH`]->(_33)
create (_34)-[:`INTERACTS_WITH`]->(_14)
create (_35)-[:`INTERACTS_WITH`]->(_29)
create (_36)-[:`INTERACTS_WITH`]->(_23)
create (_36)-[:`INTERACTS_WITH`]->(_20)
create (_36)-[:`INTERACTS_WITH`]->(_1)
create (_37)-[:`INTERACTS_WITH`]->(_36)
create (_37)-[:`INTERACTS_WITH`]->(_34)
create (_37)-[:`INTERACTS_WITH`]->(_33)
create (_37)-[:`INTERACTS_WITH`]->(_21)
create (_37)-[:`INTERACTS_WITH`]->(_16)
create (_37)-[:`INTERACTS_WITH`]->(_14)
create (_37)-[:`INTERACTS_WITH`]->(_8)
create (_38)-[:`INTERACTS_WITH`]->(_33)
create (_38)-[:`INTERACTS_WITH`]->(_20)
create (_38)-[:`INTERACTS_WITH`]->(_16)
create (_38)-[:`INTERACTS_WITH`]->(_14)
create (_39)-[:`INTERACTS_WITH`]->(_36)
create (_40)-[:`INTERACTS_WITH`]->(_37)
create (_40)-[:`INTERACTS_WITH`]->(_36)
create (_40)-[:`INTERACTS_WITH`]->(_33)
create (_40)-[:`INTERACTS_WITH`]->(_15)
create (_40)-[:`INTERACTS_WITH`]->(_14)
create (_40)-[:`INTERACTS_WITH`]->(_7)
create (_40)-[:`INTERACTS_WITH`]->(_0)
create (_41)-[:`INTERACTS_WITH`]->(_13)
create (_41)-[:`INTERACTS_WITH`]->(_9)
create (_41)-[:`INTERACTS_WITH`]->(_1)
create (_42)-[:`INTERACTS_WITH`]->(_30)
create (_42)-[:`INTERACTS_WITH`]->(_10)
create (_42)-[:`INTERACTS_WITH`]->(_2)
create (_42)-[:`INTERACTS_WITH`]->(_0)
create (_43)-[:`INTERACTS_WITH`]->(_38)
create (_43)-[:`INTERACTS_WITH`]->(_37)
create (_43)-[:`INTERACTS_WITH`]->(_33)
create (_43)-[:`INTERACTS_WITH`]->(_29)
create (_43)-[:`INTERACTS_WITH`]->(_14)
create (_44)-[:`INTERACTS_WITH`]->(_38)
create (_44)-[:`INTERACTS_WITH`]->(_34)
create (_44)-[:`INTERACTS_WITH`]->(_20)
create (_44)-[:`INTERACTS_WITH`]->(_2)
create (_45)-[:`INTERACTS_WITH`]->(_37)
create (_45)-[:`INTERACTS_WITH`]->(_29)
create (_45)-[:`INTERACTS_WITH`]->(_24)
create (_45)-[:`INTERACTS_WITH`]->(_23)
create (_45)-[:`INTERACTS_WITH`]->(_21)
create (_45)-[:`INTERACTS_WITH`]->(_18)
create (_45)-[:`INTERACTS_WITH`]->(_15)
create (_45)-[:`INTERACTS_WITH`]->(_8)
create (_46)-[:`INTERACTS_WITH`]->(_43)
create (_47)-[:`INTERACTS_WITH`]->(_42)
create (_47)-[:`INTERACTS_WITH`]->(_30)
create (_47)-[:`INTERACTS_WITH`]->(_28)
create (_47)-[:`INTERACTS_WITH`]->(_20)
create (_47)-[:`INTERACTS_WITH`]->(_10)
create (_47)-[:`INTERACTS_WITH`]->(_0)
create (_49)-[:`INTERACTS_WITH`]->(_46)
create (_49)-[:`INTERACTS_WITH`]->(_34)
create (_50)-[:`INTERACTS_WITH`]->(_45)
create (_50)-[:`INTERACTS_WITH`]->(_42)
create (_50)-[:`INTERACTS_WITH`]->(_33)
create (_50)-[:`INTERACTS_WITH`]->(_20)
create (_50)-[:`INTERACTS_WITH`]->(_16)
create (_50)-[:`INTERACTS_WITH`]->(_14)
create (_51)-[:`INTERACTS_WITH`]->(_50)
create (_51)-[:`INTERACTS_WITH`]->(_45)
create (_51)-[:`INTERACTS_WITH`]->(_29)
create (_51)-[:`INTERACTS_WITH`]->(_24)
create (_51)-[:`INTERACTS_WITH`]->(_23)
create (_51)-[:`INTERACTS_WITH`]->(_21)
create (_51)-[:`INTERACTS_WITH`]->(_18)
create (_51)-[:`INTERACTS_WITH`]->(_11)
create (_51)-[:`INTERACTS_WITH`]->(_4)
create (_52)-[:`INTERACTS_WITH`]->(_40)
create (_52)-[:`INTERACTS_WITH`]->(_38)
create (_52)-[:`INTERACTS_WITH`]->(_29)
create (_52)-[:`INTERACTS_WITH`]->(_14)
create (_53)-[:`INTERACTS_WITH`]->(_43)
create (_54)-[:`INTERACTS_WITH`]->(_41)
create (_54)-[:`INTERACTS_WITH`]->(_19)
create (_54)-[:`INTERACTS_WITH`]->(_13)
create (_54)-[:`INTERACTS_WITH`]->(_7)
create (_54)-[:`INTERACTS_WITH`]->(_6)
create (_54)-[:`INTERACTS_WITH`]->(_1)
create (_55)-[:`INTERACTS_WITH`]->(_51)
create (_55)-[:`INTERACTS_WITH`]->(_15)
create (_56)-[:`INTERACTS_WITH`]->(_6)
create (_56)-[:`INTERACTS_WITH`]->(_5)
create (_57)-[:`INTERACTS_WITH`]->(_54)
create (_57)-[:`INTERACTS_WITH`]->(_48)
create (_57)-[:`INTERACTS_WITH`]->(_41)
create (_57)-[:`INTERACTS_WITH`]->(_39)
create (_57)-[:`INTERACTS_WITH`]->(_17)
create (_57)-[:`INTERACTS_WITH`]->(_13)
create (_57)-[:`INTERACTS_WITH`]->(_9)
create (_57)-[:`INTERACTS_WITH`]->(_6)
create (_57)-[:`INTERACTS_WITH`]->(_5)
create (_58)-[:`INTERACTS_WITH`]->(_38)
create (_59)-[:`INTERACTS_WITH`]->(_45)
create (_59)-[:`INTERACTS_WITH`]->(_36)
create (_59)-[:`INTERACTS_WITH`]->(_15)
create (_59)-[:`INTERACTS_WITH`]->(_8)
create (_59)-[:`INTERACTS_WITH`]->(_3)
create (_60)-[:`INTERACTS_WITH`]->(_32)
create (_61)-[:`INTERACTS_WITH`]->(_53)
create (_61)-[:`INTERACTS_WITH`]->(_37)
create (_61)-[:`INTERACTS_WITH`]->(_2)
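
The statements above were generated with a spreadsheet. As a rough alternative, the same kind of script could be produced with a few lines of Python. This is only a sketch: the function names and the three-dolphin sample are made up for illustration, names are assumed to contain no double quotes, and the relationship clauses wrap node variables in parentheses, which current Neo4j versions require.

```python
# Hypothetical helpers for generating the CREATE clauses; not the author's spreadsheet.

def node_statement(node_id, name):
    """Build one CREATE clause for a dolphin node (name must not contain quotes)."""
    return f'create (_{node_id}:`DOLPHIN` {{`betscore`:0, `id`:{node_id}, `name`:"{name}"}})'


def edge_statement(source_id, target_id):
    """Build one CREATE clause for an INTERACTS_WITH relationship."""
    return f"create (_{source_id})-[:`INTERACTS_WITH`]->(_{target_id})"


def build_script(dolphins, interactions):
    """Join node clauses (sorted by id) and relationship clauses into one statement."""
    lines = [node_statement(i, name) for i, name in sorted(dolphins.items())]
    lines += [edge_statement(s, t) for s, t in interactions]
    return "\n".join(lines)


if __name__ == "__main__":
    # Small sample taken from the data above: Fish interacts with Bumper and Beak.
    sample_dolphins = {0: "Beak", 2: "Bumper", 10: "Fish"}
    sample_interactions = [(10, 2), (10, 0)]
    print(build_script(sample_dolphins, sample_interactions))
```

All clauses are joined into a single statement on purpose: the `_0`, `_1`, ... variables only exist within one Cypher statement, so the relationship clauses must run together with the node clauses.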

Let’s see what we have added. Here are the Doubtful Sound Dolphins:

It’s not too complicated as a network. Now just from looking at the visualisation we can immediately see some of the better connected Dolphins. But let’s try to use some graph concepts to find out which Dolphins really matter.

Who’s the Dolphin-daddy?

Degree Centrality as a basic measure…

A basic measure of centrality asks how "central" a dolphin is in this network. The simplest approach is to count the number of connections of each dolphin, known as the "degree" of a node in a graph. So let’s calculate that with a simple Cypher query, and order by the resulting measure:

//degree centrality
//match in both directions: an interaction counts for both dolphins
match (n:DOLPHIN)-[r]-(m)
return n.name as Dolphin, count(r) as DegreeScore
order by DegreeScore desc
limit 10;

Or pictured in a visualisation:

//degree centrality
match (n:DOLPHIN)-[r]-(m)
with n as Dolphin, count(r) as DegreeScore
order by DegreeScore desc
limit 10
return Dolphin
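
To sanity-check a degree count outside the database, here is a minimal Python sketch. It assumes an interaction has no inherent direction, so every pair adds one to the score of both dolphins. The six pairs are copied from the data above and are only a sample, so the scores are not the real ones for the full network; the function name is made up for illustration.

```python
from collections import Counter

# A few INTERACTS_WITH pairs from the data above (a sample, not the full dataset).
SAMPLE_EDGES = [
    ("Fish", "Bumper"),
    ("Fish", "Beak"),
    ("Grin", "Beak"),
    ("Haecksel", "Beak"),
    ("Grin", "CCL"),
    ("Double", "CCL"),
]


def degree_scores(edges):
    """Count connections per dolphin, treating every interaction as undirected."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Highest degree first, like "order by DegreeScore desc".
    return degree.most_common()


if __name__ == "__main__":
    for name, score in degree_scores(SAMPLE_EDGES):
        print(name, score)
```

Note that Beak only ever appears on the receiving end of a relationship in this sample, so a count of outgoing relationships alone would give Beak a score of zero, even though Beak is the best connected dolphin here.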

Who’s the REAL dolphin-daddy?

Betweenness Centrality as a measure…

Let’s now look at another measure of centrality, based on the concept of "Betweenness". A Betweenness Score represents how "between" a particular node is relative to all of the other nodes. One calculates that score by counting the number of shortest paths between all pairs of nodes that pass through that node. We can calculate it with this query, which uses the new UNWIND keyword in Cypher. UNWIND allows you to split a collection back into individual rows, and so this is what will allow us to

  • first find all the shortest paths from every source dolphin to every other target dolphin

  • then make sure that the source and target dolphins are different, and that the length of each path is 2 hops or more (excluding directly connected paths - there’s no "betweenness" in direct connections)

  • then "unwind" all of these paths into the individual composing dolphins - excluding the start and end dolphins of the shortest paths

  • then count how many times each dolphin appears on these paths - this is our Betweenness Centrality score for every dolphin

That’s it. Before we look at the query, I really do have to thank Michael again for his awesome help. My early Betweenness Centrality queries were much uglier, and probably wrong :) …​ So here we go:

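//betweenness centrality
//all shortest paths between every pair of dolphins, each pair considered once
MATCH p=allShortestPaths((source:DOLPHIN)-[*]-(target:DOLPHIN))
//skip direct connections: they have no dolphin "in between"
WHERE id(source) < id(target) and length(p) > 1
//keep only the intermediate dolphins on each path
UNWIND nodes(p)[1..-1] as n
RETURN n.name, count(*) as betweenness
ORDER BY betweenness DESC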

And the result:

So we are really getting some very different results there. Betweenness Centrality is a very interesting characteristic to measure, originally described by Linton Freeman in this paper.
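
The path-counting idea can also be sketched in plain Python as a cross-check. The sketch below mirrors the steps described above: all shortest paths per pair, each pair once, interior dolphins only. Like the query, it adds one for every shortest path a dolphin sits on; Freeman's definition instead weights each pair by the fraction of its shortest paths that pass through the node, so the two agree only when shortest paths are unique. The function names and the six sample pairs (copied from the data above) are for illustration only.

```python
from collections import Counter, deque
from itertools import combinations

# A few INTERACTS_WITH pairs from the data above (a sample, not the full dataset).
SAMPLE_EDGES = [
    ("Fish", "Bumper"), ("Fish", "Beak"), ("Grin", "Beak"),
    ("Haecksel", "Beak"), ("Grin", "CCL"), ("Double", "CCL"),
]


def all_shortest_paths(adj, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:  # breadth-first search, remembering every shortest-path parent
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)
    if target not in dist:
        return []  # not connected
    paths = [[target]]
    while paths[0][0] != source:  # walk back one hop at a time
        paths = [[p] + path for path in paths for p in parents[path[0]]]
    return paths


def path_count_betweenness(edges):
    """Count how often each node is an interior node of a shortest path."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    score = Counter()
    for s, t in combinations(sorted(adj), 2):  # each unordered pair once
        for path in all_shortest_paths(adj, s, t):
            score.update(path[1:-1])  # empty for direct connections
    return score.most_common()


if __name__ == "__main__":
    for name, score in path_count_betweenness(SAMPLE_EDGES):
        print(name, score)
```

Dolphins that never sit between two others simply do not appear in the output, just as they are absent from the rows the query returns.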

THE END

I hope this gist was interesting for you. I found it quite interesting that - even though Neo4j is primarily targeted at graph local operations - it can still really help with graph global operations. The Neo4j graph algorithms may not all be exposed in Cypher, but you can still do some amazing things - easily.

This gist was created by Rik Van Bruggen
