Neo4j Knowledge Graph

The Neo4j Knowledge Graph

Our friends at Neueda have been doing more and more work with Neo4j. One of the artefacts of that work (see their GitHub repo for more info) is a wonderful page called Awesome Neo4j. It is a webpage with links and other resources that can be useful for people doing Neo4j projects, whether you are looking for tips and tricks, developer resources, language bindings, frameworks, visualization solutions, or graph algorithm components. Truly great work by all the contributors, and inspiring at that.

A Google Sheet as the main repository

Based on that work, I have tried to add some additional links that I know about or use on a regular basis. Since I am not a developer, my go-to place for documenting and sharing information is not a GitHub page but a Google Sheet with all the data.

So what I wanted to do in this Gist is to provide a "graph" version of that page. Essentially, there are three things in it:

* Resources: all kinds of interesting artefacts, produced by Neo4j or by the community
* Authors: individuals or organisations that have been contributing items
* Tags: I have tried to categorise and tag the different resources with a number of different tags

I can very easily download that sheet as a CSV file, so this GraphGist pulls the data from the Google Sheet in real time and creates a graph(gist) out of it. If and when I update the sheet, this GraphGist will automagically update itself too. Nice!
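To make the shape of that CSV export concrete, here is a minimal Python sketch that parses a stand-in for it. The header names (What, Where, Who, Comments, Tags) are the ones the load statement in this gist reads; the sample row and the `read_rows` helper are made up for illustration, and nothing is fetched from the real sheet.

```python
import csv
import io

# Hypothetical stand-in for the CSV text that the sheet export returns.
# The header names match the columns the load statement reads; the row is invented.
SAMPLE_EXPORT = (
    "What,Where,Who,Comments,Tags\n"
    'Example Book,https://example.com/book,Example Author,A made-up entry,"book, example"\n'
)


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = read_rows(SAMPLE_EXPORT)
for row in rows:
    # Each row describes one resource, its author, and a comma-separated tag list.
    print(row["What"], "|", row["Who"], "|", row["Tags"])
```

Note that the Tags cell holds several values in one quoted field, which is why the load statement has to split it before creating Tag nodes.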

The Neo4j Knowledge Graph’s Current Graph Model

Here’s what the model looks like (generated using a meta-graph of course):

[Image: screenshot of the graph model, taken 2016-02-27]

Very simple - but it’s so much nicer when you can make it interactive and load it into Neo4j. Let’s do that. Let’s load that data into this graphgist.

load csv with headers from "https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
merge (r:Resource {name: csv.What, url: csv.Where, comments: csv.Comments})
with csv
merge (a:Author {name: csv.Who})
with csv
match (r:Resource {name: csv.What}), (a:Author {name: csv.Who})
merge (a)-[:CREATED]->(r)
with r, csv.Tags as tags
// split the comma-separated Tags cell and trim each entry;
// unwinding the entries one by one also tags resources that carry a single tag
unwind [w in split(tags, ",") | trim(w)] as tag
merge (t:Tag {name: tag})
merge (r)-[:TAGGED_AS]->(t);
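The load statement above is meant to turn each row of the sheet into one Resource, one Author, a CREATED relationship, and one TAGGED_AS relationship per tag. The following Python sketch shows that intended mapping outside of Neo4j. The helper names `split_tags` and `row_to_graph` and the example row are hypothetical; only the column names, labels, and relationship types come from the statement.

```python
def split_tags(tags_field):
    """Split a comma-separated Tags cell into trimmed, non-empty tag names."""
    if not tags_field:
        return []
    return [t.strip() for t in tags_field.split(",") if t.strip()]


def row_to_graph(row):
    """Return (nodes, relationships) for one CSV row, as plain tuples."""
    resource = ("Resource", row["What"])
    author = ("Author", row["Who"])
    tags = [("Tag", name) for name in split_tags(row.get("Tags"))]
    nodes = [resource, author] + tags
    # One CREATED edge from the author, then one TAGGED_AS edge per tag.
    rels = [(author, "CREATED", resource)]
    rels += [(resource, "TAGGED_AS", tag) for tag in tags]
    return nodes, rels


# Hypothetical row, shaped like one line of the sheet.
example_row = {"What": "Example Book", "Who": "Example Author", "Tags": "book, example"}
nodes, rels = row_to_graph(example_row)
for rel in rels:
    print(rel)
```

A row with a single tag, or with no tags at all, is worth checking when adapting the load statement: the sketch yields one or zero TAGGED_AS edges for those cases.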

Let’s take a look at what we have now:

Ok - so that looks like a big fat hairball. Not very useful. So let's try to zoom in a bit, and run a simple query over our graph: let's find a couple of authors and resources and see which tags they are connected to.

MATCH p = ((a:Author)--(r:Resource)--(t:Tag))
return p
limit 25

and here’s a sample of the graph:

Let’s do another query:

MATCH (t:Tag)--(r:Resource)--(a:Author)
where a.name contains "Rik" or a.name contains "Max"
return t,r,a

and display the result

Now it gets even nicer when we take a look at some of the resource specifics, and immediately render the URLs:

MATCH (r:Resource)--(a:Author)
where a.name contains "Rik" or a.name contains "Max"
return distinct a.name as Author, r.name as Resource, r.url as URL, r.comments as Description
order by Author;

And finally, we can of course also find some paths between tags:

match (t1:Tag {name:"book"}), (t2:Tag {name:"blog"}),
p = allshortestpaths ( (t1)-[*]-(t2))
return p
limit 10

and then we get this lovely ring:
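A ring shows up because there are usually several equally short paths between the same two tags, and drawn together they share both endpoints. The Python sketch below illustrates the idea with a breadth-first search that collects every shortest path. It is not how Neo4j implements allshortestpaths, and the tiny graph with resources "R1" and "R2" is made up.

```python
from collections import deque


def all_shortest_paths(adjacency, start, goal):
    """Return every shortest path from start to goal in an undirected graph."""
    dist = {start: 0}
    parents = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break  # every node closer than the goal has already been expanded
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)  # another equally short way in
    if goal not in dist:
        return []

    paths = []

    def walk(node, suffix):
        # Follow parent links back to the start, building each path in order.
        if node == start:
            paths.append([start] + suffix)
            return
        for parent in parents[node]:
            walk(parent, [node] + suffix)

    walk(goal, [])
    return paths


# Hypothetical data: two resources that are each tagged with both "book" and "blog".
adjacency = {
    "book": ["R1", "R2"],
    "blog": ["R1", "R2"],
    "R1": ["book", "blog"],
    "R2": ["book", "blog"],
}
paths = all_shortest_paths(adjacency, "book", "blog")
for path in paths:
    print(" - ".join(path))
```

Both paths have two hops, so both are returned, and together they form the loop book - R1 - blog - R2 - book.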

Just a start…

There are so many other things that we could look at. Use the console below to explore if you are interested in more. As always, the load script is on GitHub.

I hope this gist was interesting for you, and that we will see each other soon.

This gist was created by Rik Van Bruggen

//Neo4j Knowledge Graph
create index on :Resource(name);
create index on :Resource(comments);
//load the resource
// load csv with headers from "file:///neo4jknowledge.csv" as csv
load csv with headers from "https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
merge (r:Resource {name: csv.What, url: csv.Where, comments: csv.Comments});
//load the Authors
// load csv with headers from "file:///neo4jknowledge.csv" as csv
load csv with headers from "https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
merge (a:Author {name: csv.Who});
//connect the Authors to their Resources
// load csv with headers from "file:///neo4jknowledge.csv" as csv
load csv with headers from
"https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
match (a:Author {name: csv.Who}), (r:Resource {name: csv.What})
merge (a)-[:CREATED]->(r);
//Tag the resources
// load csv with headers from "file:///neo4jknowledge.csv" as csv
load csv with headers from
"https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
match (r:Resource {name: csv.What})
with r, csv.Tags as tags
// split the comma-separated Tags cell and trim each entry;
// unwinding the entries one by one also tags resources that carry a single tag
unwind [w in split(tags, ",") | trim(w)] as tag
merge (t:Tag {name: tag})
merge (r)-[:TAGGED_AS]->(t);
//Neo4j Knowledge Graph
create index on :Resource(name);
create index on :Resource(comments);
//All in one statement
load csv with headers from "https://docs.google.com/a/neotechnology.com/spreadsheets/d/1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc/export?format=csv&id=1X6DpFZoS01V1crgRED4dRz2UkbiYR8FJMPf9xey9Lwc&gid=0" as csv
merge (r:Resource {name: csv.What, url: csv.Where, comments: csv.Comments})
with csv
merge (a:Author {name: csv.Who})
with csv
match (r:Resource {name: csv.What}), (a:Author {name: csv.Who})
merge (a)-[:CREATED]->(r)
with r, csv.Tags as tags
// split the comma-separated Tags cell and trim each entry;
// unwinding the entries one by one also tags resources that carry a single tag
unwind [w in split(tags, ",") | trim(w)] as tag
merge (t:Tag {name: tag})
merge (r)-[:TAGGED_AS]->(t);
//Find some Authors, Resources and Tags
MATCH p = ((a:Author)--(r:Resource)--(t:Tag))
return p
limit 25;
//Find some Authors, Resources and Tags connected to Rik or Max
MATCH (t:Tag)--(r:Resource)--(a:Author)
where a.name contains "Rik" or a.name contains "Max"
return t,r,a;
//find some resources and authors
MATCH (r:Resource)--(a:Author)
where a.name contains "Rik" or a.name contains "Max"
return distinct a.name as Author, r.name as Resource, r.url as URL, r.comments as Description
order by Author;
//find some paths between books and blogs
match (t1:Tag {name:"book"}), (t2:Tag {name:"blog"}),
p = allshortestpaths ( (t1)-[*]-(t2))
return p
limit 10;