Graph databases are now one of the core technologies of companies dealing with highly connected data.
Business graphs, social graphs, knowledge graphs, interest graphs and media graphs are frequently in the (technology) news. And for a reason. The graph model represents a very flexible way of handling relationships in your data. And graph databases provide fast and efficient storage, retrieval and querying for it.
Neo4j, the most popular graph database, has proven its ability to deal with massive amounts of highly connected data in many use cases. During the last GraphConnect conference, TomTom and eBay's Shutl demonstrated the value a graph database adds to a company, for instance by providing a fantastic customer experience or by enabling complex route-map editing. Neo4j is developed and supported by Neo Technology, a startup that has grown into a well-respected database company.
For newcomers, here is a short introduction to graph databases and Neo4j.
A graph is a generic data structure composed of nodes (entities) connected by relationships. Sometimes those are also called vertices and edges. In the property graph model, each node and relationship can be labeled and can hold any number of properties describing it.
A graph database is a database optimized for operations on connected data. Graph databases provide high performance suitable for online operations by using dedicated storage structures for both nodes and relationships. They don't need to compute relationships (JOINS) at query time but store them efficiently as part of your data.
Let's take a simple social application as an example, where users follow other users.
A user will be represented as a Node and can have a label and properties. Labels depict various roles for your nodes.
The link between these two users will be represented as a Relationship, which can also have properties and a Type to identify the nature of the relationship. Relationships add semantic meaning to your data.
Looking at the graph shows how natural it is to represent data as a graph and store it in a graph database.
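To make the model a bit more tangible, here is a minimal sketch of the same idea in plain PHP arrays. It only illustrates the property graph model; it is not how Neo4j stores data, and the `since` property and the `describeRelationships()` helper are made up for the example.

```php
<?php
// Plain-PHP illustration of the property graph model (not Neo4j's storage format).

// Nodes: each has an id (the array key), one or more labels, and a property map.
$nodes = [
    1 => ['labels' => ['User'], 'properties' => ['name' => 'Kenneth']],
    2 => ['labels' => ['User'], 'properties' => ['name' => 'Maxime']],
];

// Relationships: a type, a direction (start -> end), and optional properties.
// The "since" property is a hypothetical example value.
$relationships = [
    ['type' => 'FOLLOWS', 'start' => 1, 'end' => 2, 'properties' => ['since' => 2014]],
];

// Read the graph back as sentences of the form "<start> <TYPE> <end>".
function describeRelationships(array $nodes, array $relationships): array
{
    $lines = [];
    foreach ($relationships as $rel) {
        $lines[] = sprintf(
            '%s %s %s',
            $nodes[$rel['start']]['properties']['name'],
            $rel['type'],
            $nodes[$rel['end']]['properties']['name']
        );
    }
    return $lines;
}

$sentences = describeRelationships($nodes, $relationships);
echo implode(PHP_EOL, $sentences), PHP_EOL;
```

Note that the relationship is a first-class record with its own type and properties, rather than something derived by joining two tables.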
Querying a graph may not appear to be straightforward. To make it easy, Neo4j developed Cypher, a declarative graph query language focused on readability and expressiveness for developers, administrators and domain experts.
Being declarative, Cypher focuses on expressing what to retrieve from a graph, not how to retrieve it.
The query language is comprised of several distinct clauses. You can read more details about them in the Neo4j manual.
Here are a few clauses used to read and update the graph:
- MATCH: Finds the "example" graph pattern you provide in the graph and returns one path per found match.
- WHERE: Filters results with predicates, much like in SQL. There are many more predicates in Cypher though, including collection operations and graph matches.
- RETURN: Returns your query result in the form you need, as scalar values, graph elements or paths, or collections or even documents.
- CREATE: Creates graph elements (nodes and relationships) with labels and properties.
- MERGE: Matches existing patterns or creates them. It's a combination of MATCH and CREATE.
Cypher is all about patterns: it describes the visual representation you've already seen as textual patterns (using ASCII art).
It uses round parentheses to depict nodes (like (m:Movie) or (me:Person:Developer)) and arrows (like --> or -[:LOVES]->) for relationships.
Looking at our last graph of users, a query that retrieves Hannah Hilpert and the users following her can be written as follows :
MATCH (user:User {name:'Hannah Hilpert'})<-[:FOLLOWS]-(follower)
RETURN user, follower
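If the arrow notation is new to you, it may help to see what the pattern asks for when written as an ordinary loop. The sketch below is only an in-memory analogy in plain PHP: the edge list and the names other than Hannah Hilpert are made up, and `followersOf()` is a hypothetical helper, not part of any library.

```php
<?php
// Hypothetical sample data: each pair reads "follower FOLLOWS followed".
$follows = [
    ['Alice Doe', 'Hannah Hilpert'],
    ['Bob Roe', 'Hannah Hilpert'],
    ['Hannah Hilpert', 'Alice Doe'],
];

// Imperative counterpart of the (user)<-[:FOLLOWS]-(follower) pattern:
// keep the start of every FOLLOWS edge that ends at the given user.
function followersOf(string $name, array $follows): array
{
    $followers = [];
    foreach ($follows as [$follower, $followed]) {
        if ($followed === $name) {
            $followers[] = $follower;
        }
    }
    return $followers;
}

$followers = followersOf('Hannah Hilpert', $follows);
print_r($followers);
```

With Cypher you only describe the shape of the data you want; how the edges are walked is left to the database.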
After this quick introduction to the Neo4j graph database (more here), let's see how we can use it from PHP.
Neo4j is installed as a database server. An HTTP-API is accessible for manipulating the database and issuing Cypher queries.
If you want to install and run the Neo4j graph database, you can download the latest version here : http://neo4j.com/download/ , extract the archive on your computer and run the ./bin/neo4j start command.
Neo4j comes with a cool visual interface, the Neo4j Browser available at http://localhost:7474. Just try it! There are some guides to get started within the browser, but more information can be found online.
If you don't want to install it on your machine, you can always create a free instance on GrapheneDB, a Neo4j As A Service provider.
Neoxygen is a set of open-source components for the Neo4j ecosystem, most of them in PHP, available on GitHub. I'm currently the main developer; if you are interested in contributing as well, just ping me.
NeoClient is a powerful client for the Neo4j HTTP API, with multi-database support and built-in high-availability management.
The installation is trivial, just add the neoclient dependency to your composer.json file :
{
    "require": {
        "neoxygen/neoclient": "~2.0@dev"
    }
}
And run the update command :
composer update neoxygen/neoclient
Require the composer's autoloader in your application, and configure your connection when building the client :
<?php
require_once __DIR__.'/vendor/autoload.php';
use Neoxygen\NeoClient\ClientBuilder;
$client = ClientBuilder::create()
->addConnection('default', 'http', 'localhost', 7474)
->build();
If you created an instance on GrapheneDB, you need to configure a secure connection with credentials. This is done by appending true (to enable the auth mode) and your credentials to the addConnection method :
<?php
require_once __DIR__.'/vendor/autoload.php';
use Neoxygen\NeoClient\ClientBuilder;
$connUrl = parse_url('http://master.sb02.stations.graphenedb.com:24789/db/data/');
$user = 'master';
$pwd = 's3cr3tP@ssw0rd';
$client = ClientBuilder::create()
->addConnection('default', $connUrl['scheme'], $connUrl['host'], $connUrl['port'], true, $user, $pwd)
->build();
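As a quick check of what ends up in the connection arguments, here is what PHP's parse_url() returns for that kind of URL. This is a small standalone sketch; the fallback to 7474 for a URL without an explicit port is my own assumption, based on the default local port used earlier.

```php
<?php
// The example URL from above; parse_url() splits it into named components.
$connUrl = parse_url('http://master.sb02.stations.graphenedb.com:24789/db/data/');

// 'port' comes back as an integer, 'scheme' and 'host' as strings.
printf("%s://%s:%d\n", $connUrl['scheme'], $connUrl['host'], $connUrl['port']);

// parse_url() omits keys that are absent from the URL, so a URL without an
// explicit port needs a fallback (assumed here: 7474, the local default).
$local = parse_url('http://localhost/db/data/');
$port = isset($local['port']) ? $local['port'] : 7474;
```

Guarding against the missing key avoids an undefined index notice when the URL does not carry a port.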
You now have full access to your Neo4j database with the client connecting to the HTTP API.
The library provides handy methods to access the different endpoints. However, the most frequently used method is sending a Cypher query.
Handling graph results in a raw JSON response is a bit cumbersome. That's why the library comes with a handy result formatter that transforms the response into node and relationship objects. The formatter is disabled by default; you can enable it by adding one line of code to your client building process :
$client = ClientBuilder::create()
->addConnection('default', 'http', 'localhost', 7474)
->setAutoFormatResponse(true)
->build();
We're going to build a set of User nodes and FOLLOWS relationships incrementally. Then we'll be able to query friend-of-a-friend information to provide friendship suggestions.
The query to create a User is the following :
CREATE (user:User {name:'Kenneth'}) RETURN user
The query is composed of 5 parts :
- The CREATE clause (in blue), indicating we want to create a new element.
- The identifier (in orange), used to identify your node in the query.
- The label (in red), used to add the user to the User labelled group.
- The node properties (in green), which are specific to that node.
- The RETURN clause, indicating what you want to return, here the created user.
You can also try to run that query in the Neo4j Browser.
No need to wait, let's create this user with the client :
$query = 'CREATE (user:User {name:"Kenneth"}) RETURN user';
$result = $client->sendCypherQuery($query)->getResult();
You can visualize the created node in your browser (open the starred tab and run "Get some data"), or get the graph result with the client.
$user = $result->getSingleNode();
$name = $user->getProperty('name');
We will do the same for another user, this time with query parameters. Parameters are passed along with the query, which allows Neo4j to cache the query execution plan and makes subsequent identical queries faster :
$query = 'CREATE (user:User {name: {name} }) RETURN user';
$parameters = array('name' => 'Maxime');
$client->sendCypherQuery($query, $parameters);
As you can see, parameters are embedded in {} and passed in an array of parameters as the second argument of the sendCypherQuery method.
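The point of parameters is that the query text and the values travel separately. As a rough sketch, and assuming the client talks to the transactional HTTP endpoint of Neo4j 2.x (POST /db/data/transaction/commit), the request body could be assembled as below. Whether NeoClient does exactly this internally is an assumption on my side, and `buildCypherPayload()` is a hypothetical helper.

```php
<?php
// Build a request body in the shape expected by Neo4j 2.x's transactional
// endpoint. buildCypherPayload() is a hypothetical helper for illustration.
function buildCypherPayload(string $query, array $parameters = []): string
{
    $statement = ['statement' => $query];
    // Only add the key when there are parameters: an empty PHP array would
    // be encoded as a JSON list ([]) instead of a JSON object.
    if ($parameters) {
        $statement['parameters'] = $parameters;
    }
    return json_encode(['statements' => [$statement]]);
}

$query = 'CREATE (user:User {name: {name} }) RETURN user';
$payload = buildCypherPayload($query, ['name' => 'Maxime']);
echo $payload, PHP_EOL;
```

Because the statement string stays identical from one call to the next, only the parameters map changes between requests, which is what makes the plan reusable.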
If you look at the graph now, you'll see the two User nodes, but they feel quite alone :( , no ?
In order to create the relationships between our nodes, we'll use Cypher again.
$query = 'MATCH (user1:User {name:{name1}}), (user2:User {name:{name2}}) CREATE (user1)-[:FOLLOWS]->(user2)';
$params = ['name1' => 'Kenneth', 'name2' => 'Maxime'];
$client->sendCypherQuery($query, $params);
Some explanations :
We first match the existing users named Kenneth and Maxime (names provided as parameters), and then we create a FOLLOWS relationship between the two.
Kenneth will be the start node of the FOLLOWS relationship and Maxime the end node.
The relationship type will be FOLLOWS.
Looking at the graph again shows that the relationship has been created.
Manually writing all the creation statements for a set of 100 users and the relationships would be boring.
I want to introduce a very useful tool called Graphgen (one of the Neoxygen components) for generating graph data with ease.
It uses a specification that is very close to Cypher to describe the graph you want.
Here we're going to create a set of 50 users and the corresponding FOLLOWS relationships.
So go to http://graphgen.neoxygen.io , copy and paste the following pattern in the editor area, and click on Generate :
(user:User {login: userName, firstname: firstName, lastname: lastName} *50)-[:FOLLOWS *n..n]->(user)
You can see that it automatically generates a graph with 50 users, the relationships, and realistic values for login, firstname and lastname. Impressive, isn't it?
Let's import this graph into our local graph database: click on Populate your database and use the default settings.
In no time, the database will be populated with the data.
If you open the Neo4j browser, and run "Get some data" again, you can see all the user nodes and their relationships.
Getting suggestions with Neo4j is simple: match one user, follow the FOLLOWS relationships to the users they follow, then, for each of those, find the users they follow in turn. Return only those that the first user does not already follow, excluding the user for whom we are looking for suggestions.
In a real application, there would be a login system and each user would only be allowed to see the users they follow. For the sake of this blog post, which is an introduction to Neo4j, you'll be able to play with all the users.
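Before handing this over to the database, here is the same traversal spelled out by hand in plain PHP, as a sketch only. The adjacency list and its user names (apart from the idea of a user called Francisco) are made up, and `suggest()` is a hypothetical helper that counts how many two-step paths lead to each candidate.

```php
<?php
// Hypothetical adjacency list: user => list of users they follow.
$follows = [
    'francisco' => ['anna', 'ben'],
    'anna'      => ['carla', 'ben', 'francisco'],
    'ben'       => ['carla', 'dave'],
    'carla'     => [],
    'dave'      => [],
];

// Friend-of-a-friend suggestions, ranked by the number of paths leading to them.
function suggest(string $user, array $follows, int $limit = 10): array
{
    $already = $follows[$user] ?? [];
    $counts = [];
    foreach ($already as $followed) {
        foreach ($follows[$followed] ?? [] as $candidate) {
            // Skip the user themselves and anyone they already follow.
            if ($candidate === $user || in_array($candidate, $already, true)) {
                continue;
            }
            $counts[$candidate] = ($counts[$candidate] ?? 0) + 1;
        }
    }
    arsort($counts); // highest number of occurrences first
    return array_slice($counts, 0, $limit, true);
}

$suggestions = suggest('francisco', $follows);
print_r($suggestions);
```

Here carla is reachable through both anna and ben, so she ranks above dave, who is reachable through ben only.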
Let's write it in Cypher :
$query = 'MATCH (user:User {firstname: {firstname}})-[:FOLLOWS]->(followed)-[:FOLLOWS]->(suggestion)
WHERE user <> suggestion
AND NOT (user)-[:FOLLOWS]->(suggestion)
RETURN user, suggestion, count(*) as occurrence
ORDER BY occurrence DESC
LIMIT 10';
$params = ['firstname' => 'Francisco'];
$result = $client->sendCypherQuery($query, $params)->getResult();
$suggestions = $result->get('suggestion'); // Returns a set of nodes
If you run this query in the Neo4j Browser, you'll get your first matched user and the suggestions.
- You've discovered graph databases and Neo4j
- You learned the basics of the Cypher Query Language
- You've seen how to connect to and run queries on a Neo4j database with PHP
Now we want to build a social network application using Silex (the PHP micro-framework based on the Symfony components).
I'll use Silex, Twig, Bootstrap and NeoClient to build the application.
Create a directory for the app; I named mine spsocial.
Add these lines to your composer.json and run composer install to install the dependencies :
{
    "require": {
        "silex/silex": "~1.1",
        "twig/twig": ">=1.8,<2.0-dev",
        "symfony/twig-bridge": "~2.3",
        "neoxygen/neoclient": "~2.0.0"
    },
    "autoload": {
        "psr-4": {
            "Ikwattro\\SocialNetwork\\": "src"
        }
    }
}
You can download and install Bootstrap to the web/assets folder of your project.
You can find the bootstrap demo app here as well: https://github.com/ikwattro/social-network
We need to configure Silex and declare NeoClient so that it is available in the Silex application. Create an index.php file in the web/ folder of your project :
<?php
require_once __DIR__.'/../vendor/autoload.php';
use Neoxygen\NeoClient\ClientBuilder;
$app = new Silex\Application();
$app['neo'] = $app->share(function(){
$client = ClientBuilder::create()
->addDefaultLocalConnection()
->setAutoFormatResponse(true)
->build();
return $client;
});
$app->register(new Silex\Provider\TwigServiceProvider(), array(
'twig.path' => __DIR__.'/../src/views',
));
$app->register(new Silex\Provider\MonologServiceProvider(), array(
'monolog.logfile' => __DIR__.'/../logs/social.log'
));
$app->register(new Silex\Provider\UrlGeneratorServiceProvider());
$app->get('/', 'Ikwattro\\SocialNetwork\\Controller\\WebController::home')
->bind('home');
$app->run();
Twig is configured to have its template files located in the src/views folder.
A home route pointing to / is registered and configured to use the WebController we will create later.
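A word on the neo service definition: wrapping the factory in share() means the client is built lazily, on first access, and the same instance is returned afterwards. The sketch below shows that memoization idea in plain PHP. It is not Silex's or Pimple's actual implementation, and `shareOnce()` is a hypothetical helper.

```php
<?php
// Wrap a factory so it runs only once; later calls return the cached object.
function shareOnce(callable $factory): callable
{
    $instance = null;
    return function () use ($factory, &$instance) {
        if ($instance === null) {
            $instance = $factory();
        }
        return $instance;
    };
}

$built = 0;
$neo = shareOnce(function () use (&$built) {
    $built++;               // count how many times the factory really runs
    return new stdClass();  // stand-in for the configured client
});

$first = $neo();
$second = $neo();
var_dump($first === $second, $built);
```

Every controller that asks for the service therefore works with one and the same client, and nothing is built if no route needs it.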
The application structure should look like this :
Note that I used Bower to install Bootstrap here, but it is up to you what you want to use.
The next step is to create our base layout with a content block that our child Twig templates will override with their own content. I'll take the default Bootstrap theme with a navbar on top :
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="">
<meta name="author" content="">
<title>My first Neo4j application</title>
<!-- Bootstrap core CSS -->
<link href="{{ app.request.basepath }}/assets/bootstrap/dist/css/bootstrap.min.css" rel="stylesheet">
<!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<style>
body { padding-top: 70px; }
</style>
</head>
<body>
<div class="navbar navbar-inverse navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" id="collbut" class="navbar-toggle collapsed" data-toggle="collapse" data-target=".navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="#">My first Neo4j application</a>
</div>
</div>
</div>
<div class="container-fluid">
{% block content %}
{% endblock content %}
</div>
</body>
</html>
So far, we have Neo4j available in the application, our base template is created and we want to list all users on the home page.
We can achieve this in two steps :
- Create our home controller action and retrieve users from Neo4j
- Pass the list of users to the template and list them
<?php
namespace Ikwattro\SocialNetwork\Controller;
use Silex\Application;
use Symfony\Component\HttpFoundation\Request;
class WebController
{
public function home(Application $application, Request $request)
{
$neo = $application['neo'];
$q = 'MATCH (user:User) RETURN user';
$result = $neo->sendCypherQuery($q)->getResult();
$users = $result->get('user');
return $application['twig']->render('index.html.twig', array(
'users' => $users
));
}
}
The controller shows the process: we retrieve the neo service and issue a Cypher query to retrieve all the users.
The users collection is then passed to the index.html.twig template.
{% extends "layout.html.twig" %}
{% block content %}
<ul class="list-unstyled">
{% for user in users %}
<li>{{ user.property('firstname') }} {{ user.property('lastname') }}</li>
{% endfor %}
</ul>
{% endblock %}
The template is very light: it extends our base layout and adds an unordered list with the users' first and last names in the inherited content block.
Start the built-in PHP server and admire your work :
cd spsocial/web
php -S localhost:8000
open localhost:8000
Let's say now that we want to click on a user and be presented with their details and the users they follow.
Step 1 : Create a route in index.php
$app->get('/user/{login}', 'Ikwattro\\SocialNetwork\\Controller\\WebController::showUser')
->bind('show_user');
Step 2 : Create the showUser controller action
public function showUser(Application $application, Request $request, $login)
{
$neo = $application['neo'];
$q = 'MATCH (user:User) WHERE user.login = {login}
OPTIONAL MATCH (user)-[:FOLLOWS]->(f)
RETURN user, collect(f) as followed';
$p = ['login' => $login];
$result = $neo->sendCypherQuery($q, $p)->getResult();
$user = $result->get('user');
$followed = $result->get('followed');
if (null === $user) {
$application->abort(404, sprintf('The user %s was not found', $login));
}
return $application['twig']->render('show_user.html.twig', array(
'user' => $user,
'followed' => $followed
));
}
The workflow is similar to any other application: you try to find the user based on the login. If it does not exist you show a 404 error page, otherwise you pass the user data to the template.
Step 3 : Create the show_user template file
{% extends "layout.html.twig" %}
{% block content %}
<h1>User information</h1>
<h2>{{ user.property('firstname') }} {{ user.property('lastname') }}</h2>
<h3>{{ user.property('login') }}</h3>
<hr/>
<div class="row">
<div class="col-sm-6">
<h4>User <span class="label label-info">{{ user.property('login') }}</span> follows :</h4>
<ul class="list-unstyled">
{% for follow in followed %}
<li>{{ follow.property('login') }} ( {{ follow.property('firstname') }} {{ follow.property('lastname') }} )</li>
{% endfor %}
</ul>
</div>
</div>
{% endblock %}
Step 4 : Refactor the list of users in the homepage to show links to their profile
{% for user in users %}
<li>
<a href="{{ path('show_user', { login: user.property('login') }) }}">
{{ user.property('firstname') }} {{ user.property('lastname') }}
</a>
</li>
{% endfor %}
Refresh the homepage and click on any user to show their profile and the list of followed users.
The next step is to add suggestions to the profile.
We need to slightly extend our Cypher query in the controller by adding an OPTIONAL MATCH to find suggestions based on the second-degree network.
The OPTIONAL prefix causes a MATCH to return a row even if there were no matches, with the non-resolved parts set to null (much like an outer JOIN).
As we potentially get multiple paths for each friend-of-a-friend (fof), we need to return distinct results in order to avoid duplicates in our list (collect is an aggregation operation that collects values into an array) :
The updated controller :
public function showUser(Application $application, Request $request, $login)
{
$neo = $application['neo'];
$q = 'MATCH (user:User) WHERE user.login = {login}
OPTIONAL MATCH (user)-[:FOLLOWS]->(f)
OPTIONAL MATCH (f)-[:FOLLOWS]->(fof)
WHERE user <> fof
AND NOT (user)-[:FOLLOWS]->(fof)
RETURN user, collect(distinct f) as followed, collect(distinct fof) as suggestions';
$p = ['login' => $login];
$result = $neo->sendCypherQuery($q, $p)->getResult();
$user = $result->get('user');
$followed = $result->get('followed');
$suggestions = $result->get('suggestions');
if (null === $user) {
$application->abort(404, sprintf('The user %s was not found', $login));
}
return $application['twig']->render('show_user.html.twig', array(
'user' => $user,
'followed' => $followed,
'suggestions' => $suggestions
));
}
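To see why duplicates show up at all, remember that the two OPTIONAL MATCH clauses produce one row per (f, fof) pair before anything is aggregated, so the same node can appear in several rows. The sketch below mimics that with hypothetical rows in plain PHP; `collect()` and `collectDistinct()` are made-up helpers that imitate the Cypher aggregations, including the fact that null values are skipped.

```php
<?php
// Hypothetical result rows before aggregation: one row per (f, fof) pair.
// null stands for an OPTIONAL MATCH that found nothing.
$rows = [
    ['f' => 'anna', 'fof' => 'carla'],
    ['f' => 'anna', 'fof' => 'dave'],
    ['f' => 'ben',  'fof' => 'carla'],
    ['f' => 'eve',  'fof' => null],
];

// Imitates collect(x): gather every non-null value, duplicates included.
function collect(array $rows, string $column): array
{
    $values = array_column($rows, $column);
    return array_values(array_filter($values, function ($v) {
        return $v !== null;
    }));
}

// Imitates collect(distinct x): same, with duplicates removed.
function collectDistinct(array $rows, string $column): array
{
    return array_values(array_unique(collect($rows, $column)));
}

$allFof = collect($rows, 'fof');
$suggestions = collectDistinct($rows, 'fof');
$followed = collectDistinct($rows, 'f');
print_r($suggestions);
```

Without the distinct step, carla would be listed twice, once for each path that reaches her.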
The updated template :
{% extends "layout.html.twig" %}
{% block content %}
<h1>User information</h1>
<h2>{{ user.property('firstname') }} {{ user.property('lastname') }}</h2>
<h3>{{ user.property('login') }}</h3>
<hr/>
<div class="row">
<div class="col-sm-6">
<h4>User <span class="label label-info">{{ user.property('login') }}</span> follows :</h4>
<ul class="list-unstyled">
{% for follow in followed %}
<li>{{ follow.property('login') }} ( {{ follow.property('firstname') }} {{ follow.property('lastname') }} )</li>
{% endfor %}
</ul>
</div>
<div class="col-sm-6">
<h4>Suggestions for user <span class="label label-info">{{ user.property('login') }}</span> </h4>
<ul class="list-unstyled">
{% for suggested in suggestions %}
<li>{{ suggested.property('login') }} ( {{ suggested.property('firstname') }} {{ suggested.property('lastname') }} )</li>
{% endfor %}
</ul>
</div>
</div>
{% endblock %}
You can immediately explore the suggestions in your application :
In order to connect to a suggested user, we'll add a small POST form to each suggested user, containing both users as hidden fields. We'll also create the corresponding route and controller action.
Creating the route :
#web/index.php
$app->post('/relationship/create', 'Ikwattro\\SocialNetwork\\Controller\\WebController::createRelationship')
->bind('relationship_create');
The controller action :
public function createRelationship(Application $application, Request $request)
{
$neo = $application['neo'];
$user = $request->get('user');
$toFollow = $request->get('to_follow');
$q = 'MATCH (user:User {login: {login}}), (target:User {login:{target}})
MERGE (user)-[:FOLLOWS]->(target)';
$p = ['login' => $user, 'target' => $toFollow];
$neo->sendCypherQuery($q, $p);
$redirectRoute = $application['url_generator']->generate('show_user', array('login' => $user));
return $application->redirect($redirectRoute);
}
Nothing unusual here: we MATCH the start user node and the target user node, and then we MERGE the corresponding FOLLOWS relationship.
We use MERGE on the relationship to avoid duplicate entries.
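The "match or create" behaviour is easy to picture with an in-memory stand-in. The following plain PHP sketch is only an analogy for what MERGE does on a single relationship; `mergeFollows()` and the way relationships are keyed are made up for the example.

```php
<?php
// In-memory stand-in for MERGE (user)-[:FOLLOWS]->(target):
// create the relationship only when an identical one does not exist yet.
function mergeFollows(array &$relationships, string $user, string $target): bool
{
    $key = $user . '->' . $target;
    if (isset($relationships[$key])) {
        return false; // matched an existing relationship, nothing created
    }
    $relationships[$key] = ['type' => 'FOLLOWS', 'start' => $user, 'end' => $target];
    return true; // created a new relationship
}

$relationships = [];
$createdFirst = mergeFollows($relationships, 'kenneth', 'maxime');
$createdAgain = mergeFollows($relationships, 'kenneth', 'maxime'); // e.g. a double form submit
echo count($relationships), PHP_EOL;
```

Submitting the form twice therefore leaves a single FOLLOWS relationship, whereas a plain CREATE would add a second one.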
The template :
<div class="col-sm-6">
<h4>Suggestions for user <span class="label label-info">{{ user.property('login') }}</span> </h4>
<ul class="list-unstyled">
{% for suggested in suggestions %}
<li>
{{ suggested.property('login') }} ( {{ suggested.property('firstname') }} {{ suggested.property('lastname') }} )
<form method="POST" action="{{ path('relationship_create') }}">
<input type="hidden" name="user" value="{{ user.property('login') }}"/>
<input type="hidden" name="to_follow" value="{{ suggested.property('login') }}"/>
<button type="submit" class="btn btn-success btn-sm">Follow</button>
</form>
<hr/>
</li>
{% endfor %}
</ul>
</div>
You can now click on the Follow button of the suggested user you want to follow :
Removing relationships :
The workflow for removing relationships is pretty much the same as for adding new ones: create a route, a controller action and adapt the template :
The route :
#web/index.php
$app->post('/relationship/remove', 'Ikwattro\\SocialNetwork\\Controller\\WebController::removeRelationship')
->bind('relationship_remove');
The controller action :
public function removeRelationship(Application $application, Request $request)
{
$neo = $application['neo'];
$user = $request->get('login');
$toRemove = $request->get('to_remove');
$q = 'MATCH (user:User {login: {login}} ), (badfriend:User {login: {target}} )
MATCH (user)-[follows:FOLLOWS]->(badfriend)
DELETE follows';
$p = ['login' => $user, 'target' => $toRemove];
$neo->sendCypherQuery($q, $p);
$redirectRoute = $application['url_generator']->generate('show_user', array('login' => $user));
return $application->redirect($redirectRoute);
}
You can see here that I used MATCH to find the relationship between the two users, and I added an identifier follows to the relationship to be able to DELETE it.
The template :
<h4>User <span class="label label-info">{{ user.property('login') }}</span> follows :</h4>
<ul class="list-unstyled">
{% for follow in followed %}
<li>
{{ follow.property('login') }} ( {{ follow.property('firstname') }} {{ follow.property('lastname') }} )
<form method="POST" action="{{ path('relationship_remove') }}">
<input type="hidden" name="login" value="{{ user.property('login') }}"/>
<input type="hidden" name="to_remove" value="{{ follow.property('login') }}"/>
<button type="submit" class="btn btn-sm btn-warning">Remove relationship</button>
</form>
<hr/>
</li>
{% endfor %}
</ul>
You can now click the Remove relationship button under each followed user :
Graph databases are a perfect fit for highly connected data, and using them with PHP and NeoClient is easy. Cypher is a convenient query language you will quickly love, because it makes it possible to query your graph in a natural way.
There is a lot to gain from using graph databases for real-world data. I invite you to discover more by reading the manual at http://neo4j.com/docs/stable/ , having a look at the use cases and examples supplied by Neo4j users, and following @Neo4j on Twitter.