Tuesday, 27 January 2015

Fun with Graphs

Some years ago I worked for a graph database company, so one of my side projects at Couchbase is an implementation of TinkerPop's Blueprints graph database API (http://www.tinkerpop.com) on top of the Couchbase document database.

The source code of this project was published here:
Note that the implementation has not yet been published as a stable version, so work is still in progress to test, stabilize, and improve it.

Anyway: Let's have a bit of fun with graphs!

The first thing you want to do is configure the connection to your Couchbase cluster. If you check out the source code, you will find a 'couchbase.properties' file which needs to be adapted to your environment:

cb.con.hosts=192.168.56.104,192.168.56.105,192.168.56.106
cb.con.port=8091
cb.con.bucket.name=graph
cb.con.bucket.pwd=test
cb.timeout.op=30000
cb.admin.user=couchbase
cb.admin.pwd=couchbase
cb.view.designdoc=graph_views
cb.view.alledges=all_edges
cb.view.allvertices=all_vertices
The configuration is more or less self-explanatory:

  • To establish a connection to your Couchbase cluster, just provide some server nodes.
Side note: You don't need to list all of the server nodes of your cluster. Best practice is to provide a subset. So if you have a cluster of, e.g., 5 nodes, it would be sufficient to provide, for instance, 2 nodes for connection purposes.
  • The bucket used here is called 'graph' and is secured with the password 'test'.
  • Couchbase Views are used to access all vertices and all edges. The implementation includes methods that let you create the Views programmatically. The 'resources' directory contains their JS files, and their names are configured via 'couchbase.properties'.
Side note: Keep in mind that you would usually not access all edges or vertices. The usual way to interact with a graph database is to start with a set of vertices (or a single vertex) and then traverse to other vertices by following edges.
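
To make the settings above a bit more tangible, here is a minimal sketch of how such a file could be read with the JDK's java.util.Properties. The class GraphConfigSketch and its methods are hypothetical and only serve as an illustration; this is not how CBGraph itself necessarily loads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the project: reads the connection
// settings shown above with the JDK's java.util.Properties.
final class GraphConfigSketch {

    // Parses properties text; in a real setup you would load the file instead.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return props;
    }

    // Splits the comma-separated 'cb.con.hosts' value into single host names.
    static List<String> hosts(Properties props) {
        String raw = props.getProperty("cb.con.hosts", "").trim();
        return Arrays.asList(raw.split("\\s*,\\s*"));
    }

    // Reads 'cb.con.port'; the fallback of 8091 is an assumption of this sketch.
    static int port(Properties props) {
        return Integer.parseInt(props.getProperty("cb.con.port", "8091").trim());
    }

    public static void main(String[] args) {
        Properties props = parse(
            "cb.con.hosts=192.168.56.104,192.168.56.105,192.168.56.106\n"
            + "cb.con.port=8091\n"
            + "cb.con.bucket.name=graph\n");
        System.out.println("Hosts:  " + hosts(props));
        System.out.println("Port:   " + port(props));
        System.out.println("Bucket: " + props.getProperty("cb.con.bucket.name"));
    }
}
```

Listing only a subset of the nodes, as suggested in the side note, simply means a shorter 'cb.con.hosts' value.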

All you need to do now is create an instance of CBGraph:

Graph graph = new CBGraph();
In the next step we want to add a new Vertex. So let's add the vertex 'Bart' and then set some properties on it.

Vertex tavwp_bart = graph.addVertex("tavwp_bart");

tavwp_bart.setProperty("first_name", "Bart");
tavwp_bart.setProperty("last_name", "Simpson");
tavwp_bart.setProperty("city", "Springfield");
tavwp_bart.setProperty("age", 8);
tavwp_bart.setProperty("is_student", true);
Storing only vertices would not add much benefit. The real power of graphs lies in data connectivity. So let's add two new vertices and then an edge between them. Let's assume that there are two persons, 'Moe' and 'Barney', and that 'Barney' is a guest of 'Moe'.

Vertex v_tae_moe = graph.addVertex("tae_moe");
Vertex v_tae_barney = graph.addVertex("tae_barney");

Edge e_1 = graph.addEdge("e_1", v_tae_barney, v_tae_moe, "guest of");
What you can do now is start at the vertex 'Moe' in order to find all of his guests. Because the edge points from 'Barney' to 'Moe', it is an incoming edge from Moe's point of view:

Iterable<Vertex> guests = v_tae_moe.getVertices(Direction.IN, "guest of");
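
The direction argument is easy to mix up. In Blueprints, addEdge(id, outVertex, inVertex, label) creates an edge that leaves the first vertex and arrives at the second one, so the 'guest of' edge is outgoing for 'Barney' and incoming for 'Moe'. The following in-memory sketch illustrates this; DirectionSketch is a hypothetical stand-in and uses neither Blueprints nor Couchbase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in to illustrate edge direction.
final class DirectionSketch {

    // A directed, labeled edge: it leaves 'from' and arrives at 'to'.
    static final class LabeledEdge {
        final String from;
        final String to;
        final String label;

        LabeledEdge(String from, String to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    private final List<LabeledEdge> edges = new ArrayList<>();

    void addEdge(String from, String to, String label) {
        edges.add(new LabeledEdge(from, to, label));
    }

    // Vertices reached by following outgoing edges with the given label.
    List<String> out(String vertex, String label) {
        List<String> result = new ArrayList<>();
        for (LabeledEdge e : edges) {
            if (e.from.equals(vertex) && e.label.equals(label)) {
                result.add(e.to);
            }
        }
        return result;
    }

    // Vertices that point at 'vertex' via incoming edges with the given label.
    List<String> in(String vertex, String label) {
        List<String> result = new ArrayList<>();
        for (LabeledEdge e : edges) {
            if (e.to.equals(vertex) && e.label.equals(label)) {
                result.add(e.from);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        DirectionSketch g = new DirectionSketch();
        g.addEdge("tae_barney", "tae_moe", "guest of");
        System.out.println(g.in("tae_moe", "guest of"));  // [tae_barney]
        System.out.println(g.out("tae_moe", "guest of")); // []
    }
}
```
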
Cool? Just try it out on your own!

If you do so, you can then investigate how the data was stored by connecting to Couchbase's Admin UI (e.g. http://192.168.56.104:8091). Because Couchbase is a document database and key-value store, the data is stored as key-value pairs whereby the value is a JSON document.

The key of a vertex looks like this:

v_tavwp_bart
The value of the vertex (as a JSON String) would be:

{\"edges\":{\"in\":{},\"out\":{}},\"type\":\"vertex\",\"props\":{\"city\":\"Springfield\",\"last_name\":\"Simpson\",\"first_name\":\"Bart\",\"is_student\":true,\"age\":8}}

The vertex shown above has no incoming or outgoing edges. An edge, in turn, is stored under a key like this one:

e_tae_barney->|guest of|->tae_moe
As you can see, the key uses a human-readable pattern: there is an edge from 'Barney' to 'Moe', and the relationship has the label 'guest of'.
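
The key pattern can be sketched in a few lines of Java. The 'v_' and 'e_' prefixes and the '->|label|->' separator are inferred from the example keys in this post; the class GraphKeysSketch and its method names are hypothetical and not taken from the actual implementation.

```java
// Hypothetical helper that mirrors the key patterns shown in this post.
final class GraphKeysSketch {

    // Vertex keys appear to be the vertex id with a 'v_' prefix.
    static String vertexKey(String id) {
        return "v_" + id;
    }

    // Edge keys appear to encode source id, label, and target id.
    static String edgeKey(String fromId, String label, String toId) {
        return "e_" + fromId + "->|" + label + "|->" + toId;
    }

    public static void main(String[] args) {
        System.out.println(vertexKey("tavwp_bart"));
        // v_tavwp_bart
        System.out.println(edgeKey("tae_barney", "guest of", "tae_moe"));
        // e_tae_barney->|guest of|->tae_moe
    }
}
```

A side effect of such a pattern is that an edge can be looked up directly by key once both endpoints and the label are known.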

The value of an edge provides the same information as JSON. Here is the value of another edge, a 'son of' relationship from 'Bart' to 'Homer':

{\"to\":\"v_tae_homer\",\"label\":\"son of\",\"from\":\"v_tae_bart\",\"type\":\"edge\"}

A vertex with an edge lists the edge's key under its label. The target vertex of the 'son of' edge, which has one incoming edge, looks like this:

{\"edges\":{\"in\":{\"son of\":[\"e_tae_bart->|son of|->tae_homer\"]},\"out\":{}},\"type\":\"vertex\",\"props\":{}}
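
The 'edges' section of a vertex document can be mirrored with plain Java maps. The sketch below registers an edge key under its label for both endpoints. That the source vertex records the key under 'out' is an assumption, since only the 'in' side is shown above; the class VertexDocSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the "edges" section of a vertex document.
final class VertexDocSketch {

    // Creates an empty adjacency section: {"in": {}, "out": {}}.
    static Map<String, Map<String, List<String>>> emptyEdges() {
        Map<String, Map<String, List<String>>> edges = new LinkedHashMap<>();
        edges.put("in", new LinkedHashMap<>());
        edges.put("out", new LinkedHashMap<>());
        return edges;
    }

    // Registers an edge key under the given direction ("in" or "out") and label.
    static void register(Map<String, Map<String, List<String>>> edges,
                         String direction, String label, String edgeKey) {
        edges.get(direction)
             .computeIfAbsent(label, k -> new ArrayList<>())
             .add(edgeKey);
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> bart = emptyEdges();
        Map<String, Map<String, List<String>>> homer = emptyEdges();
        String key = "e_tae_bart->|son of|->tae_homer";

        register(bart, "out", "son of", key);  // assumption: source keeps it under "out"
        register(homer, "in", "son of", key);  // matches the document shown above

        System.out.println(homer);
        // {in={son of=[e_tae_bart->|son of|->tae_homer]}, out={}}
    }
}
```

Keeping the edge keys inside the vertex document means that a traversal starting at a vertex needs no View: it can fetch the listed edge documents directly by key.
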
Further examples and details will follow in future blog posts, but I hope that this one already gave you some insight into the Graph API on top of Couchbase.

