Do more with Python: Creating a graph application with Python, Neo4j, Gephi, and Linkurious

Here is how to build a neat app with graph visualization of Python and related topics from Packt and StackOverflow, combining Gephi, Linkurious, and Neo4j.



Publish It!

So here’s the workflow I used to get the Python topic graph out of Neo4j and onto the web.

-Use Py2neo to grab the subgraph of content and topics pertinent to Python, as described above

-Add to this some other topics linked to the same books to give a fuller picture of the Python “world”

-Add in topic-topic edges and product-product edges to show the full breadth of connections observed in the data

-Export all the nodes and edges to CSV files

-Import the node and edge tables into Gephi.

The reason I’m using Gephi as a middle step is so that I can fiddle with the visualisation in Gephi until it looks perfect. The layout plugin in Sigma is good, but this way the graph is presentable as soon as the page loads, the communities are much clearer, and I’m not putting undue strain on browsers across the world!

-The layout of the graph has been achieved using a number of plugins. Instead of using the pre-installed ForceAtlas layouts, I’ve used the OpenOrd layout, which I feel really shows off the communities of a large graph. There’s a really interesting and technical presentation about how this layout works here.

-Export the graph into GEXF format, having applied some partition and ranking functions to make it clearer and more appealing.
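The CSV export step in the workflow above can be sketched in Python as follows. This is a minimal illustration, not the script used for this article: the sample nodes and edges, the `write_gephi_csv` helper, the `Kind` column, and the file names are all hypothetical, and it assumes the subgraph has already been pulled out of Neo4j into plain Python lists. The `Id`/`Label` and `Source`/`Target` headers are the column names Gephi's spreadsheet importer looks for in node and edge tables.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the subgraph pulled from Neo4j.
nodes = [
    {"Id": "t1", "Label": "Python", "Kind": "topic"},
    {"Id": "t2", "Label": "Django", "Kind": "topic"},
    {"Id": "p1", "Label": "Example Python Book", "Kind": "product"},
]
edges = [
    {"Source": "p1", "Target": "t1", "Weight": 1},  # product-topic edge
    {"Source": "p1", "Target": "t2", "Weight": 1},  # product-topic edge
    {"Source": "t1", "Target": "t2", "Weight": 1},  # topic-topic edge
]


def write_gephi_csv(rows, path, fieldnames):
    """Write a list of dicts to a CSV file with the given header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Write both tables to a temporary directory (swap in any output folder).
out_dir = tempfile.mkdtemp()
nodes_path = os.path.join(out_dir, "nodes.csv")
edges_path = os.path.join(out_dir, "edges.csv")
write_gephi_csv(nodes, nodes_path, ["Id", "Label", "Kind"])
write_gephi_csv(edges, edges_path, ["Source", "Target", "Weight"])
```

The two files can then be loaded into Gephi as a nodes table and an edges table respectively.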

Now it’s all down to Linkurious and its various plugins! You can explore the source code of the final page to see all the details, but here I’ll give an overview of the different plugins I’ve used for the different parts of the visualisation:

First, instantiate the graph object, pointing it to a container (note the CSS of the container; without it, the graph won’t display properly):

<style type="text/css">
  #container {
    max-width: 1500px;
    height: 850px;
    margin: auto;
    background-color: #E5E5E5;
  }
</style>

<div id="container"></div>

<script>
  var s = new sigma({
    container: 'container',
    renderer: {
      container: document.getElementById('container'),
      type: 'canvas'
    },
    settings: {
      // sigma settings go here
    }
  });
</script>
-sigma.parsers.gexf – Used for (trivially!) importing a GEXF file into a sigma instance.
sigma.parsers.gexf(
  'static/data/Graph1.gexf',
  s,
  function(s) {
    // Callback executed once the data is loaded; use this to set up
    // any aspects of the app which depend on the data
  }
);

-sigma.plugins.filter – Adds the ability to very simply hide nodes/edges based on a callback function. This powers the filtering widget on the page.

<input id="min-degree" type="range" min="0" max="0" value="0">

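// `filter` is assumed to be the filter plugin instance created for `s` earlier
function applyMinDegreeFilter(e) {
  var v = e.target.value;
  // Show the current slider value next to the slider
  $('#min-degree-val').text(v);
  filter
    .undo('min-degree')
    .nodesBy(
      function(n, options) {
        // Keep only nodes whose degree meets the threshold
        return this.graph.degree(n.id) >= options.minDegreeVal;
      }, {
        minDegreeVal: +v
      },
      'min-degree'
    )
    .apply();
}
$('#min-degree').change(applyMinDegreeFilter);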

-sigma.plugins.locate – Adds the ability to zoom in on a single node or collection of nodes. Very useful if you’re filtering a very large initial graph.

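// `locate` is assumed to be the locate plugin instance created for `s` earlier
function locateNode(nid) {
  if (nid == '') {
    // No node selected: reset the view to the whole graph
    locate.center(1);
  }
  else {
    locate.nodes(nid);
  }
}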

-sigma.renderers.glyphs – Allows you to add custom glyphs to each node. Useful if you have many types of node.

Outro

This application has been a very fun little project to build. The improvements to Sigma wrought by Linkurious have resulted in an incredibly powerful toolkit for rapidly generating graph-based applications with a great degree of flexibility and interaction potential.

None of this would have been possible were it not for Python. Python is my right (left, I’m left-handed) hand, which I use for almost everything. Its versatility and expressiveness make it an incredibly robust Swiss army knife in any data analyst’s toolkit.


Bio: Greg Roberts is a Data Analyst at Packt Publishing, and has a Master’s degree in Theoretical Physics and Applied Maths. Since joining Packt he has developed a passion for working with data, and enjoys learning about new or novel methods of predictive analysis and machine learning. To this end, he spends most of his time fiddling with models in Python, and creating ever more complex Neo4j instances for storing and analysing any and all data he can find. When not writing Python, he is a big fan of reddit, cycling and making interesting noises with a guitar.

You can find Greg on Twitter @GregData

 
