Notice that Facebook uses two types of URLs:
- the old one that contains profile.php in it;
- the new one that only contains the nickname.
This means that we have to clean these two types of URLs separately.
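To make the distinction concrete, here is a minimal Python sketch that tells the two URL types apart and extracts the identifying part of each. It assumes that old-style URLs carry the account number in an `id` query parameter and that the nickname is the first path segment of new-style URLs. The function name `facebook_identifier` is made up for this illustration.

```python
from urllib.parse import urlparse, parse_qs


def facebook_identifier(url):
    """Return ("id", value) for old-style URLs, ("nickname", value) for new-style ones.

    Returns (None, None) when nothing usable is found.
    """
    parsed = urlparse(url.strip())
    path = parsed.path.strip("/")

    # Old style: .../profile.php?id=12345 (the "id" parameter is an assumption)
    if path == "profile.php":
        ids = parse_qs(parsed.query).get("id", [])
        return ("id", ids[0]) if ids else (None, None)

    # New style: .../nickname (tracking parameters in the query are ignored)
    nickname = path.split("/")[0]
    return ("nickname", nickname) if nickname else (None, None)
```

Each type can then be cleaned with its own rule, which is what the separate cleaning steps do.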
The “Source” and “Target” columns won’t change, but we are going to build a third column, “C”, based on columns A and B. This column will contain a link to a special Facebook URL that displays the friends you have in common with another friend.
Simply use a concatenate function to achieve that: Or: The result should look like this:
Now we need to transform the URL column into an HTML link that OutWit Hub can parse easily:
This will allow OutWit Hub to extract your common friends by visiting the HTML link.
Once it is done, export this project to HTML.
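For readers who would rather script this step, the concatenation and the HTML export can be sketched in Python. This is only an illustration under stated assumptions: the `url_template` argument is a placeholder that you must replace with the real common-friends URL pattern, and the function name `build_links_page` and the table layout are hypothetical.

```python
from html import escape
from urllib.parse import quote


def build_links_page(rows, url_template):
    """Build an HTML table with Source, Target and a clickable URL column.

    rows: iterable of (source, target) pairs.
    url_template: a string with {source} and {target} placeholders
                  (hypothetical; supply the real URL pattern yourself).
    """
    lines = ["<html><body><table>"]
    for source, target in rows:
        # Concatenate the two identifiers into the URL, percent-encoding them
        url = url_template.format(
            source=quote(source, safe=""),
            target=quote(target, safe=""),
        )
        # Wrap the URL in an <a> tag so a scraper sees it as a link
        lines.append(
            '<tr><td>{}</td><td>{}</td><td><a href="{}">{}</a></td></tr>'.format(
                escape(source), escape(target), escape(url, quote=True), escape(url)
            )
        )
    lines.append("</table></body></html>")
    return "\n".join(lines)
```

Writing the returned string to a `.html` file gives a page in which each row carries one link for the scraper to visit.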
3rd step : scrape the data!
Open the HTML file in OutWit Hub. Notice that OutWit Hub treats the URL column as a link.
Create a macro in OutWit Hub that will parse every link on this page. These are basically the same operations as in step 1, but automated.
Find your friends’ friends.
Depending on the number of friends you have and the machine you’re working on, this process may last several hours.
Once it is done, you will get a CSV file with your dataset. At this point, you may append your initial list of friends (step 1) to this file and remove any duplicate entries using OpenRefine.
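If you prefer a script to OpenRefine for this merge, the append-and-deduplicate step can be sketched in Python as follows. It assumes that both CSV files have “Source” and “Target” header columns and that a friendship is undirected, so A-B and B-A count as the same entry. The function names are made up for this example.

```python
import csv


def read_edges(path):
    """Read (source, target) pairs from a CSV file with Source/Target headers."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [(row["Source"], row["Target"]) for row in csv.DictReader(handle)]


def dedupe_edges(edges):
    """Drop empty rows, self-links and duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for source, target in edges:
        source, target = source.strip(), target.strip()
        if not source or not target or source == target:
            continue
        # Assumption: friendship is symmetric, so the pair is stored unordered
        key = frozenset((source, target))
        if key not in seen:
            seen.add(key)
            unique.append((source, target))
    return unique
```

Appending is then a matter of concatenating the two lists, for instance `dedupe_edges(read_edges("friends.csv") + read_edges("scraped.csv"))`, where both file names are placeholders.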
As you can see, building a good dataset from Facebook is not trivial, but it can be achieved by combining scraping and data-cleansing techniques.
Storing your Facebook graph in Neo4j
It is hard to understand the connections in your Facebook network with a tool like Excel. We are going to use the Neo4j database to store the data.
Here is how to import your CSV-formatted data into Neo4j:
The import script, facebook.cql, is hosted on GitHub.
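For readers who want to script the import themselves, here is a rough Python sketch. It is not the facebook.cql script, only an illustration of the general idea. The `Person` label, the `name` property and the `FRIEND_OF` relationship type are assumptions, and `session` stands for any object with a `run(query, **parameters)` method, such as a session from the official Neo4j Python driver.

```python
# Cypher statement: one Person node per identifier, one relationship per edge.
# MERGE avoids creating the same node or relationship twice.
IMPORT_QUERY = """
UNWIND $rows AS row
MERGE (a:Person {name: row.source})
MERGE (b:Person {name: row.target})
MERGE (a)-[:FRIEND_OF]->(b)
"""


def import_edges(session, edges):
    """Send all (source, target) pairs to Neo4j in a single parameterized query."""
    rows = [{"source": source, "target": target} for source, target in edges]
    session.run(IMPORT_QUERY, rows=rows)
    return len(rows)
```

Passing the rows as a query parameter, rather than pasting names into the query text, avoids quoting problems with names that contain apostrophes.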
You can download the Neo4j dataset used in this article here. Now we can search and visualize our network.
Visualizing your Facebook graph
Neo4j offers an out-of-the-box visualization tool. It lets you visualize the results of queries written in Cypher, Neo4j’s graph query language:
If you want something easier and more powerful, you can use Linkurious to explore your Facebook network (try Linkurious now).
Simply type the name of any of your contacts to display that person in the graph.
Visualizing a Facebook social network.
We can also zoom in on particular details.
Visualizing a community within a larger social network.
You can select nodes or hide them based on their properties, and you can search for paths between two people, all through an easy-to-use interface.
Graph visualization allows you to understand your social network. You can see who knows who. Who has a lot of connections. Who is isolated. What are the communities within your network. All of this can be discovered through visual exploration.
You can follow Hervé Piedcoq to stay up to date on data investigation techniques and tools. You can try Linkurious now and learn how to use graph visualization to understand your data.
Original, reposted with permission.