KDnuggets News 15:n31, September 2015, Opinions, Interviews, Reports

Dissecting the Big Data Twitter Community through a Big data Lens





By Srinath Perera (@srinath_perera).

The BigData hashtag is hyperactive, with close to 2,000 tweets each day involving more than 20,000 tweeps. This post digs into the tweet archive from August 3-25 to understand the dynamics of the Big Data community.

Twitter communities have activities: tweets, retweets, replies, and follows. Among them, retweets suggest strong agreement by the actor with the tweet's content. Hence, the retweet graph is a good representation of the actual connections in the network, their strengths, and the propagation of information through the network.

The Network

This post, therefore, will focus on the retweet graph. The following graph shows a visualization of the retweet graph, where each vertex represents an account, each edge represents retweets, and the size of a node represents the number of retweets that account has received. Each edge is weighted by the number of retweets between the two accounts, and the graph shows an edge from account A to B only if B has retweeted two or more tweets by A.
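
The edge rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the post: the function name `build_retweet_graph`, the `(original_author, retweeter)` input format, and the sample data are all assumptions made for the example.

```python
from collections import Counter

def build_retweet_graph(retweets, min_weight=2):
    """Build a weighted retweet graph (hypothetical helper).

    retweets: iterable of (original_author, retweeter) pairs, one per retweet.
    Returns (edges, received):
      edges    maps (author, retweeter) -> number of retweets, keeping only
               pairs with at least min_weight retweets
      received maps account -> total retweets received (the node size)
    """
    weights = Counter(retweets)

    # Node size: every retweet counts, even those below the edge threshold.
    received = Counter()
    for (author, _retweeter), count in weights.items():
        received[author] += count

    # Edge from A to B only if B retweeted A at least min_weight times.
    edges = {pair: count for pair, count in weights.items() if count >= min_weight}
    return edges, received

# Made-up sample data, for illustration only.
sample = [("A", "B"), ("A", "B"), ("A", "C"),
          ("B", "C"), ("B", "C"), ("B", "C")]
edges, received = build_retweet_graph(sample)
```

In this sketch the single retweet of A by C still contributes to A's node size, but it does not produce an edge because it falls below the threshold of two.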

[Figure: retweet graph (network2)]

The first thing you will notice is that the top three tweeps have received a large proportion of retweets. The following heat map shows retweets received by top tweeps.

[Figure: heat map of retweets received by top tweeps (retweetsHeatmap)]

  1. KirkDBorne 2588
  2. jose_garde 1730
  3. craigbrownphd 1546
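
A ranking like the one above, along with the share of all retweets that the top accounts capture, could be computed as follows. This is only a sketch: the helper name `top_share` and the counts in the example are hypothetical, since the post does not give the total number of retweets in the archive.

```python
from collections import Counter

def top_share(received, k=3):
    """Return the k most-retweeted accounts and their combined share.

    received: Counter mapping account -> retweets received.
    """
    total = sum(received.values())
    top = received.most_common(k)
    share = sum(count for _, count in top) / total if total else 0.0
    return top, share

# Hypothetical counts, not taken from the BigData archive.
received = Counter({"a": 50, "b": 30, "c": 10, "d": 5, "e": 5})
top, share = top_share(received)
```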

In the network, we can see that each of the three has their own following. However, the graph also has a phantom node in the right middle with lots of edges around it. That turns out to be a Twitter bot (BigDataTweetBot), which has retweeted lots of other people's tweets.
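
The post does not say how the bot was identified beyond its position in the graph. One simple heuristic, offered here purely as an assumption, is to rank accounts by how many distinct authors they retweet: an account that retweets many different people would stand out in that ranking. The helper `retweet_fanout` and the sample data are hypothetical.

```python
from collections import defaultdict

def retweet_fanout(retweets):
    """Count the distinct authors each account has retweeted.

    retweets: iterable of (original_author, retweeter) pairs.
    Returns a dict mapping retweeter -> number of distinct authors retweeted.
    """
    authors_by_retweeter = defaultdict(set)
    for author, retweeter in retweets:
        authors_by_retweeter[retweeter].add(author)
    return {r: len(authors) for r, authors in authors_by_retweeter.items()}

# Made-up data: "bot" retweets three different accounts, "u2" only one.
sample = [("u1", "bot"), ("u2", "bot"), ("u3", "bot"), ("u1", "u2")]
fanout = retweet_fanout(sample)

# Accounts with the widest fan-out come first; these are candidates to inspect,
# not confirmed bots.
suspects = sorted(fanout, key=fanout.get, reverse=True)
```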

[Figure: sparser retweet graph (network10)]

The following figure shows a sparser version of the same graph that only shows an edge if two accounts have more than 10 retweets between them. On this network, the KirkDBorne community seems to be pretty well-connected, while the others are pretty isolated. This suggests that his community is stronger.
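
One way to put a number on "well-connected" versus "isolated" is the local clustering coefficient: the fraction of a hub's neighbors that are also connected to each other. The sketch below is an assumption about how that could be measured, not the method used in the post. It treats the sparse graph as undirected and counts retweets in both directions between a pair; the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def sparse_neighbors(retweets, threshold=10):
    """Undirected adjacency sets, keeping pairs with more than `threshold`
    retweets between them (both directions combined).

    retweets: iterable of (original_author, retweeter) pairs.
    """
    # frozenset makes (A, B) and (B, A) the same key; self-retweets are skipped.
    pair_counts = Counter(frozenset(p) for p in retweets if p[0] != p[1])
    neighbors = defaultdict(set)
    for pair, count in pair_counts.items():
        if count > threshold:
            a, b = tuple(pair)
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def local_clustering(neighbors, node):
    """Fraction of pairs of `node`'s neighbors that are themselves linked."""
    nbrs = neighbors.get(node, set())
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return linked / possible

# Made-up data with a low threshold so the example stays small.
sample = [("hub", "x"), ("x", "hub"), ("hub", "y"), ("hub", "y"),
          ("x", "y"), ("y", "x"),
          ("solo", "p"), ("solo", "p"), ("solo", "q"), ("solo", "q")]
graph = sparse_neighbors(sample, threshold=1)
hub_score = local_clustering(graph, "hub")    # neighbors x and y are linked
solo_score = local_clustering(graph, "solo")  # neighbors p and q are not
```

A score near 1 means the accounts around a hub also retweet each other, while a score near 0 means they only connect through the hub.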

