Must-Know: How to determine the influence of a Twitter user?

Editor's note: This post was originally included as an answer to a question posed in our 17 More Must-Know Data Science Interview Questions and Answers series earlier this year. The answer was thorough enough that it was deemed to deserve its own dedicated post.

Social networks are at the center of today's web, and determining influence in a social network is a huge area of research. Twitter influence is a narrow area within that broader body of research.

The influence of a Twitter user goes beyond the simple number of followers. We also want to examine how effective the user's tweets are - how likely they are to be retweeted or favorited, or to have their links clicked. What exactly counts as an influential user depends on the definition - the types of influential users that have been discussed include celebrities, opinion leaders, influencers, discussers, innovators, topical experts, curators, commentators, and more.

A key challenge is to compute influence efficiently. An additional problem on Twitter is separating humans from bots.

Common measures used to quantify influence on Twitter include many versions of network centrality (how important a node is within the network) and PageRank-based metrics.

Figure: KDnuggets Twitter social network, as visualized in NodeXL in May 2014.

Traditional network measures used include:

  • Closeness centrality, based on the length of the shortest paths from a node to every other node. It measures the visibility or accessibility of each node with respect to the entire network.
  • Betweenness centrality, which counts for each node i the shortest paths between other pairs of nodes that pass through i. It measures the ability of each node to facilitate communication within the network.
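To make the two centrality measures concrete, here is a minimal sketch in plain Python. It assumes an unweighted, undirected graph stored as an adjacency dictionary. The toy graph and all function names are hypothetical, chosen for illustration only, and betweenness is computed with Brandes' algorithm, which is one standard way to do it rather than the method of any particular tool mentioned in this post.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness_centrality(graph):
    """Closeness = (number of reachable nodes) / (sum of distances to them)."""
    scores = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (undirected graph)."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        order = []                              # nodes in BFS visiting order
        preds = {node: [] for node in graph}    # predecessors on shortest paths
        sigma = {node: 0 for node in graph}     # number of shortest paths
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical toy network: "hub" links three accounts, "d" hangs off "c".
toy_graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}
closeness = closeness_centrality(toy_graph)
betweenness = betweenness_centrality(toy_graph)
```

In this toy network "hub" scores highest on both measures, while "c" has a nonzero betweenness only because it is the sole route to "d". On real follower graphs, which are directed and very large, a graph library or an approximation would normally replace this sketch.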

Other proposed measures include retweet impact (how likely a tweet is to be retweeted) and variations of PageRank, such as TunkRank - see A Twitter Analog to PageRank.
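As a rough illustration of a PageRank-style measure, the sketch below iterates the TunkRank recurrence as it is commonly stated: the influence of X is the sum, over each follower Y of X, of (1 + p * Influence(Y)) / |Following(Y)|. Here p is an assumed constant probability that a follower passes a tweet along. The value p = 0.05, the fixed iteration count, and the function name are assumptions made for this example, so treat it as a sketch rather than a reference implementation.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Iterate a TunkRank-style recurrence.

    following[u] is the set of accounts that u follows.
    Returns a dict mapping each account to its estimated influence.
    """
    # Collect every account, including ones that follow nobody.
    users = set(following)
    for followed in following.values():
        users.update(followed)

    # Invert the relation: followers[x] = accounts that follow x.
    followers = {u: set() for u in users}
    for u, followed in following.items():
        for x in followed:
            followers[x].add(u)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower y splits its attention across everyone it follows.
        influence = {
            x: sum(
                (1 + p * influence[y]) / len(following[y])
                for y in followers[x]
            )
            for x in users
        }
    return influence


# Hypothetical chain: "a" follows "b", and "b" follows "c".
chain_scores = tunkrank({"a": {"b"}, "b": {"c"}})
```

In the chain example, "c" ends up slightly more influential than "b" because its single follower is itself followed, which is the effect a plain follower count misses.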

An important refinement of overall influence is influence within a topic, which is what Agilience and Right Relevance measure. For instance, Justin Bieber may have high influence overall, but he is less influential than KDnuggets in the area of Data Science.

Twitter provides a REST API which allows access to key measures, but with limits on the number of requests and the data returned.
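Because of the request limits, any script that collects these measures has to pace its calls. The sketch below is a generic sliding-window limiter. It does not call the Twitter API, and the class name, the method name, and the limits you pass in are all hypothetical. The clock and sleep functions are injectable so the pacing logic can be checked without actually waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window of window_seconds."""

    def __init__(self, max_requests, window_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests inside the window

    def acquire(self):
        """Block until another request may be sent, then record it."""
        while True:
            now = self.clock()
            # Forget requests that have fallen out of the window.
            while self.sent and now - self.sent[0] >= self.window:
                self.sent.popleft()
            if len(self.sent) < self.max_requests:
                self.sent.append(now)
                return
            # Window is full: wait until the oldest request expires.
            self.sleep(self.window - (now - self.sent[0]))
```

A collector would call `acquire()` immediately before each request, using whatever limit and window the API documentation specifies for the endpoint in question.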

There were a number of websites that measured Twitter user influence, but many of their business models did not pan out: many of them were acquired or went out of business. Ones which are currently active include the following:

Free:

  • Agilience (KDnuggets is #1 in Machine Learning, #1 in Data Mining, #2 in Data Science)
  • Klout, klout.com (KDnuggets Klout score is 79)
  • Influence Tracker, www.influencetracker.com (KDnuggets influence metric is 39.2)
  • Right Relevance - measures the relevance of Twitter users within a specific topic.

Paid:

  • Brandwatch (bought PeerIndex)
  • HubSpot
  • Simply Measured
