KDnuggets Social Network in NodeXL, May 2014


We examine the KDnuggets Twitter social network, as generated by NodeXL, looking at clusters, top Twitter accounts, URLs, hashtags, and words, and ask what it all means.



By Gregory Piatetsky, @kdnuggets, May 29, 2014.

NodeXL is a free, open-source template for Microsoft Excel 2007, 2010 and 2013 that makes it easy to explore network graphs. NodeXL is a product of Connected Action, which is headed by Marc Smith, @Marc_smith, Director of the Social Media Research Foundation.

Here is KDnuggets Twitter Social Network, as visualized in NodeXL on May 25, 2014.

The graph represents a network of 1,396 Twitter users whose tweets between Apr 24 and May 25, 2014 contained "kdnuggets", or who were replied to or mentioned in those tweets.

Some of the findings
  • The Twitter users most active in the KDnuggets network: @yvesmulkers, @kirkdborne, @hey_anmol
  • Top shared content (in cluster G1) consists of infographics, cartoons, or other visual content
  • There is a large connected component with about 40% of all users that has top hash keywords; max diameter in that component is 3 (small world).
  • Another cluster (G3) is more associated with #ff and #fs
  • Cluster G2 consists of single-vertex components. Marc Smith described it as a "brand" cluster: people are talking about @KDnuggets.
  • Better interpretation tools are needed to make sense of such graphs!

 
KDnuggets Twitter Network, for 30 days ending May 25, 2014

An experimental interactive version of this graph is here.

There is an edge for each "replies-to" relationship in a tweet, an edge for each "mentions" relationship in a tweet, and a self-loop edge for each tweet that is not a "replies-to" or "mentions".
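The edge rules above can be sketched in a few lines of Python. This is only an illustration, not NodeXL's own importer: the function name `tweet_edges`, its input fields (`author`, `text`, `in_reply_to`) and the choice not to count the replied-to user again as a mention are all assumptions made for the example.

```python
import re

MENTION = re.compile(r"@(\w+)")

def tweet_edges(author, text, in_reply_to=None):
    """Return (source, target, relationship) edges for one tweet."""
    author = author.lower()
    reply_target = in_reply_to.lower() if in_reply_to else None
    edges = []
    if reply_target:
        edges.append((author, reply_target, "replies-to"))
    for name in MENTION.findall(text):
        name = name.lower()
        # Assumption: the replied-to user is not counted again as a mention.
        if name == reply_target:
            continue
        edges.append((author, name, "mentions"))
    if not edges:
        # Neither a reply nor a mention: record the tweet as a self-loop.
        edges.append((author, author, "tweet"))
    return edges
```

For example, a plain tweet with no @-names yields a single self-loop on its author, which is how a user with no connections can still carry edges.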

The graph is directed. The graph's vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm. The graph was laid out using the Harel-Koren Fast Multiscale layout algorithm.
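To give a feel for what the Clauset-Newman-Moore algorithm does, here is a naive sketch of the greedy idea behind it: start with every vertex in its own community and repeatedly merge the pair of communities that raises modularity the most, stopping when no merge helps. It is not NodeXL's implementation and lacks the data structures that make the real algorithm fast. It also assumes edges are treated as undirected, and the function names are made up for the example.

```python
from itertools import combinations

def modularity(edges, communities):
    """Modularity Q of a partition of an undirected graph given as an edge list."""
    m = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}
    intra = [0] * len(communities)   # edges inside each community
    degree = [0] * len(communities)  # total degree of each community
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(i / m - (d / (2 * m)) ** 2 for i, d in zip(intra, degree))

def greedy_communities(edges):
    """Greedily merge communities while modularity keeps increasing."""
    communities = [frozenset([v]) for v in sorted({v for e in edges for v in e})]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda t: t[0])
        if q <= best_q:
            break  # no merge improves modularity any further
        best_q, communities = q, merged
    return communities, best_q

# Toy graph: two triangles joined by a single bridge edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
groups, q = greedy_communities(toy_edges)
```

On this toy graph the sketch separates the two triangles, which mirrors how the clusters G1, G2, G3 and so on are groups of users who mostly tweet at each other.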

Overall Graph Metrics:
  • Vertices: 1396
  • Unique Edges: 1608
  • Total Edges: 3487
  • Reciprocated Vertex Pair Ratio: 0.06
  • Reciprocated Edge Ratio: 0.12
  • Connected Components: 336
  • Maximum Vertices in a Connected Component: 957
  • Maximum Edges in a Connected Component: 2697
  • Maximum Geodesic Distance (Diameter): 8
  • Average Geodesic Distance: 2.5
  • Graph Density: 0.0008
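For readers unfamiliar with these metrics, the sketch below computes several of them on a small directed graph. The definitions are assumptions about how such metrics are commonly defined, not a copy of NodeXL's code: duplicate edges are merged, self-loops are ignored, components and distances treat edges as undirected, and the average distance is taken over connected pairs of distinct vertices. Under these definitions the reciprocated edge ratio equals 2r / (1 + r), where r is the reciprocated vertex pair ratio, which agrees with the reported 0.06 and 0.12 to within rounding.

```python
from collections import deque

def graph_metrics(vertices, directed_edges):
    """Compute a few whole-graph metrics for a small directed graph."""
    # Assumption: duplicate edges are merged and self-loops are ignored.
    edges = {(u, v) for u, v in directed_edges if u != v}
    n = len(vertices)
    density = len(edges) / (n * (n - 1)) if n > 1 else 0.0

    pairs = {frozenset(e) for e in edges}  # vertex pairs joined by any edge
    reciprocated_edges = sum(1 for u, v in edges if (v, u) in edges)
    pair_ratio = (reciprocated_edges // 2) / len(pairs) if pairs else 0.0
    edge_ratio = reciprocated_edges / len(edges) if edges else 0.0

    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    assigned, components, distances = set(), 0, []
    for start in vertices:
        # Breadth-first search gives geodesic distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if start not in assigned:
            components += 1
            assigned.update(dist)
        distances.extend(d for d in dist.values() if d > 0)

    return {
        "density": density,
        "reciprocated_vertex_pair_ratio": pair_ratio,
        "reciprocated_edge_ratio": edge_ratio,
        "connected_components": components,
        "diameter": max(distances, default=0),
        "average_geodesic_distance":
            sum(distances) / len(distances) if distances else 0.0,
    }
```

A graph made only of isolated vertices with self-loops returns a diameter and average distance of 0 under these assumptions.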

 
Top 10 Vertices and their Betweenness Centrality, ranked by Betweenness Centrality
 
Top URLs in Tweet in Entire Graph:
 
Top Hashtags in Tweet in Entire Graph:
  • #bigdata
  • #analytics
  • #data
  • #datamining
  • #bigdataco
  • #datascience
  • #rstats
  • #ff
  • #hadoop
  • #datascientist
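A ranking like this can be produced by counting hashtag occurrences across all tweets. The sketch below is one simple way to do it; the function name and the sample tweets are invented for illustration, and lowercasing tags before counting is an assumption.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(tweets, k=10):
    """Return the k most frequent hashtags as (tag, count) pairs."""
    counts = Counter()
    for text in tweets:
        # Assumption: tags are compared case-insensitively.
        counts.update("#" + tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(k)

# Hypothetical sample tweets, not taken from the actual data set.
sample = [
    "New interview on #BigData and #analytics",
    "9 free books on #datamining #bigdata",
    "#ff @kdnuggets for #bigdata news",
]
ranking = top_hashtags(sample, k=3)
```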

 
Top Word Pairs and Count
  • big data, 509
  • data mining, 202
  • via kdnuggets, 197
  • data scientist, 145
  • summit 2014, 106
  • innovation summit, 101
  • day 1, 90
  • data analysis, 88
  • talks day, 85
  • 9 free, 85
  • free books, 85
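Word pairs such as "big data" or "via kdnuggets" are counts of adjacent words in the tweet text. A minimal sketch of that counting follows; the tokenizer (lowercase runs of letters, digits and underscores, with no stop-word filtering) and the sample tweets are assumptions made for the example and may differ from what NodeXL does.

```python
import re
from collections import Counter

def word_pair_counts(tweets):
    """Count adjacent word pairs across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Assumption: a word is a run of letters, digits or underscores,
        # so "@kdnuggets" and "#bigdata" lose their leading symbol.
        words = re.findall(r"[a-z0-9_]+", text.lower())
        counts.update(zip(words, words[1:]))
    return counts

# Hypothetical sample tweets, not taken from the actual data set.
sample = [
    "Big data and data mining via @kdnuggets",
    "9 free books on big data via @KDnuggets",
]
pairs = word_pair_counts(sample)
```

Pairs like "9 free" and "free books" in the list above share the same count, which is what one would expect if they come from retweets of a single headline.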

 
Cluster G1 (largest)
Summary: This is the main connected component, of tweets originating from @kdnuggets.

Details:
 
Top URLs in Tweet in G1
The top URLs here are infographics and cartoons.
 
Cluster G2 (single-vertex unconnected components)
Summary: The top URLs here are KDnuggets posts (reports on meetings and interviews) that were perhaps shared via the KDnuggets tweet button rather than via social media.

Details:
  • Vertices: 267
  • Unique Edges: 204
  • Connected components: 267
  • Single-vertex components: 267
  • Maximum Geodesic Distance (Diameter): 0
  • Average Geodesic Distance: 0
  • Top hashtags: #bigdata #data #analytics #datascience #bigdataawards #datamining #socialmedia #businessintelligence #learning #tech
  • Top Words in Tweet: data big analytics feedback 2014 interview highlights day mining innovation
  • Top Mentioned in Tweet: none

 
Top URLs:
 
Cluster G3 (small connected component)
Summary: Note that this cluster includes #ff and #fs tags - KDnuggets was frequently mentioned as part of #ff (Follow Friday) and #fs (Follow Saturday) tweets. The central node in this graph is @kirkdborne.

Details:
  • Vertices: 131
  • Unique Edges: 247
  • Connected components: 1
  • Single-vertex components: 0
  • Maximum Geodesic Distance (Diameter): 6
  • Average Geodesic Distance: 2.57
  • Top hashtags: #bigdata #analytics #datamining #datascience #ff #datascientist #hadoop #datawest14 #fs #rstats
  • Top Words in Tweet: data big analytics feedback 2014 interview highlights day mining innovation
  • Top Mentioned in Tweet: @kdnuggets @kirkdborne @marcusborba @merv @bigdatagal @sve_sic @data_nerd @yvesmulkers @mphnyc @salfordsystems

 
Top URLs in Tweet in G3:
 
Cluster G4 (small connected component)
Summary: The central nodes in this cluster are @hey_anmol and @yvesmulkers.

Details:
  • Vertices: 53
  • Unique Edges: 56
  • Connected components: 1
  • Single-vertex components: 0
  • Maximum Geodesic Distance (Diameter): 5
  • Average Geodesic Distance: 2.8
  • Top hashtags: #bigdata #analytics #interview #yarn #anaytics #publicpolicy #crowdsourcing #government #masstlc #hadoop
  • Top Words in Tweet: data kdnuggets big yvesmulkers via analytics interview 2014 top hey_anmol
  • Top Mentioned in Tweet: @kdnuggets @yvesmulkers @hey_anmol @talksumdata @drussell41 @kirkdborne @redpointglobal @georgecorugedo @ramirogoncalez @dutchlight360

 
Top URLs in Tweet in G4:
 