What Data You Analyzed – KDnuggets Poll Results and Trends

Image/video data analysis is surging, JSON is replacing XML, anonymized data usage is growing in the US and Europe (but not in Asia), and itemset and Twitter analysis are declining: these are some of the highlights of the KDnuggets Poll on data types used.



Over 600 readers voted in the latest KDnuggets Poll, which asked:
What data types you analyzed in the past 12 months?

Here are the highlights:
  • Table, Text, and Time Series remained the most popular types of data used
  • the usage of image/video is surging (186% up)
  • anonymized data use is growing in US, Canada, and Europe (but not in Asia)
  • JSON usage is up, replacing XML (whose usage is down)
  • Itemsets/transaction analysis is down 24% (association rule algorithms are being replaced by more complex analysis)
  • web log data analysis is declining - perhaps because large sites are relying more on Google Analytics.
Fig. 1: Data Types Analyzed, 2017


The most popular data types analyzed in 2017 were:
  1. Table data (fixed number of columns), 69.8%, first place, as in past polls
  2. Text, 46.4%, moved up to 2nd place compared to 2014
  3. Time series, 45.6%, dropped to 3rd place
  4. JSON, 25.5%, up from 7th place
  5. Anonymized data, 22.8%, up from 10th place
  6. Location/geo, 22.6%


Compared with the similar 2014 KDnuggets Poll, Data Types/Sources Analyzed, the largest increases in share of responses were for
  • Images / video, from 4.9% to 14.1%, 186% up
  • Anonymized data, from 14.0% to 22.8%, 63% up
  • Other, from 7.2% to 11.2%, 56% up
  • JSON, from 17.0% to 25.5%, 50% up
  • Location/geo, from 19.7% to 22.6%, 14.9% up
The biggest decreases compared to the 2014 poll were for
  • Itemsets / transactions, from 26.5% to 20.1%, 24% down
  • Web clickstream/web log, from 12.5% to 10.0%, 20% down
  • Twitter, from 17.8% to 14.7%, 17% down
  • XML, from 14% to 12%, 14% down
  • Table data (fixed number of columns), from 76.9% to 69.8%, 9.3% down
Fig. 2: Data Types Analyzed, 2017 vs 2014
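
The "up" and "down" figures above are relative changes in share between the two polls, not differences in percentage points. As a minimal sketch of that arithmetic, the snippet below recomputes them from the 2014 and 2017 shares quoted in this article. The names `shares` and `relative_change` are illustrative, not part of the poll. Because the quoted shares are rounded to one decimal, the recomputed values can differ from the published ones by a point or two.

```python
# Shares of respondents (in percent) as quoted in the article: (2014, 2017).
shares = {
    "Images / video": (4.9, 14.1),
    "Anonymized data": (14.0, 22.8),
    "Other": (7.2, 11.2),
    "JSON": (17.0, 25.5),
    "Location/geo": (19.7, 22.6),
    "Itemsets / transactions": (26.5, 20.1),
    "Web clickstream/web log": (12.5, 10.0),
    "Twitter": (17.8, 14.7),
    "XML": (14.0, 12.0),
    "Table data": (76.9, 69.8),
}


def relative_change(old, new):
    """Percent change of `new` relative to `old` (hypothetical helper)."""
    return (new - old) / old * 100.0


changes = {name: relative_change(old, new) for name, (old, new) in shares.items()}

# Print from the largest increase to the largest decrease.
for name, pct in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} {pct:+6.1f}%")
```

For example, JSON going from 17.0% to 25.5% is a gain of 8.5 percentage points, which is a 50% relative increase.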


Regional distribution of the 632 voters:
  • US/Canada, 36.6%
  • Europe, 34.2%
  • Asia, 17.4%
  • Africa/Middle East, 4.4%
  • Latin America, 4.3%
  • Australia/NZ, 3.2%
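
To give a feel for the sample sizes behind these percentages, here is a rough sketch that converts the regional shares into approximate voter counts. The variable names are illustrative, and the counts are estimates reconstructed from rounded percentages, not figures reported by the poll.

```python
TOTAL_VOTERS = 632  # total number of voters reported in the article

# Regional shares (percent of voters) as quoted in the article.
region_share = {
    "US/Canada": 36.6,
    "Europe": 34.2,
    "Asia": 17.4,
    "Africa/Middle East": 4.4,
    "Latin America": 4.3,
    "Australia/NZ": 3.2,
}

# Approximate head count per region (estimates, since the shares are rounded).
region_count = {
    region: round(TOTAL_VOTERS * share / 100)
    for region, share in region_share.items()
}

for region, count in region_count.items():
    print(f"{region:20s} ~{count} voters")
```

Under these assumptions the three largest regions have roughly 231, 216, and 110 voters, while each of the smaller regions has fewer than 30, which is one reason to compare only the largest regions.
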
Next, we compared the shares of the top 3 most popular data types, along with anonymized data and image/video, across the 3 largest regions.

Fig. 3: Popular Data Types Analyzed by region, 2017


Some observations:
  • US data analysts use tabular data the most;
  • text usage is similar across regions;
  • Asian data analysts use much less anonymized data, but slightly lead in image/video data analysis.

