What Data You Analyzed – KDnuggets Poll Results and Trends
Image/video data analysis is surging, JSON is replacing XML, anonymized data usage is growing in the US and Europe (but not in Asia), and itemset and Twitter analysis are declining. These are some of the highlights of the KDnuggets Poll on data types used.
Over 600 readers voted in the latest KDnuggets Poll, which asked:
What data types you analyzed in the past 12 months?
Here are the highlights:
- Table, Text, and Time Series remained the most popular types of data used
- the usage of image/video is surging (186% up)
- anonymized data use is growing in US, Canada, and Europe (but not in Asia)
- JSON usage is up, replacing XML (whose usage is down)
- Itemsets/transaction analysis is down 24% (association rule algorithms are being replaced by more complex analysis)
- web log data analysis is declining - perhaps because large sites are relying more on Google Analytics.
Fig. 1: Data Types Analyzed, 2017
The most popular data types analyzed in 2017 were
- Table data (fixed n. columns), 69.8%, first place, as in the past polls
- Text, 46.4% - moved to 2nd place compared to 2014
- Time series, 45.6%, dropped to 3rd place
- JSON, 25.5%, up from 7th place
- Anonymized data, 22.8%, up from 10th place
- Location/geo, 22.6%
Compared with the similar 2014 KDnuggets Poll (Data Types/Sources Analyzed), the largest increases and decreases in share of responses were for
- Images / video, from 4.9% to 14.1%, 186% up
- Anonymized data, from 14.0% to 22.8%, 63% up
- Other, from 7.2% to 11.2%, 56% up
- JSON, from 17.0% to 25.5%, 50% up
- Location/geo, from 19.7% to 22.6%, 14.9% up
- Itemsets / transactions, from 26.5% to 20.1%, 24% down
- Web clickstream/web log, from 12.5% to 10.0%, 20% down
- Twitter, from 17.8% to 14.7%, 17% down
- XML, from 14% to 12%, 14% down
- Table data (fixed n. columns), from 76.9% to 69.8%, 9.3% down
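The "% up" and "% down" figures above are relative changes in share, not percentage-point differences. As a minimal sketch, the calculation can be reproduced from the rounded shares quoted in the list; the names `shares` and `relative_change` are illustrative and not part of the poll. Because the inputs are rounded to one decimal place, some results differ slightly from the quoted figures (for example, images/video comes out at about 188% rather than 186%), presumably because the original figures were computed from unrounded values.

```python
# Shares of respondents in percent, as quoted in the list above: (2014, 2017).
shares = {
    "Images / video": (4.9, 14.1),
    "Anonymized data": (14.0, 22.8),
    "Other": (7.2, 11.2),
    "JSON": (17.0, 25.5),
    "Location/geo": (19.7, 22.6),
    "Itemsets / transactions": (26.5, 20.1),
    "Web clickstream/web log": (12.5, 10.0),
    "Twitter": (17.8, 14.7),
    "XML": (14.0, 12.0),
    "Table data (fixed n. columns)": (76.9, 69.8),
}


def relative_change(old, new):
    """Relative change from old to new, in percent of the old value."""
    return (new - old) / old * 100.0


changes = {name: relative_change(old, new) for name, (old, new) in shares.items()}

# Print from largest increase to largest decrease.
for name, pct in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    direction = "up" if pct > 0 else "down"
    print(f"{name}: {abs(pct):.1f}% {direction}")
```

Note the difference between the two measures: JSON rose by 8.5 percentage points (17.0% to 25.5%), which is a 50% relative increase.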
Fig. 2: Data Types Analyzed, 2017 vs 2014
Regional distribution of 632 voters
- US/Canada, 36.6%
- Europe, 34.2%
- Asia, 17.4%
- Africa/Middle East, 4.4%
- Latin America, 4.3%
- Australia/NZ, 3.2%
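To give a sense of how many voters stand behind each regional share, the percentages above can be converted into approximate head counts. This is only a rough sketch: the counts are back-calculated from rounded percentages and the stated total of 632 voters, so they are estimates rather than reported figures, and the variable names are illustrative.

```python
TOTAL_VOTERS = 632  # total stated in the poll results

# Regional shares in percent, as listed above.
regional_share = {
    "US/Canada": 36.6,
    "Europe": 34.2,
    "Asia": 17.4,
    "Africa/Middle East": 4.4,
    "Latin America": 4.3,
    "Australia/NZ": 3.2,
}

# Estimated number of voters per region (rounded to the nearest whole voter).
approx_voters = {
    region: round(TOTAL_VOTERS * pct / 100)
    for region, pct in regional_share.items()
}

for region, count in approx_voters.items():
    print(f"{region}: ~{count} voters")
```

The smaller regions each correspond to only about 20 to 30 voters, so regional comparisons involving them should be read with caution.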
Fig. 3: Popular Data Types Analyzed by region, 2017
Some observations:
- US data analysts use tabular data the most;
- text usage is similar across regions;
- Asian data analysts use much less anonymized data, but slightly lead in image/video data analysis.