KDnuggets Home » Polls » Data Types Analyzed / Mined (Sep 2008)

Types of Data Analyzed/Mined in the past 12 months:


 
  

Types of Data Analyzed/Mined in the past 12 months: [108 voters]

table data, fixed # of columns (84) 77.8%
time series (40) 37.0%
text, free-form (39) 36.1%
itemsets/transactions (27) 25.0%
anonymized data (19) 17.6%
spatial data (2D, 3D) (14) 13.0%
web content (11) 10.2%
social network-type data (9) 8.3%
web clickstream (6) 5.6%
images/video (6) 5.6%
email (6) 5.6%
XML data (6) 5.6%
other (2) 1.9%
music / audio (1) 0.9%


Kirk Simmons, data being mined
I have induced decision tree ensembles using numerical descriptors of chemistry and the biological activity associated with each record. These classifiers can be used essentially for in silico "drug" discovery, by mining large chemical databases for potentially active molecules.

Gregory Piatetsky-Shapiro, Data Type Trends

Compared with the 2007 KDnuggets Poll: Data Types Analyzed/Mined, we see overall stability (the top 5 data types are the same in both years).
Data types analyzed/mined, 2008 vs 2007



There is a slight increase in interest (as measured by % of voters) in
  • itemsets/transactions, from 22% to 25%
  • text (free-form), from 34% to 36%
Several categories saw a significant drop in interest:
  • music / audio, from 4% to 1%
  • web clickstream, from 12% to 6%
  • email, from 12% to 6%
  • XML data, from 11% to 6%
  • web content, from 14% to 10%
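
As a rough illustration of the year-over-year comparison, the sketch below computes the change in percentage points and the relative change for each category listed above. It uses only the rounded 2007 and 2008 percentages quoted in the text. The relative change is an added measure for illustration, not one the original comparison reports, and the names `trend` and `changes` are hypothetical.

```python
# (2007 %, 2008 %) pairs as quoted in the text above (rounded values).
trend = {
    "itemsets/transactions": (22, 25),
    "text (free-form)": (34, 36),
    "music / audio": (4, 1),
    "web clickstream": (12, 6),
    "email": (12, 6),
    "XML data": (11, 6),
    "web content": (14, 10),
}

# For each category: (change in percentage points, relative change in %).
changes = {
    name: (p2008 - p2007, round(100 * (p2008 - p2007) / p2007, 1))
    for name, (p2007, p2008) in trend.items()
}

for name, (points, relative) in changes.items():
    print(f"{name:<24} {points:+3d} points  ({relative:+.1f}% relative)")
```

With roughly a hundred voters, a shift of a few percentage points corresponds to only a handful of votes, so the smaller changes should be read with caution.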
