Poll: Types of Data Analyzed/Mined in the past 12 months [108 voters]

| Data type (votes) | % of voters |
|---|---|
| table data, fixed # of columns (84) | 77.8% |
| time series (40) | 37.0% |
| text, free-form (39) | 36.1% |
| itemsets/transactions (27) | 25.0% |
| anonymized data (19) | 17.6% |
| spatial data (2D, 3D) (14) | 13.0% |
| web content (11) | 10.2% |
| social network-type data (9) | 8.3% |
| web clickstream (6) | 5.6% |
| images/video (6) | 5.6% |
| email (6) | 5.6% |
| XML data (6) | 5.6% |
| other (2) | 1.9% |
| music / audio (1) | 0.9% |
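As a quick check on the table, the percentages can be recomputed from the vote counts and the 108 voters reported above. This is only a minimal sketch: the dictionary and the function name `share_of_voters` are made up for illustration, and the one-decimal rounding is an assumption about how the poll figures were produced.

```python
# Vote counts copied from the poll table; 108 is the stated number of voters.
TOTAL_VOTERS = 108

votes = {
    "table data, fixed # of columns": 84,
    "time series": 40,
    "text, free-form": 39,
    "itemsets/transactions": 27,
    "anonymized data": 19,
    "spatial data (2D, 3D)": 14,
    "web content": 11,
    "social network-type data": 9,
    "web clickstream": 6,
    "images/video": 6,
    "email": 6,
    "XML data": 6,
    "other": 2,
    "music / audio": 1,
}


def share_of_voters(count, total=TOTAL_VOTERS):
    """Percentage of voters who picked a data type, rounded to one decimal (assumed)."""
    return round(100.0 * count / total, 1)


shares = {name: share_of_voters(count) for name, count in votes.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The counts add up to 270 selections from 108 voters, so voters evidently could pick more than one data type, and the percentages are not expected to sum to 100%.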
Kirk Simmons, data being mined
I have induced decision tree ensembles using numerical descriptors of
chemistry and the biological activity associated with each record.
These classifiers can be used, essentially, for in silico "drug" discovery
by mining large chemical databases for potentially active molecules.
Gregory Piatetsky-Shapiro, Data Type Trends
Comparing with the
2007 KDnuggets Poll: Data Types Analyzed Mined,
we see overall stability (the top 5 data types are the same in both years).
There is a slight increase in interest (as measured by % of voters) in
- itemsets/transactions, from 22% to 25%
- text (free-form), from 34% to 36%
Several categories saw a significant drop in interest:
- music / audio, from 4% to 1%
- web clickstream, from 12% to 6%
- email, from 12% to 6%
- XML data, from 11% to 6%
- web content, from 14% to 10%