Kirk Simmons, data being mined
I have induced decision tree ensembles from numerical descriptors of chemical structure and the biological activity associated with each record. These classifiers can be used, essentially, for in silico "drug" discovery by mining large chemical databases for potentially active molecules.
Gregory Piatetsky-Shapiro, Data Type Trends
Comparing with the 2007 KDnuggets Poll: Data Types Analyzed/Mined, we see overall stability (the top 5 data types are the same in both years).
There is a slight increase in interest (as measured by % of voters) in
- itemsets/transactions, from 22% to 25%
- text (free-form), from 34% to 36%
and a decrease in interest in
- music / audio, from 4% to 1%
- web clickstream, from 12% to 6%
- email, from 12% to 6%
- XML data, from 11% to 6%
- web content, from 14% to 10%