KDnuggets Home » Polls » Data Mining Applications - Industries (June 2006)

Data Mining Applications - Industries

Industries/fields where you applied data mining in the past 12 months [111 voters, 278 votes total]

CRM (43) 39.1%
Fraud Detection (24) 21.8%
Direct Marketing/ Fundraising (22) 20.0%
Credit Scoring (21) 19.1%
Biotech/Genomics (17) 15.5%
Web content mining/Search (15) 13.6%
Other (15) 13.6%
Telecom (14) 12.7%
Web usage mining (12) 10.9%
Science (12) 10.9%
Insurance (12) 10.9%
Retail (11) 10.0%
Investment / Stocks (11) 10.0%
Medical/ Pharma (8) 7.3%
Manufacturing (7) 6.4%
Government/Military (7) 6.4%
e-commerce (6) 5.5%
Travel/Hospitality (5) 4.5%
Security / Anti-terrorism (5) 4.5%
Health care/ HR (5) 4.5%
Junk email / Anti-spam (2) 1.8%
Entertainment/ Music (2) 1.8%
Banking (1) 0.9%

Note: percentages are relative to the number of voters.
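
As a rough illustration of that note, the sketch below recomputes a few shares from the counts in the table. One assumption is flagged: the listed percentages are reproduced by a base of 110 voters rather than the 111 given in the heading, so the code uses 110 as `ASSUMED_VOTERS` and shows both. The names `VOTES`, `ASSUMED_VOTERS` and `share` are illustrative only.

```python
# Counts copied from the poll table (a subset, for brevity).
VOTES = {"CRM": 43, "Fraud Detection": 24, "Banking": 1}

# Assumption: 110 is the base that reproduces the listed percentages;
# the heading itself states 111 voters.
ASSUMED_VOTERS = 110
STATED_VOTERS = 111


def share(votes, voters):
    """Percentage of voters who selected an option, to one decimal place."""
    return round(100 * votes / voters, 1)


for industry, votes in VOTES.items():
    print(
        f"{industry}: {share(votes, ASSUMED_VOTERS)}% of {ASSUMED_VOTERS} voters, "
        f"{share(votes, STATED_VOTERS)}% of {STATED_VOTERS} voters"
    )
```

Because each voter could select several industries, shares computed this way add up to well over 100%.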

Oren Etzioni, Farecast.com
On June 27th, Farecast announced the Public Beta launch of Farecast.com, the first and only airfare prediction site on the Web. Their predictions are based on data-mining methods originally developed at the University of Washington by Prof. Oren Etzioni in collaboration with his student Alex Yates as well as Dr. Craig Knoblock and Rattapoom Tuchinda of USC.

Gunnar Blix, Financials / Lending
Financials / Lending is absent from the list. There are a number of interesting applications in that area, including default and prepayment risk, pipeline conversion, etc. Some applications may fall under CRM and marketing, but certainly not all.

Karl Brazier, Other Fields
I recently worked on a study to try out some data mining in a social policy studies application.

See http://www.ccp.uea.ac.uk/publications.asp, paper 06-1, for a social science paper, or
http://www.actapress.com/Content_Of_Proceeding.aspx?ProceedingID=303, paper 468-089, for a bit more explanation of the DM.
(Sorry, no free download for this one. If you're really keen, e-mail me at karl.brazier2(a)norwich-union.co.uk and I'll see what I can do.)
I think this is a potentially rich and currently under-exploited field for DM research. It has generated a lot of data from questionnaire surveys, often not targeted at answering a specific question, and continues to do so. The data are usually a complex mixture of categorical, numerical and free-text fields, and the number of fields is often high. There may also be a need to induce models on different versions of the outcome variable, because the best definition for it is not universally agreed.

What needs to be done to open up this field, it seems, is to overcome resistance from its strongly classical statistical culture, which is rather sceptical of an approach that searches hypothesis spaces instead of following the traditional propose-then-test route. But I think DM is so well suited to the material, both the data and the problems posed, that this resistance needs to be challenged.
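
To make the idea of searching a hypothesis space over mixed survey data a little more concrete, here is a minimal sketch. It is not the method used in the papers cited above. It is a simplified search in the spirit of the 1R (one-rule) approach, run once for each of two alternative outcome definitions. The survey records, the field names, the 60%-of-median income threshold and the "afford" keyword are all invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical toy survey records, invented purely for illustration.
SURVEY = [
    {"tenure": "rent", "age": 23, "comment": "cannot afford heating", "income": 9000, "reports_hardship": True},
    {"tenure": "rent", "age": 35, "comment": "rent keeps rising", "income": 12000, "reports_hardship": True},
    {"tenure": "own", "age": 52, "comment": "no complaints", "income": 30000, "reports_hardship": False},
    {"tenure": "own", "age": 61, "comment": "cannot afford repairs", "income": 28000, "reports_hardship": True},
    {"tenure": "rent", "age": 29, "comment": "fine at the moment", "income": 26000, "reports_hardship": False},
    {"tenure": "own", "age": 44, "comment": "happy with the area", "income": 40000, "reports_hardship": False},
    {"tenure": "rent", "age": 67, "comment": "struggle to afford food", "income": 10000, "reports_hardship": True},
    {"tenure": "own", "age": 38, "comment": "", "income": 35000, "reports_hardship": False},
]


def features(rec):
    """Reduce the mixed field types to categorical attributes."""
    return {
        "tenure": rec["tenure"],  # already categorical
        "age_band": "under 40" if rec["age"] < 40 else "40 plus",  # numerical, binned
        "mentions_afford": "yes" if "afford" in rec["comment"].lower() else "no",  # free text
    }


def one_rule(records, outcome):
    """Try every single-attribute rule and return (attribute, rule, n_correct)
    for the one that fits the given records best."""
    best = None
    for attr in features(records[0]):
        # For each attribute value, count how often each outcome occurs.
        tallies = defaultdict(Counter)
        for rec in records:
            tallies[features(rec)[attr]][outcome(rec)] += 1
        # The rule predicts the majority outcome for each attribute value.
        rule = {value: counts.most_common(1)[0][0] for value, counts in tallies.items()}
        correct = sum(counts.most_common(1)[0][1] for counts in tallies.values())
        if best is None or correct > best[2]:
            best = (attr, rule, correct)
    return best


# Two competing definitions of the outcome variable (both assumptions).
POVERTY_LINE = 0.6 * median(rec["income"] for rec in SURVEY)
OUTCOMES = {
    "low_income": lambda rec: rec["income"] < POVERTY_LINE,
    "reports_hardship": lambda rec: rec["reports_hardship"],
}

for name, outcome in OUTCOMES.items():
    attr, rule, correct = one_rule(SURVEY, outcome)
    print(f"{name}: best attribute = {attr}, {correct}/{len(SURVEY)} correct, rule = {rule}")
```

On this toy data the two outcome definitions pick different attributes (`tenure` for `low_income`, `mentions_afford` for `reports_hardship`), which is the practical reason for inducing a model per definition. The counts are fits to the same records used to build the rules, so they say nothing about how the rules would generalise.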
