KDnuggets : News : 2001 : n25 : item7

News


From: Adam Lynton
Date: Thu, 29 Nov 2001 09:49:27 +1000
Subject: A. Lynton: Response to Statistics is the Road from Data Mining to Knowledge Discovery

Good data miners, a.k.a. KDD practitioners, are not ignorant of statistics, nor does the KDD community foster ignorance of statistics. There is a selection of good papers on statistics in the Data Mining and Knowledge Discovery journal, Advances in Knowledge Discovery and Data Mining, the Explorations newsletter and the KDD conference proceedings. The assertion that "most data miners remain ignorant of statistics" must refer to people outside the KDD community who are unfamiliar with its literature. There really is no "us and them": statistics is used in the process of knowledge discovery and data mining, as are machine learning and database theory, and all three are equally important. KDD practitioners either have or seek to have a foundation in all of these disciplines.

In my experience, some older statisticians are afraid of change. As you said yourself, you have "spent forty years as a statistician", no doubt using the tried and true foundational statistical methods (which, yes, are still very important). However, things are changing: we have parallel processing systems, complex data types such as image and sound, the internet and associated web mining and, yes, advanced statistical techniques like neural networks, boosting, and support vector machines. KDD practitioners seek to understand the basics of all of these and to be adaptive, capable of crossing over into other disciplines where necessary or relevant.

"...following artificial intelligence and expert systems through the typical computer technology stages of hype, then hope, and finally has-been". This is an astonishing statement. You obviously do not interact with any automated decision or diagnosis systems, or with speech recognition or handwriting recognition systems. Ever been in a factory production line that uses robots? I guess you don't play strategy games either, because then you would have heard of "Deep Thought" or "TD-Gammon". Something closer to home: as a statistician, you probably didn't have a data-driven inductive learning algorithm for true non-linear regression, a.k.a. neural networks, in the late '80s and early '90s. Statisticians had mostly ignored Paul Werbos's early work at that time. You have AI practitioners (and relevant engineers) to thank for that as well.

In conclusion, don't think of "data miners" and "statisticians"; think of the KDD practitioner. The KDD practitioner is eclectic and embraces techniques and methods from statistics, machine learning and database theory. Most KDD practitioners already know this and practice it; it doesn't take an 'Interface' conference to remind them.

Thanks,

Adam Lynton
adam@morrisint.com.au
Manager, Knowledge Discovery and Data Mining
Morris International
QLD, Australia



Copyright © 2001 KDnuggets.