From: Gregory Piatetsky-Shapiro
Date: 27 Aug 2007
Subject: Poll Results: Is "Data Mining" Tainted ?

The previous KDnuggets Poll asked: With data mining recently in the news again in connection with a controversial government scheme, and frequently associated with privacy invasion, the term "data mining" may be becoming "tainted" by association.

Do you agree, and if so, what other terms do you prefer to use to describe the field?

About 40% thought that the term "data mining" is still OK to use.

One big reason, pointed out by Rob Cooley, is that "data mining" is a 20-25 times more popular term than "knowledge discovery", as measured by the number of web documents found. As of Aug 27, 2007, Googling "data mining" finds 37M documents, whereas "knowledge discovery" finds 1.7M documents. Searching Yahoo gives a similar ratio (43M vs 2M).
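For readers who want to check the "20-25 times" figure, here is a minimal sketch that simply divides the hit counts quoted above. The dictionary layout and the helper name `popularity_ratio` are illustrative assumptions; the counts themselves are the rounded figures from this post, not fresh search results.

```python
# Hit counts quoted in the text (27 Aug 2007), in millions of documents.
# These are the rounded figures from the post, not live search results.
hits = {
    "Google": {"data mining": 37.0, "knowledge discovery": 1.7},
    "Yahoo": {"data mining": 43.0, "knowledge discovery": 2.0},
}


def popularity_ratio(counts):
    """Hypothetical helper: how many times more hits 'data mining' gets."""
    return counts["data mining"] / counts["knowledge discovery"]


ratios = {engine: popularity_ratio(counts) for engine, counts in hits.items()}

for engine, ratio in ratios.items():
    print(f"{engine}: about {ratio:.1f} times more documents")
```

Both ratios land between 21 and 22, which is consistent with the 20-25 range cited.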

Several readers asked about the US/non-US difference in perception of "Data Mining". There was a difference, but not as strong as I thought:

About 35% of US respondents said that "Data Mining is still OK to use", compared to 44% of non-US respondents.

26% overall preferred "Knowledge Discovery".

The most significant difference between US and non-US respondents was that 29% of US respondents preferred "Predictive analytics", compared to 14% of non-US respondents.

Other suggested terms included:

  • "predictive modeling" (however, data mining is broader than predictive modeling; e.g., clustering and anomaly detection are not predictive modeling)
  • and even "UNperceivable EXploring"

Here are the full results for the 2007 KDnuggets Poll: Is "Data Mining" tainted?

