KDnuggets : News : 2006 : n24 : item3


Subject: Effective Counterterrorism and the Limited Role of Predictive Data Mining

CATO Institute, Dec 11, 2006. by Jeff Jonas and Jim Harper

The terrorist attacks on September 11, 2001, spurred extraordinary efforts intended to protect America from the newly highlighted scourge of international terrorism. Among the efforts was the consideration and possible use of "data mining" as a way to discover planning and preparation for terrorism. Data mining is the process of searching data for previously unknown patterns and using those patterns to predict future outcomes.


Though data mining has many valuable uses, it is not well suited to the terrorist discovery problem. It would be unfortunate if data mining for terrorism discovery had currency within national security, law enforcement, and technology circles because pursuing this use of data mining would waste taxpayer dollars, needlessly infringe on privacy and civil liberties, and misdirect the valuable time and energy of the men and women in the national security community.


Jeff Jonas is distinguished engineer and chief scientist with IBM's Entity Analytic Solutions Group. Jim Harper is director of information policy studies at the Cato Institute and author of Identity Crisis: How Identification Is Overused and Misunderstood.

Read more about the paper.

Jeff Jonas writes more about this paper in his blog.

One of the big challenges we faced in getting this paper drafted was dealing with all of the confusion related to data mining. It turns out that the definition of data mining depends on whom you ask.

The key point of our paper is that the form of data mining which uses historical incident data to determine a pattern ... then using this pattern to predict a future event is not helpful in the terrorism context because there isn't enough historical data to derive a meaningful and statistically reliable pattern. Thus, we settled on the term "predictive data mining" to differentiate what we were characterizing as ineffective from many other effective uses.

This paper also highlights a real governmental need to efficiently locate, access, and aggregate information about specific suspects. To highlight this point we show that starting with two primary suspects, available data points and existing laws, a good number of the 9/11 terrorists could have been identified in a very narrow investigative fashion before September 11th.

Make no mistake about it: though data mining has many valuable uses, from reducing corporate direct marketing costs to classifying celestial objects and even medical research, it just so happens that it is not so helpful for discovering the underlying patterns of low-incidence terrorism.
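
As a rough illustration of the sample-size argument above (this sketch is not from the paper, and the numbers are purely hypothetical), consider how uncertain an estimated pattern frequency is when it is derived from only a handful of historical incidents. The function names below are made up for this example, and the normal approximation it uses is itself crude at very small sample sizes.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent observations."""
    return math.sqrt(p * (1.0 - p) / n)


def margin_95(p: float, n: int) -> float:
    """Approximate 95% margin of error (normal approximation) for the estimate."""
    return 1.96 * proportion_standard_error(p, n)


# Hypothetical scenario: some trait shows up in half of the observed incidents.
observed_share = 0.5

# Hypothetical sample sizes: a handful of incidents versus a large
# commercial data set such as a direct marketing history.
for label, n in [("few incidents", 10), ("large data set", 10_000)]:
    margin = margin_95(observed_share, n)
    print(f"{label}: n={n}, estimate={observed_share:.2f} +/- {margin:.4f}")
```

With ten incidents the margin of error is roughly thirty percentage points, which is too wide to support a predictive pattern. With ten thousand observations it falls to about one point.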

Copyright © 2006 KDnuggets.