Data Mining (Analytic) Tools


Data mining/analytic tools you used in 2006: [561 voters]

CART/MARS/TreeNet/RF 159 (72 alone)
SPSS Clementine 127 (47 alone, 46 with SPSS)
SPSS 100 (5 alone, 46 with Clementine)
Excel 100 (3 alone)
KXEN 90 (75 alone)
your own code 77 (1 alone)
SAS 72 (3 alone, 13 with E-Miner)
Weka 62 (7 alone)
R 53 (5 alone)
MATLAB 41 (5 alone)
other free tools 39 (3 alone)
SAS E-Miner 37 (9 alone, 13 with SAS)
SQL Server 32 (3 alone)
other commercial tools 31 (7 alone)
Oracle Data Mining 20 (13 alone)
Insightful Miner/S-Plus 20 (0 alone)
C4.5/C5.0/See5 18 (0 alone)
Megaputer 16 (14 alone)
Statsoft Statistica 13 (2 alone)
Angoss 8 (2 alone)
Mineset (PurpleInsight) 5 (3 alone)
Eudaptics Viscovery 5 (3 alone)
Xelopes 4 (4 alone)
Visumap 3 (2 alone)
IBM I-miner 3 (0 alone)
Equbits 3 (3 alone)


Editor: This poll always generates controversy, because vendors naturally encourage their customers and employees to vote, and some vendors were much more energetic than others. To help reduce bias, obvious duplicate votes were removed from the poll results. In addition, the results show how many voters selected a particular choice alone. For example, of the 100 people who reported using Excel, only 3 chose Excel alone, without any other tools. While recognizing that this poll is unscientific, we hope that it is useful.
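The "alone" counts can be turned into percentages to make the comparison easier to read. The following is a minimal sketch in Python, not part of the original poll: the counts are copied from a few rows of the table above, while the names `POLL`, `TOTAL_VOTERS`, and `summarize` are made up for illustration. Reading a high "alone" share as a sign of vendor-driven voting is only one possible interpretation, not something the poll establishes.

```python
# Counts copied from the poll table: tool -> (total votes, votes "alone").
# Only a subset of rows is included to keep the sketch short.
TOTAL_VOTERS = 561

POLL = {
    "CART/MARS/TreeNet/RF": (159, 72),
    "SPSS Clementine": (127, 47),
    "Excel": (100, 3),
    "KXEN": (90, 75),
    "Weka": (62, 7),
    "R": (53, 5),
}


def summarize(poll, total_voters):
    """Return one row per tool, sorted by the share of 'alone' votes (descending)."""
    rows = []
    for tool, (votes, alone) in poll.items():
        rows.append({
            "tool": tool,
            # Share of all voters who selected this tool.
            "pct_of_voters": 100.0 * votes / total_voters,
            # Share of this tool's voters who selected nothing else.
            "pct_alone": 100.0 * alone / votes,
        })
    return sorted(rows, key=lambda r: r["pct_alone"], reverse=True)


SUMMARY = summarize(POLL, TOTAL_VOTERS)

if __name__ == "__main__":
    for row in SUMMARY:
        print(f"{row['tool']:<22} "
              f"{row['pct_of_voters']:5.1f}% of voters, "
              f"{row['pct_alone']:5.1f}% alone")
```

In this subset, the Excel row reproduces the editor's example: 3 of 100 voters, or 3%, chose it alone.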

Will Dwinnell, Biased results
As always, the results of this particular poll are obviously biased. I have already been lobbied for my vote by one of the usual suspects. Perhaps it would help to run this survey "blind", and not reveal the outcome until the end?

Larry, Tiberius
Tiberius is a free neural net data mining tool that works well and is simple for novices.

Ralf Klinkenberg, Download statistics: open-source data mining tools
Some download statistics for open-source data mining tools (as provided by SourceForge and not by the developers):
WEKA downloads by day: 370-1200 (last 7 days),
WEKA downloads by month: 17,000-26,000 (last 6 months),

YALE downloads by day: 100-260 (last 7 days),
YALE downloads by month: 2700-7000 (last 6 months).

Does anyone have download numbers for Orange (http://www.ailab.si/orange) or for the R package or any other major open-source data mining tool?
Does anyone have sales figures for some of the commercial tools?
OK, I know you can't compare download figures and sales figures, because they require a different degree of commitment and involve different costs, but nonetheless this would be interesting.

Ralf Klinkenberg, Free open-source data mining tool YALE
I would have liked to also see the freely available open-source data mining tool YALE (Yet Another Learning Environment) on the list: yale.sf.net/
Since 2001 it has gained a broad user base, as shown by the download statistics provided by SourceForge.net (www.sf.net).

Marko Robnik-Sikonja, Free tools
I suggest Orange, a component-based tool written in C++ and Python with a nice interface,

