KDnuggets : Polls : Data Mining Software (May 2008)
Poll: Data Mining Software
What data mining tools have you used for a real project (not just for evaluation) in the past 6 months? [347 voters]

For tools with 20 votes or more, we split the results into two bars: "alone" votes, where the tool was the only one selected (narrow bar), and votes where the tool was selected as one among several (wide bar). Tools are listed in descending order of total number of votes.
Commercial Software
SPSS Clementine (74, 53 alone or with SPSS)
Salford CART, MARS, TreeNet, RF (72, 34 alone)
SPSS (68, 38 alone or with Clementine)
Excel (61, 1 alone)
SAS (55, 6 alone or with SAS EM)
KXEN (32, 25 alone)
SAS Enterprise Miner (24, 6 alone or with SAS)
MATLAB (22, 1 alone)
SQL Server (20, 2 alone)
Other commercial tools (12)
Angoss (8)
Oracle DM (7)
Statsoft Statistica (5)
Insightful Miner/S-Plus (5)
Viscovery (4)
Tiberius (2)
Miner3D (1)
Megaputer (1)
FairIsaac Model Builder (1)
Bayesia (1)
Your own code (50, 3 alone)
Free/Open Source Data Mining Software
RapidMiner (72, 49 alone)
R (39, 4 alone)
Weka (36, 4 alone)
KNIME (30, 14 alone)
Other free tools (18)
C4.5/C5.0 (8)
Orange (3)
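
As a rough illustration of how the "alone" split can be read, the sketch below turns a few of the counts listed above into percentages. The tool names, vote counts, and the 347-voter total are copied from this page; the `summarize` helper and its field names are hypothetical, introduced only for this example. It assumes voters could select several tools, so the shares of voters are not expected to sum to 100%.

```python
TOTAL_VOTERS = 347  # from the poll header

# (total votes, votes where the tool was the only one selected),
# copied from the lists above for tools reported with a plain "alone" count
votes = {
    "RapidMiner": (72, 49),
    "Salford CART, MARS, TreeNet, RF": (72, 34),
    "Excel": (61, 1),
    "R": (39, 4),
    "Weka": (36, 4),
    "KXEN": (32, 25),
    "KNIME": (30, 14),
}


def summarize(total, alone, voters=TOTAL_VOTERS):
    """Hypothetical helper: derive simple shares from one tool's counts."""
    return {
        # votes where the tool was selected together with other tools
        "with_others": total - alone,
        # share of all poll voters who selected this tool
        "pct_of_voters": round(100 * total / voters, 1),
        # share of this tool's votes that were "alone" votes
        "pct_alone": round(100 * alone / total, 1),
    }


for tool, (total, alone) in votes.items():
    s = summarize(total, alone)
    print(f"{tool}: {s['pct_of_voters']}% of voters, "
          f"{s['pct_alone']}% of its votes were 'alone' votes")
```

A high "alone" share (as for RapidMiner or KXEN) means most of a tool's voters selected nothing else, while a low one (as for Excel or R) means the tool was mostly reported alongside others.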

For comparison, here are the results from the 2007 KDnuggets Poll on Data Mining Software Popularity.

This year there were stronger poll verification measures, which eliminated some (but probably not all) of the bias due to over-enthusiastic voting by some vendors and users.

Among voters from the US, the top choice was Salford.

Comments

Tim Manns, Real-world scalability
I don't have much exposure to all of these tools (my experience is limited to the major commercial tools), so I am not certain whether many of them are scalable.

In my role as a data miner for a telco, I process many millions of rows with hundreds of columns, and sample down to many thousands when building predictive models. An easy-to-use UI tool that integrates with the data warehouse is a must, and for this reason I consider in-database mining and connectivity with the data warehouse a necessary requirement for a true data mining tool, assuming your definition of data mining is (roughly) "to process large amounts of data and identify useful actionable information".

I realise the world of data mining is varied, and db connectivity or scalable performance is not always a consideration in these tools. You should too :)

Janaki Gopalan, DM tools
I have been using WEKA and I will be starting to use SPSS next month. In this poll, it looks like WEKA is a relatively popular freeware tool (6% of users, after RapidMiner (12%) and SPSS (9%)). If you have used industry-specific tools, can you tell me what their advantage is over freeware?



Copyright © 2008 KDnuggets.