KDnuggets Home » Polls » Data Mining / Analytic Tools Used Poll (May 2010)

Data Mining / Analytic Tools Used Poll

Which data mining/analytic tools did you use in the past 12 months for a real project (not just evaluation)? [912 voters]
RapidMiner (345) 37.8%
R (272) 29.8%
Excel (222) 24.3%
KNIME (175) 19.2%
Your own code (168) 18.4%
Pentaho/Weka (131) 14.3%
SAS (110) 12.0%
MATLAB (84) 9.2%
IBM SPSS Statistics (72) 7.9%
Other free tools (67) 7.3%
IBM SPSS Modeler (former Clementine) (67) 7.3%
Microsoft SQL Server (63) 6.9%
Statsoft Statistica (57) 6.2%
Other commercial tools (56) 6.1%
SAS Enterprise Miner (50) 5.5%
Zementis (34) 3.7%
Orange (25) 2.7%
Oracle DM (19) 2.1%
KXEN (19) 2.1%
Salford CART/MARS/other (15) 1.6%
VisuaLinks (12) 1.3%
Viscovery (10) 1.1%
Angoss (8) 0.9%
TIBCO Insightful Miner (7) 0.8%
Miner3D (7) 0.8%
REvolution Computing (4) 0.4%
Megaputer Polyanalyst/TextAnalyst (3) 0.3%
Portrait Software (2) 0.2%
Data Applied (2) 0.2%
Centrifuge (2) 0.2%
PRSD Studio (1) 0.1%
Clario Analytics (1) 0.1%
Bayesia (1) 0.1%

Notes
Even after duplicate votes were deleted, this poll had a record participation of over 900 data miners, about 2.5 times more than the 2009 KDnuggets Poll on Data Mining Tools Used.

A direct comparison of votes may not be representative, because different verification strategies were used in 2009 and 2010, but clearly in 2010 KDnuggets visitors were big users of open-source tools. The leading open-source tools were RapidMiner, R, and KNIME.

Among commercial tools, the top tools were SAS, MATLAB, and IBM SPSS Modeler (former Clementine).

Comparing the relative share of tools in 2009 and 2010, the biggest gainers among commercial tools with at least 1% share were

  • Statsoft Statistica: 184% increase* (from 8 users, 2.2% share in 2009 to 57 users, 6.3% share in 2010)
  • Viscovery: 100% increase (0.55% to 1.1%)
  • Microsoft SQL Server: 68% increase (4.1% to 6.9%)
  • Excel: 30% increase (18.7% to 24.3%)
The biggest losers among commercial tools with at least 1% share were
  • IBM SPSS Modeler (former Clementine): 78% decline (from 120 users, 33.0% share to 67 users, 7.3% share)
  • KXEN: 76% decline (8.5% to 2.1%)
  • SAS Enterprise Miner: 70% decline (18.4% to 5.5%)
(Editor: *Although many companies tried to get their users to vote, Statistica had an especially active campaign in 2010, which explains a large part of its increase. Likewise, part of the decrease for some tools in 2010 may be due to the lack of such a campaign in 2010 compared with 2009.)

All open source tools grew in share, but the biggest growth was for

  • KNIME: 288% increase (4.9% to 19.2%)
  • R: 113% increase (14.0% to 29.8%)
  • Orange: 100% increase (1.4% to 2.7%)
  • RapidMiner: 79% increase (21.2% to 37.8%)
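
The increases and declines quoted in these notes appear to be relative changes in share, that is, the 2010 share minus the 2009 share, divided by the 2009 share. A minimal Python sketch of that arithmetic follows, using the rounded shares quoted above. The function name `relative_change` is made up for illustration, and the formula is an assumption inferred from the numbers, not something the poll states. Because the inputs here are rounded, a result can differ by a few points from the published figure (KNIME comes out near 292% rather than 288%), presumably because the published figures were computed from unrounded shares.

```python
def relative_change(share_2009: float, share_2010: float) -> float:
    """Percent change in share from 2009 to 2010, relative to the 2009 share.

    Assumed formula (not stated in the poll): (new - old) / old * 100.
    """
    return (share_2010 - share_2009) / share_2009 * 100.0


# Rounded shares (percent of voters) as quoted in the notes: (2009, 2010).
shares = {
    "KNIME": (4.9, 19.2),
    "R": (14.0, 29.8),
    "Microsoft SQL Server": (4.1, 6.9),
    "IBM SPSS Modeler": (33.0, 7.3),
}

for tool, (s2009, s2010) in shares.items():
    # A positive value is a gain in share, a negative value a decline.
    print(f"{tool}: {relative_change(s2009, s2010):+.0f}%")
```

Note that this measures change in share of voters, not change in the number of users: with about 2.5 times as many voters in 2010, a tool can gain users and still lose share.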
Analysis of top countries by tool type (below) shows that the US has the largest number of commercial-only DM software users, while German (DE) KDnuggets readers mostly use open-source tools (RapidMiner, KNIME).

Data Mining Software type by Country

