KDnuggets Home » Polls » Data Mining/Analytic Tools Used (May 2011)

Data Mining/Analytic Tools Used

This poll had a record number of participants (over 1,100), with 43% using only commercial software, 32% only free software, 25% both. The average number of tools per user was 2.2.
RapidMiner, R, and Excel are again the most popular tools, with SAS remaining the top commercial tool.

The regional breakdown was:

  • W. Europe, 36.8%
  • US/Canada, 34.7%
  • E. Europe, 9.8%
  • Asia, 6.1%
  • Australia/New Zealand, 4.5%
  • Latin America, 4.2%
  • Africa/MidEast, 4.0%

Which data mining/analytic tools you used in the past 12 months for a real project (not just evaluation) [1103 voters]
Tool (votes)                             % users in 2011   % users in 2010
RapidMiner (305)                         27.7%             37.8%
R (257)                                  23.3%             29.8%
Excel (240)                              21.8%             24.3%
SAS (150)                                13.6%             12.1%
Your own code (134)                      12.1%             18.4%
KNIME (134)                              12.1%             19.2%
Weka (Pentaho) (130)                     11.8%             14.4%
Salford (117)                            10.6%             1.6%
Statistica (94)                          8.5%              6.3%
IBM SPSS Modeler (91)                    8.3%              7.3%
MATLAB (79)                              7.2%              9.2%
IBM SPSS Statistics (79)                 7.2%              7.9%
SAS Enterprise Miner (78)                7.1%              5.5%
JMP (63)                                 5.7%
11 Ants Analytics (62)                   5.6%
Microsoft SQL Server (54)                4.9%              6.9%
Other free software (45)                 4.1%              7.3%
Zementis (41)                            3.7%              3.7%
Other commercial software (35)           3.2%              6.1%
Tableau (29)                             2.6%
C4.5/C5.0/See5 (21)                      1.9%
TIBCO Spotfire / S+ / Miner (19)         1.7%              0.8%
Hadoop Map/Reduce (19)                   1.7%
Mathematica (18)                         1.6%
Revolution Computing (15)                1.4%              0.4%
KXEN (15)                                1.4%              2.1%
Orange (14)                              1.3%              2.7%
Miner3D (14)                             1.3%              0.8%
XLSTAT (10)                              0.9%
NoSQL databases (10)                     0.9%
Stata (9)                                0.8%
Other cloud-based tools (9)              0.8%
Bayesia (9)                              0.8%              0.1%
Angoss (9)                               0.8%              0.9%
Oracle Data Miner (8)                    0.7%              2.1%
Predixion (6)                            0.5%
WordStat (5)                             0.5%
Megaputer Polyanalyst/TextAnalyst (4)    0.4%              0.3%
Portrait Software (3)                    0.3%              0.2%
Grapheur (3)                             0.3%
Clarabridge (3)                          0.3%
Centrifuge (3)                           0.3%              0.2%
Viscovery (1)                            0.1%              1.1%
Data Applied (1)                         0.1%              0.2%
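
The 2011 percentages above appear to be each tool's vote count divided by the 1,103 voters, and the average of 2.2 tools per user appears to be the total number of votes divided by the same figure. The Python sketch below checks that reading against the published counts. The names `VOTERS`, `votes`, and `share` are illustrative assumptions, not part of the poll.

```python
# Vote counts copied from the poll results; VOTERS is the stated number of voters.
VOTERS = 1103

votes = {
    "RapidMiner": 305, "R": 257, "Excel": 240, "SAS": 150,
    "Your own code": 134, "KNIME": 134, "Weka (Pentaho)": 130,
    "Salford": 117, "Statistica": 94, "IBM SPSS Modeler": 91,
    "MATLAB": 79, "IBM SPSS Statistics": 79, "SAS Enterprise Miner": 78,
    "JMP": 63, "11 Ants Analytics": 62, "Microsoft SQL Server": 54,
    "Other free software": 45, "Zementis": 41,
    "Other commercial software": 35, "Tableau": 29, "C4.5/C5.0/See5": 21,
    "TIBCO Spotfire / S+ / Miner": 19, "Hadoop Map/Reduce": 19,
    "Mathematica": 18, "Revolution Computing": 15, "KXEN": 15,
    "Orange": 14, "Miner3D": 14, "XLSTAT": 10, "NoSQL databases": 10,
    "Stata": 9, "Other cloud-based tools": 9, "Bayesia": 9, "Angoss": 9,
    "Oracle Data Miner": 8, "Predixion": 6, "WordStat": 5,
    "Megaputer Polyanalyst/TextAnalyst": 4, "Portrait Software": 3,
    "Grapheur": 3, "Clarabridge": 3, "Centrifuge": 3,
    "Viscovery": 1, "Data Applied": 1,
}


def share(tool: str) -> float:
    """Percentage of voters who reported using `tool`."""
    return 100 * votes[tool] / VOTERS


# Voters could pick several tools, so total votes exceed the number of voters.
total_votes = sum(votes.values())
avg_tools = total_votes / VOTERS

print(f"RapidMiner: {share('RapidMiner'):.1f}%")   # 27.7%
print(f"Average tools per user: {avg_tools:.1f}")  # 2.2
```

Because each voter could select more than one tool, the percentages sum to well over 100%.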

The following table shows the breakdown by region and tool type (commercial only, free only, or both).

Region                  % users who only use   % users who only use   % users who
                        commercial tools       free tools             use both
W. Europe               22%                    53%                    25%
US/Canada               62%                    13%                    26%
E. Europe               52%                    26%                    22%
Asia                    43%                    36%                    21%
Australia/New Zealand   62%                    22%                    16%
Latin America           24%                    37%                    39%
Africa/MidEast          39%                    34%                    27%
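
As a rough consistency check, the regional splits can be weighted by each region's share of voters and compared with the overall figures reported at the top of the poll (43% commercial only, 32% free only, 25% both). The Python sketch below does this. It assumes the published, already rounded percentages are a fair stand-in for the raw counts, so the result should only agree to within about a percentage point. The names `regions` and `overall` are illustrative.

```python
# Region -> (share of all voters in %, (commercial only %, free only %, both %)),
# copied from the poll text. All inputs are rounded, so the result is approximate.
regions = {
    "W. Europe": (36.8, (22, 53, 25)),
    "US/Canada": (34.7, (62, 13, 26)),
    "E. Europe": (9.8, (52, 26, 22)),
    "Asia": (6.1, (43, 36, 21)),
    "Australia/New Zealand": (4.5, (62, 22, 16)),
    "Latin America": (4.2, (24, 37, 39)),
    "Africa/MidEast": (4.0, (39, 34, 27)),
}

# Normalise by the actual sum of weights, since the rounded shares do not add to exactly 100.
total_weight = sum(weight for weight, _ in regions.values())

# Weighted average of each column across regions.
overall = [
    sum(weight * split[i] for weight, split in regions.values()) / total_weight
    for i in range(3)
]

for label, value in zip(("commercial only", "free only", "both"), overall):
    print(f"{label}: {value:.1f}%")
```

The weighted averages land close to the reported overall split; small differences are expected from rounding in the published figures.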

Comments
Trevor Kemmer, Open Source Data Mining Solutions
Like last year, the open source data mining solutions RapidMiner and R again attracted the most users. Another powerful open source data mining solution I would like to see in next year's poll is RapidAnalytics.
Since IBM SPSS Modeler (Clementine) and IBM SPSS Statistics are both listed, and since SAS (Base) and SAS Enterprise Miner are both listed, I think it would be good to list both RapidMiner and RapidAnalytics.

Karl Rexer, Missing tools
Some tools that appeared in the Rexer 2010 survey were missing from this poll:

  • IBM Cognos (used by 6% of data miners in 2010)
  • Minitab (9%)
  • FICO Fair Isaac (3%)
  • SAP Business Objects/NetWeaver (4%)
  • Teradata Warehouse Miner (2%)
  • Unica Predictive Insight (2%)

Andrew, Minitab is not a Data Mining Tool
Good for QC, not so much for true data mining/predictive analytics.

Josh Hemann, Languages listed
Given that languages are being listed (R, SAS), as well as the category "Your own code", I suggest breaking this out further into the languages commonly used when writing custom analytical applications, e.g. R, SAS, Python, Java, C, etc.

