KDnuggets Home » Polls » Analytics/Data mining software used? (May 2013)

What Analytics, Big Data, Data mining, Data Science software you used in the past 12 months for a real project?


 
  
The 14th annual KDnuggets Software Poll attracted record participation of 1880 voters, more than doubling 2012 numbers.

For full analysis and comments, see:
KDnuggets 2013 Software Poll: RapidMiner and R vie for first place.

This year's poll was notable for the battle between RapidMiner and R for first place. RapidMiner was very successful in motivating its users and got the most votes.

The distinction between commercial and open-source is becoming less clear, since tools like RapidMiner, KNIME, and R also have commercial versions. Many RapidMiner users were apparently confused by this distinction, since there were more votes (almost 500) for the commercial version of RapidMiner than there were actual licenses (according to Rapid-I CEO Ingo Mierswa). We dealt with this by treating votes that included both the commercial and free versions of RapidMiner as votes for the free version. This still left 225 RapidMiner users who used only the commercial version.
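The adjustment described above can be sketched in Python. This is a minimal illustration, not the code used for the poll: the ballot representation (one set of tool names per voter), the function name `normalize_ballot`, and the labels `RM_FREE` and `RM_COMMERCIAL` are all assumptions made for the example, and the ballots are made up.

```python
# Hypothetical labels for the two RapidMiner poll options (assumed names).
RM_FREE = "RapidMiner free edition"
RM_COMMERCIAL = "RapidMiner Commercial Edition"


def normalize_ballot(tools):
    """Return a copy of one voter's tool set with the rule applied.

    If a ballot lists both RapidMiner editions, the commercial entry is
    dropped so that the ballot counts as a vote for the free edition only.
    """
    tools = set(tools)
    if RM_FREE in tools and RM_COMMERCIAL in tools:
        tools.discard(RM_COMMERCIAL)
    return tools


# Made-up ballots, for illustration only.
ballots = [
    {RM_FREE, RM_COMMERCIAL, "R"},  # both editions: counted as free
    {RM_COMMERCIAL},                # commercial only: unchanged
    {RM_FREE, "Excel"},             # free only: unchanged
]
normalized = [normalize_ballot(b) for b in ballots]

# Ballots that still count toward the commercial edition after the rule.
commercial_votes = sum(RM_COMMERCIAL in b for b in normalized)
```

Under this rule, only ballots that list the commercial edition without the free one still count toward the commercial total.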

We found an interesting and stable balance between commercial and free software: 29% of voters used commercial software but not free software (vs 28% in 2012); a very similar share, 30%, used free software but not commercial (same as in 2012); and 41% used both (same as in 2012).
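A split like this one could be computed from per-voter ballots roughly as follows. This is only a sketch under assumptions: the ballot format, the function names `usage_group` and `group_shares`, and the sample data are hypothetical, and which tools count as commercial must be supplied by the caller.

```python
from collections import Counter


def usage_group(tools, commercial_tools):
    """Classify one ballot as 'commercial only', 'free only', or 'both'."""
    if not tools:
        raise ValueError("a ballot must list at least one tool")
    uses_commercial = any(t in commercial_tools for t in tools)
    uses_free = any(t not in commercial_tools for t in tools)
    if uses_commercial and uses_free:
        return "both"
    return "commercial only" if uses_commercial else "free only"


def group_shares(ballots, commercial_tools):
    """Return the percentage of ballots that fall into each usage group."""
    counts = Counter(usage_group(b, commercial_tools) for b in ballots)
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


# Made-up example: four ballots, with SAS treated as the only commercial tool.
example_ballots = [{"SAS"}, {"R"}, {"R", "SAS"}, {"R", "Weka"}]
shares = group_shares(example_ballots, commercial_tools={"SAS"})
```

Because every ballot lands in exactly one group, the three shares sum to 100% (up to rounding), as 29% + 30% + 41% do in the poll.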

The average number of tools used was 3.0.

Only 14% of voters reported using big data tools, compared with 15% in 2012 (and 3% in 2011).

This suggests that real Big Data work remains confined to a select group of web giants, government agencies, and similar very large enterprises, while most data analysis is done on "medium" and small data. The recent KDnuggets Poll on Largest DB Analyzed supports this conclusion.

The following table shows results of the poll.
% alone is the percentage of a tool's voters who used only that tool. For example, only 5.6% of voters who used Weka used only Weka, while 43% of Predixion Software users used that tool alone.
For a few tools like QlikView that were not included in the 2012 poll, there are no 2012 numbers.
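The two percentages reported for each tool can be illustrated with a short Python sketch. The function name `tool_stats`, the ballot format (one set of tool names per voter), and the sample ballots are assumptions for illustration, not the poll's actual processing code.

```python
def tool_stats(ballots, tool):
    """Return (votes, % of all voters, % of the tool's users who used it alone)."""
    users = [b for b in ballots if tool in b]
    alone = [b for b in users if len(b) == 1]
    votes = len(users)
    share = 100 * votes / len(ballots)
    pct_alone = 100 * len(alone) / votes if votes else 0.0
    return votes, round(share, 1), round(pct_alone, 1)


# Made-up ballots: three of four voters used Weka, one of them used only Weka.
sample_ballots = [{"Weka"}, {"Weka", "R"}, {"R"}, {"Weka", "Excel"}]
weka_votes, weka_share, weka_alone = tool_stats(sample_ballots, "Weka")

# Check against published figures: 737 votes out of 1880 voters.
rapidminer_share = round(100 * 737 / 1880, 1)
```

Note the different denominators: the share of users is taken over all 1880 voters, while % alone is taken only over the voters who selected that tool.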

What Analytics, Big Data, Data mining, Data Science software you used in the past 12 months for a real project? [1880 voters]
Legend: Red: Free/Open Source tools; Green: Commercial tools

| Tool | Votes | % alone | % users in 2013 | % users in 2012 |
|---|---|---|---|---|
| Rapid-I RapidMiner/RapidAnalytics free edition | 737 | 30.9% | 39.2% | 26.7% |
| R | 704 | 6.5% | 37.4% | 30.7% |
| Excel | 527 | 0.9% | 28.0% | 29.8% |
| Weka / Pentaho | 269 | 5.6% | 14.3% | 14.8% |
| Python with any of numpy/scipy/pandas/iPython... packages | 250 | 0% | 13.3% | 14.9% |
| Rapid-I RapidAnalytics/RapidMiner Commercial Edition | 225 | 52.4% | 12.0% | n/a |
| SAS | 202 | 2.0% | 10.7% | 12.7% |
| MATLAB | 186 | 1.6% | 9.9% | 10.0% |
| StatSoft Statistica | 170 | 45.9% | 9.0% | 14.0% |
| IBM SPSS Statistics | 164 | 1.8% | 8.7% | 7.8% |
| Microsoft SQL Server | 131 | 1.5% | 7.0% | 5.0% |
| Tableau | 118 | 0% | 6.3% | 4.4% |
| IBM SPSS Modeler | 114 | 6.1% | 6.1% | 6.8% |
| KNIME free edition | 110 | 1.8% | 5.9% | 21.8% |
| SAS Enterprise Miner | 110 | 0% | 5.9% | 5.8% |
| Rattle | 84 | 0% | 4.5% | n/a |
| JMP | 77 | 7.8% | 4.1% | 4.0% |
| Orange | 67 | 13.4% | 3.6% | 5.3% |
| Other free analytics/data mining software | 64 | 3.1% | 3.4% | 4.9% |
| Gnu Octave | 54 | 0% | 2.9% | n/a |
| Revolution Analytics R Enterprise | 53 | 1.9% | 2.8% | 1.4% |
| Predixion Software | 51 | 43.1% | 2.7% | 0.4% |
| KNIME Professional | 46 | 4.3% | 2.4% | n/a |
| Revolution Analytics R free edition | 46 | 2.2% | 2.4% | n/a |
| IBM Cognos | 45 | 2.2% | 2.4% | 2.0% |
| Other commercial analytics/data mining/data science software | 45 | 0% | 2.4% | 4.0% |
| QlikView | 45 | 2.2% | 2.4% | n/a |
| Salford SPM/CART/MARS/TreeNet/RF | 42 | 26.2% | 2.2% | 1.1% |
| Mathematica | 39 | 0% | 2.1% | 2.9% |
| Stata | 39 | 2.6% | 2.1% | 1.9% |
| KXEN | 35 | 54.3% | 1.9% | 1.8% |
| Miner3D | 34 | 41.2% | 1.8% | 2.4% |
| SAP (including BusinessObjects/Sybase/Hana) | 27 | 3.7% | 1.4% | 0.9% |
| TIBCO Spotfire / S+ / Miner | 26 | 3.8% | 1.4% | 4.6% |
| C4.5/C5.0/See5 | 21 | 0% | 1.1% | 1.6% |
| Bayesia | 19 | 15.8% | 1.0% | 1.8% |
| Oracle Data Miner | 19 | 5.3% | 1.0% | 4.4% |
| Zementis | 17 | 41.2% | 0.9% | 1.8% |
| XLSTAT | 16 | 0% | 0.9% | 0.9% |
| F# | 14 | 14.3% | 0.7% | 0.6% |
| RapidInsight/Veera | 9 | 0% | 0.5% | 0.6% |
| Teradata Miner | 9 | 0% | 0.5% | 0.5% |
| Lavastorm | 8 | 25.0% | 0.4% | n/a |
| WordStat | 7 | 0% | 0.4% | 0.4% |
| Angoss | 6 | 16.7% | 0.3% | 0.9% |
| 11 Ants Analytics | 5 | 0% | 0.3% | 0.5% |
| Alteryx | 5 | 0% | 0.3% | n/a |
| Megaputer Polyanalyst/TextAnalyst | 2 | 0% | 0.1% | n/a |



