
R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results


R remains the leading tool, with a 49% share, but Python is growing faster and has almost caught up with R. RapidMiner remains the most popular general Data Science platform. Big Data tools are used by almost 40% of respondents, and Deep Learning usage has doubled.



Full Results and 3-year trends

The following table shows the poll results in detail, excluding Deep Learning tools, for which 3-year results are not available.


% alone is the percentage of a tool's voters who used only that tool; it is shown only for tools where at least 5% of voters did so. For example, 11.7% of RapidMiner users used only RapidMiner.
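The "% alone" figure described above can be sketched in a few lines of Python. This is a minimal illustration, not the poll's actual processing: the function name `percent_alone` and the `sample_responses` data are hypothetical, and each response is assumed to be the set of tools one voter selected.

```python
from collections import Counter


def percent_alone(responses):
    """For each tool, return the percentage of its voters who chose only that tool.

    `responses` is an iterable of sets, one set of tool names per voter.
    """
    users = Counter()  # voters who selected the tool at all
    alone = Counter()  # voters who selected the tool and nothing else
    for tools in responses:
        for tool in tools:
            users[tool] += 1
        if len(tools) == 1:
            alone[next(iter(tools))] += 1
    return {tool: round(100.0 * alone[tool] / users[tool], 1) for tool in users}


# Hypothetical toy data, NOT taken from the poll.
sample_responses = [
    {"R", "Python"},
    {"RapidMiner"},
    {"RapidMiner", "R"},
    {"Python"},
]

for tool, pct in sorted(percent_alone(sample_responses).items()):
    print(f"{tool}: {pct}% alone")
```

In this toy data, one of the two RapidMiner voters picked nothing else, so its "% alone" comes out at half of its voters, while R is never used alone.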

What Analytics, Big Data, Data mining, Data Science software you used in the past 12 months for a real project? [2895 voters]

Legend: red: Free/Open Source tools; green: Commercial tools; fuchsia: Hadoop/Big Data tools

| Tool (votes) | % alone | % users in 2016 | % users in 2015 | % users in 2014 |
|---|---|---|---|---|
| R (1419) | | 49.0% | 46.9% | 38.5% |
| Python (1325) | | 45.8% | 30.3% | 19.5% |
| SQL (1029) | | 35.5% | 30.9% | 25.3% |
| Excel (972) | | 33.6% | 22.9% | 25.8% |
| RapidMiner (944) | 11.7% | 32.6% | 31.5% | 44.2% |
| Hadoop (641) | | 22.1% | 18.4% | 12.7% |
| Spark (624) | | 21.6% | 11.3% | 2.6% |
| Tableau (536) | | 18.5% | 12.4% | 9.1% |
| KNIME (521) | | 18.0% | 20.0% | 15.0% |
| scikit-learn (497) | | 17.2% | 8.3% | na |
| Java (487) | | 16.8% | 14.1% | na |
| Anaconda (462) | | 16.0% | na | na |
| Hive (359) | | 12.4% | 10.2% | na |
| MLlib (337) | | 11.6% | 3.3% | 1.0% |
| Weka (315) | | 10.9% | 11.2% | 17.0% |
| Microsoft SQL Server (314) | | 10.8% | 9.7% | 10.5% |
| Unix shell/awk/gawk (301) | | 10.4% | 8.0% | 5.8% |
| MATLAB (263) | | 9.1% | 8.8% | 8.4% |
| IBM SPSS Statistics (242) | | 8.4% | 7.7% | 7.7% |
| Dataiku (227) | 18.1% | 7.8% | 2.0% | na |
| SAS base (225) | | 7.8% | 11.3% | 10.9% |
| IBM SPSS Modeler (222) | | 7.7% | 7.1% | 5.7% |
| SQL on Hadoop tools (211) | | 7.3% | 7.2% | na |
| C/C++ (210) | | 7.3% | 9.4% | na |
| Other free analytics/data mining tools (198) | | 6.8% | 5.0% | 5.1% |
| Other programming and data languages (197) | | 6.8% | 5.1% | 3.0% |
| H2O (193) | | 6.7% | 2.0% | 0.2% |
| Scala (180) | | 6.2% | 3.5% | na |
| SAS Enterprise Miner (162) | | 5.6% | 10.9% | 7.2% |
| Microsoft Power BI (161) | | 5.6% | 3.6% | na |
| HBase (158) | | 5.5% | 4.6% | na |
| QlikView (153) | | 5.3% | 4.2% | 3.0% |
| Microsoft Azure Machine Learning (147) | | 5.1% | 3.7% | na |
| Other Hadoop/HDFS-based tools (141) | | 4.9% | 4.5% | 3.9% |
| Apache Pig (132) | | 4.6% | 5.4% | 3.5% |
| IBM Watson (121) | | 4.2% | 2.1% | na |
| Rattle (103) | | 3.6% | 4.2% | 4.9% |
| Salford SPM/CART/Random Forests/MARS/TreeNet (100) | 63.0% | 3.5% | 2.3% | 3.6% |
| Gnu Octave (89) | | 3.1% | 2.3% | 3.9% |
| Orange (89) | | 3.1% | 1.9% | 3.4% |
| Alteryx (87) | | 3.0% | 5.6% | 3.1% |
| RapidInsight/Veera (87) | 51.7% | 3.0% | 0.2% | 0.5% |
| TIBCO Spotfire (80) | | 2.8% | 4.3% | 2.8% |
| Apache Mahout (74) | | 2.6% | 2.8% | 2.5% |
| Other paid analytics/data mining/data science software (71) | | 2.5% | 2.4% | 1.9% |
| Dato (69) | | 2.4% | 0.5% | 0.9% |
| Pentaho (68) | | 2.3% | 2.7% | 2.6% |
| Perl (67) | | 2.3% | 2.9% | 3.0% |
| IBM Cognos (64) | | 2.2% | 1.8% | 1.8% |
| Splunk/Hunk (63) | | 2.2% | 1.1% | 0.7% |
| JMP (58) | | 2.0% | 3.1% | 3.8% |
| C4.5/C5.0/See5 (58) | | 2.0% | 1.3% | 1.5% |
| Amazon Machine Learning (55) | | 1.9% | 0.7% | na |
| Mathematica (53) | | 1.8% | 1.9% | 2.3% |
| Microsoft other ML/Data Science tools (46) | | 1.6% | na | na |
| Vowpal Wabbit (45) | | 1.6% | 1.3% | na |
| Microstrategy (45) | | 1.6% | 0.9% | na |
| SAP Analytics (42) | | 1.5% | 3.0% | 6.8% |
| Stata (39) | | 1.3% | 1.3% | 1.4% |
| Dell/StatSoft (36) | 8.3% | 1.2% | 1.7% | 1.7% |
| XLMiner (35) | | 1.2% | na | na |
| SAP HANA (35) | | 1.2% | na | na |
| Julia (32) | | 1.1% | 1.1% | 0.8% |
| Oracle Adv. Analytics (31) | | 1.1% | 0.8% | 2.2% |
| BigML (25) | 16.0% | 0.9% | 0.8% | 0.9% |
| Zementis (25) | | 0.9% | 0.8% | 0.8% |
| BayesiaLab (18) | | 0.6% | 0.6% | 4.1% |
| Alpine Data Labs (16) | 12.5% | 0.6% | 0.5% | 2.7% |
| DataRobot (15) | 6.7% | 0.5% | na | na |
| Datameer (13) | 7.7% | 0.4% | 0.9% | 0.8% |
| Lavastorm (12) | | 0.4% | 0.4% | 0.3% |
| F# (11) | | 0.4% | 0.7% | 0.5% |
| Clojure (11) | | 0.4% | 0.5% | 0.5% |
| Actian (10) | | 0.3% | 2.0% | 0.5% |
| WordStat (10) | | 0.3% | 0.3% | 0.2% |
| Ayasdi (9) | | 0.3% | 2.0% | na |
| Skytree (8) | | 0.3% | 0.1% | na |
| Lisp (7) | | 0.2% | 0.4% | 0.3% |
| Ontotext GraphDB (6) | | 0.2% | 0.0% | na |
| SiSense (5) | | 0.2% | 0.2% | 0.1% |
| Birst (5) | | 0.2% | 0.1% | na |
| FICO Model Builder (5) | | 0.2% | 0.0% | 0.2% |
| WPS World Programming System (4) | | 0.1% | 0.3% | 0.2% |
| Angoss (3) | | 0.1% | 0.4% | 0.4% |
| Predixion Software (2) | | 0.1% | 0.4% | 3.7% |
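As a rough check on how the percentages in the table relate to the vote counts, the sketch below recomputes the 2016 share for a few tools from the 2895 voters and compares it with the 2015 share. The vote counts and 2015 shares are copied from the table; the helper names `share` and `relative_change` are illustrative assumptions, not part of the poll's methodology.

```python
TOTAL_VOTERS = 2895  # total voters in the 2016 poll, as stated with the table

# Vote counts (2016) and share of users (2015) copied from the table.
votes_2016 = {"R": 1419, "Python": 1325, "Spark": 624}
share_2015 = {"R": 46.9, "Python": 30.3, "Spark": 11.3}


def share(votes, total=TOTAL_VOTERS):
    """Percentage of all voters who selected a tool, rounded to one decimal."""
    return round(100.0 * votes / total, 1)


def relative_change(new, old):
    """Relative change from `old` to `new`, in percent of `old`."""
    return round(100.0 * (new - old) / old, 1)


for tool, votes in votes_2016.items():
    current = share(votes)
    change = relative_change(current, share_2015[tool])
    print(f"{tool}: {current}% in 2016, {change:+}% relative to 2015")
```

Because voters could select several tools, the shares are each computed against the same total and do not sum to 100%.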


Additional tools not included in the poll but mentioned in the comments include:
  • XLSTAT  
  • BeyondCore
  • Timi and Anatella
  • SAS/STAT  
  • Domino Data Lab
  • MapR
  • Neural Designer
  • JavaScript
Here are the results of past polls