Among voters, 28% used commercial software but not free software, 30% used free software but not commercial software, and 41% used both.
The usage of big data tools grew five-fold: 15% used them in 2012, vs about 3% in 2011.
R, Excel, and RapidMiner are the most popular tools, with StatSoft Statistica becoming the most popular commercial tool, getting more votes than SAS (in part due to a more active campaign from StatSoft users and the lack of such a campaign from SAS).
Among those who wrote analytics code in lower-level languages, R, SQL, Java, and Python were most popular.
This poll also had a very large number of participants and used email verification and other measures to remove unnatural votes (*see note below).
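The free/commercial split quoted in the summary above can be combined into overall usage shares. The following is a minimal sketch that uses only the three percentages given there; the variable names are illustrative. The source does not say whether the remaining 1% reflects rounding or voters who fall into none of the three groups.

```python
# Percentages of voters, as quoted in the summary (rounded in the source).
commercial_only = 28
free_only = 30
both = 41

# Anyone who used a commercial tool at all: commercial-only plus both.
any_commercial = commercial_only + both

# Anyone who used a free tool at all: free-only plus both.
any_free = free_only + both

# Share of voters covered by the three disjoint groups.
accounted = commercial_only + free_only + both

print(f"Used any commercial tool: {any_commercial}%")
print(f"Used any free tool:       {any_free}%")
print(f"Covered by the 3 groups:  {accounted}%")
```

On these figures, roughly 69% of voters used at least one commercial tool and roughly 71% used at least one free tool.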
What Analytics, Data mining, Big Data software you used in the past 12 months for a real project (not just evaluation) [798 voters]

[Bar chart per tool: 2012 vs. 2011 usage; legend: Free/Open Source tools, Commercial tools]

| Tool | Votes |
|---|---|
| R | 245 |
| Excel | 238 |
| Rapid-I RapidMiner | 213 |
| KNIME | 174 |
| Weka / Pentaho | 118 |
| StatSoft Statistica | 112 |
| SAS | 101 |
| Rapid-I RapidAnalytics (not asked in 2011) | 83 |
| MATLAB | 80 |
| IBM SPSS Statistics | 62 |
| IBM SPSS Modeler | 54 |
| SAS Enterprise Miner | 46 |
| Orange | 42 |
| Microsoft SQL Server | 40 |
| Other free analytics/data mining software | 39 |
| TIBCO Spotfire / S+ / Miner | 37 |
| Oracle Data Miner | 35 |
| Tableau | 35 |
| JMP | 32 |
| Other commercial analytics/data mining software | 32 |
| Mathematica | 23 |
| Miner3D | 19 |
| IBM Cognos (not asked in 2011) | 16 |
| Stata | 15 |
| Bayesia | 14 |
| KXEN | 14 |
| Zementis | 14 |
| C4.5/C5.0/See5 | 13 |
| Revolution Computing | 11 |
| Salford SPM/CART/MARS/TreeNet/RF | 9 |
| Angoss | 7 |
| SAP (including BusinessObjects/Sybase/Hana) (not asked in 2011) | 7 |
| XLSTAT | 7 |
| RapidInsight/Veera (not asked in 2011) | 5 |
| 11 Ants Analytics | 4 |
| Teradata Miner (not asked in 2011) | 4 |
| Predixion Software | 3 |
| WordStat | 3 |
Among tools with at least 10 users, the tools with the highest increase in "usage percent" were
- Oracle Data Miner, 4.4% in 2012, up from 0.7% in 2011, 505% increase
- Orange, 5.3% from 1.3%, 315% increase
- TIBCO Spotfire / S+ / Miner, 4.6% from 1.7%, 169% increase
- Stata, 1.9% from 0.8%, 130% increase
- Bayesia, 1.8% from 0.8%, 115% increase
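The 2012 "usage percent" values in this list can be reproduced from the vote counts in the table and the 798 voters. The following is a minimal Python sketch, not the poll's own calculation. The function names are illustrative, and the 2011 shares are the rounded values quoted in the list, so the computed increases only approximate the reported ones (which were presumably derived from unrounded 2011 figures).

```python
TOTAL_VOTERS_2012 = 798  # voters remaining after vote cleaning

# 2012 vote counts, taken from the results table.
votes_2012 = {
    "Oracle Data Miner": 35,
    "Orange": 42,
    "TIBCO Spotfire / S+ / Miner": 37,
    "Stata": 15,
    "Bayesia": 14,
}

# 2011 usage percent, as quoted (rounded) in the list above.
share_2011 = {
    "Oracle Data Miner": 0.7,
    "Orange": 1.3,
    "TIBCO Spotfire / S+ / Miner": 1.7,
    "Stata": 0.8,
    "Bayesia": 0.8,
}


def usage_percent(votes, total=TOTAL_VOTERS_2012):
    """Share of voters who selected a tool, in percent."""
    return 100.0 * votes / total


def relative_increase(new_pct, old_pct):
    """Relative change from old_pct to new_pct, in percent."""
    return 100.0 * (new_pct - old_pct) / old_pct


for tool, votes in votes_2012.items():
    pct_2012 = usage_percent(votes)
    growth = relative_increase(pct_2012, share_2011[tool])
    print(f"{tool}: {pct_2012:.1f}% in 2012, approx. {growth:.0f}% increase")
```

For Oracle Data Miner, 35 of 798 votes gives 4.4%, matching the list. The increase computed from the rounded 0.7% comes out at about 527% rather than the reported 505%, which illustrates how sensitive the ratio is to rounding of small 2011 shares.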
The three tools with the highest decrease in usage percent were 11 Ants Analytics, Salford SPM/CART/MARS/TreeNet/RF, and Zementis. Their dramatic decrease is probably due to vendors doing much less (or nothing) to encourage their users to vote in 2012 as compared to 2011.
Note: 3 tools received fewer than 3 votes and were not included in this table: Clarabridge, Megaputer Polyanalyst/TextAnalyst, Grapheur/LIONsolver.
Big Data
Use of big data tools grew five-fold, from about 3% to about 15% of respondents.

Big Data software you used in the past 12 months

| Tool | Votes |
|---|---|
| Apache Hadoop/Hbase/Pig/Hive | 67 |
| Amazon Web Services (AWS) | 36 |
| NoSQL databases | 33 |
| Other Big Data/Cloud analytics software | 21 |
| Other Hadoop-based tools | 10 |
We also asked about the popularity of individual languages for data mining. Note that R is included both in this table and among the higher-level tools.

Your own code you used for analytics/data mining in the past 12 months in:

| Language | Votes |
|---|---|
| R | 245 |
| SQL | 185 |
| Java | 138 |
| Python | 119 |
| C/C++ | 66 |
| Other languages | 57 |
| Perl | 37 |
| Awk/Gawk/Shell | 31 |
| F# | 5 |
For comparison, here are the recent software polls:
- KDnuggets 2011 Poll: Data Mining/Analytic Tools Used
- KDnuggets 2010 Poll: Data Mining / Analytic Tools Used
- KDnuggets 2009 Poll: Data Mining Tools Used.
Vote cleaning: To reduce multiple voting, this poll used email verification, which reduced the total number of votes compared to 2011 but made the results more representative.
Furthermore, some vendors were much more active than others in recruiting their users. To give a more objective picture of tool popularity, a large number (over 100) of the "unnatural" votes were removed, leaving 798 votes. The decline in popularity of some tools, such as Salford and 11 Ants Analytics, is probably due to these vendors being less active in 2012 than in 2011 in asking their users to vote.