KDnuggets Home » Polls » Languages for analytics/data mining (Aug 2013)

Languages for analytics / data mining / data science


 
  
What programming/statistics languages did you use for analytics / data mining / data science work in 2013? [713 votes total]

Language (voters in 2013)                           % users in 2013   % users in 2012
R (434)                                             60.9%             52.5%
Python (277)                                        38.8%             36.1%
SQL (261)                                           36.6%             32.1%
SAS (148)                                           20.8%             19.7%
Java (118)                                          16.5%             21.2%
MATLAB (89)                                         12.5%             13.1%
High-level data mining suite (80)                   11.2%             not asked in 2012
Unix shell/awk/sed (79)                             11.1%             14.7%
C/C++ (66)                                          9.3%              14.3%
Pig Latin/Hive/other Hadoop-based languages (57)    8.0%              6.7%
Other low-level language (42)                       5.9%              11.4%
GNU Octave (40)                                     5.6%              5.9%
Perl (32)                                           4.5%              9.0%
Ruby (16)                                           2.2%              3.8%
Scala (16)                                          2.2%              2.4%
F# (12)                                             1.7%              not asked in 2012
Lisp/Clojure (7)                                    1.0%              4.3%
Julia (5)                                           0.7%              0.3%
None (2)                                            0.3%              0.7%
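
The 2013 percentages above appear to be each language's voter count divided by the 713 total votes. A minimal sketch of that arithmetic, using a few rows from the table (the names `TOTAL_VOTES`, `votes`, and `share` are illustrative, not part of the poll):

```python
# Total number of voters reported for the 2013 poll.
TOTAL_VOTES = 713

# Voter counts for a few languages, copied from the table above.
votes = {"R": 434, "Python": 277, "SQL": 261, "SAS": 148, "Java": 118}


def share(count, total=TOTAL_VOTES):
    """Return the percentage of voters who picked a language, to one decimal."""
    return round(100.0 * count / total, 1)


shares = {lang: share(n) for lang, n in votes.items()}

for lang, pct in shares.items():
    print(f"{lang}: {pct}%")
```

The shares add up to more than 100%, which suggests that voters could select several languages.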

Comments
A number of comments, such as the one below, pointed out that SPSS also has its own language, similar to SAS. It will be included in the next poll.

Ralph Winters, SPSS Language
It seems odd to exclude SPSS based upon a definition of what is or what is not a language, especially for a language which has such legacy roots and is backed by IBM. I could argue that neither Matlab nor R is a true programming language, and SAS, as flexible as it is, I would not consider a standardized programming language either.

Compared with a similar KDnuggets Poll in Aug 2012:
What programming/statistics languages you used for analytics / data mining in the past 12 months?

the language with the highest growth was Julia, which more than doubled in share of usage, from 0.3% in 2012 to 0.7% in 2013 (still a very small share of voters).

Among more common languages, the largest relative increases in share of usage were for

  • Pig Latin/Hive/other Hadoop-based languages, 19% growth, from 6.7% in 2012 to 8.0% in 2013
  • R, 16% growth
  • SQL, 14% growth (perhaps the result of increasing number of SQL interfaces to Hadoop and other Big Data systems?)
The languages with the largest declines in share of usage were
  • Lisp/Clojure, 77% down
  • Perl, 50% down
  • Ruby, 41% down
  • C/C++, 35% down
  • Unix shell/awk/sed, 25% down
  • Java, 22% down
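
The growth and decline figures above are relative changes in share, not differences in percentage points. A small sketch of that calculation, assuming the usual formula (new - old) / old and the rounded table values as inputs (the function name `relative_change` and the dictionary `shares_2012_2013` are illustrative):

```python
def relative_change(old_pct, new_pct):
    """Relative change in share of usage, in percent of the old share."""
    return 100.0 * (new_pct - old_pct) / old_pct


# (share in 2012, share in 2013), taken from the results table.
shares_2012_2013 = {
    "Pig Latin/Hive/other Hadoop-based languages": (6.7, 8.0),
    "R": (52.5, 60.9),
    "SQL": (32.1, 36.6),
    "Lisp/Clojure": (4.3, 1.0),
    "Perl": (9.0, 4.5),
}

changes = {
    lang: round(relative_change(old, new))
    for lang, (old, new) in shares_2012_2013.items()
}

for lang, pct in changes.items():
    print(f"{lang}: {pct:+d}%")
```

Because the table values are already rounded to one decimal, recomputing from them can differ from the quoted figures by about one point for some languages (for example Ruby or Unix shell/awk/sed).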
Regional participation was
  • US/Canada: 50.8%
  • Europe: 25.7%
  • Asia: 11.8%
  • Latin America: 6.7%
  • AU/NZ: 3.2%
  • Africa/Middle East: 1.5%
