R (used by 45%), SQL (32%), and Python (25%) are the top languages for data mining/data analysis. A typical data miner uses 2 languages.
The previous KDnuggets Poll asked:
What programming languages you used for data mining / data analysis in the past 12 months?
Here are the results, based on 570 voters:
| Language (votes) | Share of voters |
| --- | --- |
| R (257) | 45% |
| SQL (184) | 32% |
| Python (140) | 25% |
| Java (139) | 24% |
| SAS (121) | 21% |
| MATLAB (83) | 15% |
| C/C++ (73) | 13% |
| Unix shell/awk/gawk/sed (59) | 10% |
| Perl (45) | 7.9% |
| Hadoop/Pig/Hive (35) | 6.1% |
| Lisp (4) | 0.7% |
| Other (70) | 12.0% |
| None (7) | 1.2% |
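For readers who want to check the arithmetic, here is a minimal sketch of how the percentages follow from the vote counts and the 570 voters. The names `votes`, `share`, and `TOTAL_VOTERS` are illustrative, and only the top five languages are included. Because voters could select more than one language, the shares add up to more than 100%.

```python
# Vote counts for the top five languages, copied from the poll results.
TOTAL_VOTERS = 570
votes = {
    "R": 257,
    "SQL": 184,
    "Python": 140,
    "Java": 139,
    "SAS": 121,
}


def share(count, total=TOTAL_VOTERS):
    """Return the percentage of voters who selected a language."""
    return 100.0 * count / total


# Percentage of voters per language.
shares = {lang: share(count) for lang, count in votes.items()}

# Print the languages from most to least used.
for lang, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang:<8}{pct:5.1f}%")
```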
Notes
Among users of each of the top 5 languages, only about 15-25% used that language alone.
- 43.2% of voters used 1 language
- 24.9% used 2 languages
- 17.4% used 3 languages
- 8.4% used 4 languages
- 6.1% used 5 or more languages
On average, data miners used 2.1 languages.
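The average can be roughly reproduced from the distribution above as a weighted mean. This is only a sketch: it assumes the "5 or more" bucket counts as exactly 5 languages, so the result is a lower bound, and the names `distribution` and `mean_languages` are illustrative.

```python
# Percentage of voters by number of languages used, from the list above.
# Assumption: the "5 or more" bucket is treated as exactly 5 languages.
distribution = {1: 43.2, 2: 24.9, 3: 17.4, 4: 8.4, 5: 6.1}


def mean_languages(dist):
    """Weighted mean of the number of languages, weighted by voter share."""
    total_pct = sum(dist.values())
    weighted = sum(n_langs * pct for n_langs, pct in dist.items())
    return weighted / total_pct


avg = mean_languages(distribution)
print(f"{avg:.1f}")  # prints 2.1
```

Under this assumption the weighted mean is about 2.09, which rounds to the reported 2.1.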
The breakdown by region is:
- US/Canada, 42%
- Europe, 30%
- Asia, 16%
- Latin America, 4.9%
- AU/NZ, 2.8%
- Africa/MidEast, 3.0%