Top Algorithms and Methods Used by Data Scientists
The latest KDnuggets poll identifies the top algorithms actually used by Data Scientists and finds some surprises, including the most academic and most industry-oriented algorithms.
 US/Canada, 40%
 Europe, 32%
 Asia, 18%
 Latin America, 5.0%
 Africa/Middle East, 3.4%
 Australia/NZ, 2.2%
IG affinity(Alg) = [ N(Alg,Ind_Gov) / N(Alg,Aca_Stu) ] / [ N(Ind_Gov) / N(Aca_Stu) ] - 1
Thus an algorithm with affinity 0 is used equally in Industry/Government and by academic researchers or students. The higher the IG affinity, the more "industrial" the algorithm; the lower it is, the more "academic" the algorithm.
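As an illustration, the affinity formula can be sketched in a few lines of Python. This assumes N(Alg,Ind_Gov) and N(Alg,Aca_Stu) are the numbers of Industry/Government and Academic/Student respondents who used the algorithm, and N(Ind_Gov) and N(Aca_Stu) are the total respondents in each group. The function name and all counts below are hypothetical and are not taken from the poll data.

```python
def ig_affinity(n_alg_ind_gov, n_alg_aca_stu, n_ind_gov, n_aca_stu):
    """Industry/Government affinity of one algorithm.

    Ratio of industry to academic users of the algorithm, divided by the
    ratio of industry to academic respondents overall, minus 1.
    """
    usage_ratio = n_alg_ind_gov / n_alg_aca_stu
    respondent_ratio = n_ind_gov / n_aca_stu
    return usage_ratio / respondent_ratio - 1


# Hypothetical counts: 600 industry and 300 academic respondents in total.
balanced = ig_affinity(60, 30, 600, 300)    # used in proportion to group sizes
industrial = ig_affinity(90, 15, 600, 300)  # over-represented in industry
academic = ig_affinity(30, 30, 600, 300)    # over-represented in academia

print(balanced, industrial, academic)
```

With these made-up counts, the first algorithm has affinity 0, the second a positive affinity, and the third a negative one, matching the interpretation given above.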
The most "industrial" algorithms were:
 Uplift modeling, 2.01
 Anomaly Detection, 1.61
 Survival Analysis, 1.39
 Factor Analysis, 0.83
 Time series/Sequences, 0.69
 Association Rules, 0.5
The most academic algorithms were:
 Neural networks - regular, -0.35
 Naive Bayes, -0.35
 SVM, -0.24
 Deep Learning, -0.19
 EM, -0.17
Fig. 3. KDnuggets Poll: Top Algorithms used by Data Scientists: Industry vs Academia
The next table gives details on the algorithms: the % of respondents who used them in the 2016 and 2011 polls, the change (%2016 / %2011 - 1), and the Industry affinity as explained above.
Table 3: KDnuggets 2016 Poll: Algorithms Used by Data Scientists
The table has the following columns:
 N: rank according to share of usage,
 Algorithm: algorithm name,
 Type: S - Supervised, U - Unsupervised, M - Meta, Z - Other,
 % of respondents who used this algorithm in the 2016 poll,
 % of respondents who used this algorithm in the 2011 poll,
 change (%2016 / %2011 - 1), and
 Industry affinity as explained above.
| N | Algorithm | Type | 2016 % used | 2011 % used | % Change | Industry Affinity |
|---|---|---|---|---|---|---|
| 1 | Regression | S | 67% | 58% | 16% | 0.21 |
| 2 | Clustering | U | 57% | 52% | 8.7% | 0.05 |
| 3 | Decision Trees/Rules | S | 55% | 60% | -7.3% | 0.21 |
| 4 | Visualization | Z | 49% | 38% | 27% | 0.44 |
| 5 | K-nearest neighbors | S | 46% | | | 0.32 |
| 6 | PCA | U | 43% | | | 0.02 |
| 7 | Statistics | Z | 43% | 48% | -11.0% | 1.39 |
| 8 | Random Forests | S | 38% | | | 0.22 |
| 9 | Time series/Sequence analysis | Z | 37% | 30% | 25.0% | 0.69 |
| 10 | Text Mining | Z | 36% | 28% | 29.8% | 0.01 |
| 11 | Ensemble methods | M | 34% | 28% | 18.9% | 0.17 |
| 12 | SVM | S | 34% | 29% | 17.6% | 0.24 |
| 13 | Boosting | M | 33% | 23% | 40% | 0.24 |
| 14 | Neural networks - regular | S | 24% | 27% | -10.5% | 0.35 |
| 15 | Optimization | Z | 24% | | | 0.07 |
| 16 | Naive Bayes | S | 24% | 22% | 8.9% | 0.02 |
| 17 | Bagging | M | 22% | 20% | 8.8% | 0.02 |
| 18 | Anomaly/Deviation detection | Z | 20% | 16% | 19% | 1.61 |
| 19 | Neural networks - Deep Learning | S | 19% | | | 0.35 |
| 20 | Singular Value Decomposition | U | 16% | | | 0.29 |
| 21 | Association rules | Z | 15% | 29% | -47% | 0.50 |
| 22 | Graph / Link / Social Network Analysis | Z | 15% | 14% | 8.0% | 0.08 |
| 23 | Factor Analysis | U | 14% | 19% | -23.8% | 0.14 |
| 24 | Bayesian networks | S | 13% | | | 0.10 |
| 25 | Genetic algorithms | Z | 8.8% | 9.3% | -6.0% | 0.83 |
| 26 | Survival Analysis | Z | 7.9% | 9.3% | -14.9% | 0.15 |
| 27 | EM | U | 6.6% | | | 0.19 |
| 28 | Other methods | Z | 4.6% | | | 0.06 |
| 29 | Uplift modeling | S | 3.1% | 4.8% | -36.1% | 2.01 |
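
The % Change column follows the formula %2016 / %2011 - 1. A minimal Python sketch, using the rounded usage shares printed in the table as inputs (the function name is hypothetical), shows the calculation. Because the inputs are already rounded, the results only approximate the published figures, which were presumably computed from unrounded shares.

```python
def pct_change(pct_2016, pct_2011):
    """Relative change in usage share between the 2011 and 2016 polls."""
    return pct_2016 / pct_2011 - 1


# Rounded shares from the table: Regression 67% vs 58%, Association rules 15% vs 29%.
regression = pct_change(67, 58)
association_rules = pct_change(15, 29)

print(f"Regression: {regression:+.1%}")
print(f"Association rules: {association_rules:+.1%}")
```

For Regression this gives a rise of roughly 15.5%, close to the 16% listed. For Association rules it gives a decline of roughly 48%, close in size to the 47% listed.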

