Which methods/algorithms did you use for data analysis in 2011? [311 voters]
| Method/algorithm (voters) | Percent of voters |
| --- | --- |
| Decision Trees/Rules (186) | 59.8% |
| Regression (180) | 57.9% |
| Clustering (163) | 52.4% |
| Statistics (descriptive) (149) | 47.9% |
| Visualization (119) | 38.3% |
| Time series/Sequence analysis (92) | 29.6% |
| Support Vector (SVM) (89) | 28.6% |
| Association rules (89) | 28.6% |
| Ensemble methods (88) | 28.3% |
| Text Mining (86) | 27.7% |
| Neural Nets (84) | 27.0% |
| Boosting (73) | 23.5% |
| Bayesian (68) | 21.9% |
| Bagging (63) | 20.3% |
| Factor Analysis (58) | 18.7% |
| Anomaly/Deviation detection (51) | 16.4% |
| Social Network Analysis (44) | 14.2% |
| Survival Analysis (29) | 9.32% |
| Genetic algorithms (29) | 9.32% |
| Uplift modeling (15) | 4.82% |
| Did you use analytics in the cloud, Hadoop, EC2, etc. in 2011? | Percent |
| --- | --- |
| Yes | 14% |
| No | 86% |
| Employment type (voters) | Percent of all voters | Avg. number of algorithms |
| --- | --- | --- |
| Industry analyst/consultant (172) | 55.3% | 6.3 |
| Academic researcher (85) | 27.3% | 5.1 |
| Student (37) | 11.9% | 4.3 |
| Government/Other (17) | 5.5% | 5.0 |
The regional breakdown of voters is:
- US/Canada, 40.2%
- Europe, 37.6%
- Asia, 10.3%
- Latin America, 5.8%
- Africa/Middle East, 3.2%
- Australia/NZ, 2.9%
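We grouped Industry/Gov voters into one group and Academic researchers/Students into a second group,
and computed the "affinity" of each algorithm to Industry/Gov as
N(Alg,Ind_Gov) / N(Alg,Aca_Stu)
----------------------------------
     N(Ind_Gov) / N(Aca_Stu)
where N(Alg,G) is the number of voters in group G who used the algorithm and N(G) is the total number of voters in group G.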
Thus an algorithm with affinity 1.5 is used 50% more often in Industry/Government than by Academic researchers or Students, and an algorithm with affinity 0.6 is used only 60% as often in Industry/Government.
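The affinity ratio above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `industry_affinity` and the 60/20 user split are hypothetical, and the group sizes are taken by adding the voter counts in the employment table (172 + 17 and 85 + 37). The Uplift modeling call uses the poll's figures: 15 users, none of them academic.

```python
import math


def industry_affinity(n_alg_ind_gov, n_alg_aca_stu, n_ind_gov, n_aca_stu):
    """Industry/Gov affinity: the algorithm's Industry/Gov-to-Academic/Student
    user ratio, divided by the same ratio over all voters."""
    if n_alg_aca_stu == 0:
        # No Academic/Student users at all: the ratio is unbounded (shown as INF).
        return math.inf
    return (n_alg_ind_gov / n_alg_aca_stu) / (n_ind_gov / n_aca_stu)


# Group sizes, assuming the groups are the sums of the employment-table counts.
N_IND_GOV = 172 + 17  # Industry analyst/consultant + Government/Other
N_ACA_STU = 85 + 37   # Academic researcher + Student

# Uplift modeling: 15 users, no academic users, so the affinity is infinite.
uplift = industry_affinity(15, 0, N_IND_GOV, N_ACA_STU)

# Hypothetical split (not from the poll): 60 Industry/Gov users, 20 Academic/Student users.
example = industry_affinity(60, 20, N_IND_GOV, N_ACA_STU)

print(f"Uplift modeling: {uplift}")
print(f"Hypothetical algorithm: {example:.2f}")
```

An algorithm whose users split in the same proportion as the voters overall gets an affinity of exactly 1.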
The most "industrial" algorithms (those with the highest Industry/Gov "affinity") are:
- Uplift modeling, INF (no academic users)
- Survival Analysis, 2.47
- Regression, 2.00
The most "academic" algorithms (those with the lowest Industry/Gov "affinity") are:
- Genetic algorithms, 0.60
- Support Vector (SVM), 0.66
- Association Rules, 0.83
The following table shows the algorithms ranked by Industry affinity (third column).
The bar width in the second column is proportional to academic affinity (the inverse of Industry affinity).
| Algorithm | Academic/Student Affinity | Industry/Gov Affinity |
| --- | --- | --- |
| Uplift modeling | | INF |
| Survival Analysis | | 2.47 |
| Regression | | 2.00 |
| Visualization | | 1.55 |
| Statistics | | 1.54 |
| Boosting | | 1.50 |
| Time series/Sequence analysis | | 1.48 |
| Bagging | | 1.39 |
| Factor Analysis | | 1.32 |
| Anomaly/Deviation detection | | 1.29 |
| Text Mining | | 1.27 |
| Decision Trees | | 1.20 |
| Neural Nets | | 1.16 |
| Clustering | | 1.14 |
| Ensemble methods | | 1.08 |
| Social Network Analysis | 0.93 | |
| Bayesian | 0.92 | |
| Association rules | 0.83 | |
| Support Vector (SVM) | 0.66 | |
| Genetic algorithms | 0.60 | |
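Since academic affinity is defined as the inverse of Industry affinity, it can be derived directly from the published values. The snippet below is a minimal sketch using four of the Industry/Gov affinities listed above; the dictionary and function names are illustrative only.

```python
import math

# Industry/Gov affinities taken from the lists above.
industry = {
    "Uplift modeling": math.inf,  # no academic users
    "Survival Analysis": 2.47,
    "Regression": 2.00,
    "Genetic algorithms": 0.60,
}


def academic_affinity(industry_value):
    """Academic/Student affinity as the inverse of Industry/Gov affinity."""
    return 1.0 / industry_value  # 1 / inf evaluates to 0.0 for floats


academic = {name: academic_affinity(value) for name, value in industry.items()}

# List the algorithms from most to least "academic".
for name, value in sorted(academic.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.2f}")
```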