Gregory Piatetsky-Shapiro, Apr 11, 2012.
The New York Times Bits Blog ran an interesting feature, Data Scientists Get Ranked (Apr 8, 2012), which looked at the Kaggle competition platform and its rankings of top competitors.
Kaggle found a similar ranking system in golf, where there is some team play, different kinds of competitions, and a payoff for doing well on a consistent basis.
According to Jeremy Howard, Kaggle's president and chief scientist, it makes little difference for a top performer if the problem is public health or essays in Arabic. The argument that great data science is just about letting the data talk holds true.
Kaggle plans to rank everyone participating in Kaggle contests, based on a rolling average of performance over the preceding 12 months. The top 10 Kaggle contestants as of Apr 2012 are listed below.
- 76.95 pts, D'yakonov Alexander, 33, Moscow, Russian Federation
- 63.92 pts, Jeremy Howard (Kaggle), Melbourne, Australia
- 62.38 pts, Vivek Sharma,
- 56.30 pts, Ben Hamner, 24, United States
- 55.86 pts, Gxav (Xavier Conort), 39, Singapore
- 55.73 pts, Sergey Yurgenson, 50, Boston, USA
- 47.75 pts, Jose H. Solorzano, 42, Quito, Ecuador
- 44.80 pts, Vladimir Nikulin, 52, Kirov, Russian Federation
- 41.77 pts, Tim Salimans, 26, Utrecht, Netherlands
- 39.48 pts, David J. Slate, United States
What is interesting is that, apart from Kaggle's Jeremy Howard, the others appear to work not as data scientists or analysts but in related professions, and compete in their free time.
We can also look at LinkedIn's ranking of professionals with the "Data Mining" skill. Currently, the top 10 are:
- DJ Patil, Data Scientist in Residence at Greylock
- Ronny Kohavi, Partner Architect at Microsoft
- Karl Rexer, Data Mining & Analytic CRM Consulting
- Marybeth Haas, Analytics Recruiter
- Jure Leskovec, Assistant Professor at Stanford University
- Monica Rogati, Senior Data Scientist @ LinkedIn
- Dean Abbott, Data Mining and Predictive Analytics Expert
- Goutam Chakraborty
- Michael Berry, Business Intelligence Director at TripAdvisor
- Mary Parker, Analytic Search Recruiters / Statisticians Economists Digital Quants at WPS, Inc.
This ranking is probably based on connections and recognition, and seems to be biased towards LinkedIn's own data scientists and those with more connections. Along with several leading data scientists and consultants, we also see two recruiters in the top 10.
(Note: I appear in position #12, or #10 if the two recruiters are excluded.)
Another interesting ranking, this one of online influence, is provided by Traackr; see more about it in this KDnuggets story: Top Online Influencers in Data Science.
Finally, there is also a ranking of real scientists: those who publish research papers and books. Here are the top authors in Data Mining publications, ranked by H-Index (a measure that combines the number and the impact of an author's publications):
- Jiawei Han, H-Index=69
- Philip S. Yu, 47
- Rakesh Agrawal, 46
- Christos Faloutsos, 39
- Heikki Mannila, 36
- Eamonn J. Keogh, 35
- George Karypis, 35
- Jian Pei, 34
- Padhraic Smyth, 34
- Hans-Peter Kriegel, 33
See Top 10 in Data Mining for more information.
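For readers unfamiliar with the H-Index used in the last ranking: an author has an H-Index of h if h of their papers have at least h citations each. The short Python sketch below shows the standard calculation. The function name `h_index` and the citation counts are made up for illustration; they are not taken from the ranking source or from any real author's record.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's papers (not real data).
example_counts = [25, 8, 5, 3, 3, 1]
print(h_index(example_counts))  # prints 3
```

In this example the three most-cited papers each have at least 3 citations, but the fourth has only 3 (fewer than 4), so the H-Index is 3. This is why one very highly cited paper alone does not raise the index much: both volume and impact are needed.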