Mining a Data Mining Conference: Analytics on KDD-2013
We look at interesting analytics and statistics from the KDD-2013 Conference on Knowledge Discovery and Data Mining. Which topics are hot, and which are most likely to be accepted?
By Gregory Piatetsky, Aug 15, 2013.
I have just returned from a very successful KDD-2013 Conference on Knowledge Discovery and Data Mining, held on Aug 11-14, 2013 in Chicago, IL.
KDD continues to be the leading research conference in the field. This year it received 726 submissions, of which only 125 were accepted, a 17.2% acceptance rate.
KDD-2013 had about 1,200 attendees, which makes it the largest peer-reviewed research conference in Data Mining, Data Science, and Knowledge Discovery to date.
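The acceptance rate quoted above follows directly from the two counts. A minimal sketch of the arithmetic, using only the figures given in this article (the variable names are my own):

```python
# Submission and acceptance counts quoted in the article for KDD-2013.
submitted = 726
accepted = 125

# Acceptance rate = accepted papers / submitted papers.
acceptance_rate = accepted / submitted
print(f"Acceptance rate: {acceptance_rate:.1%}")  # prints "Acceptance rate: 17.2%"
```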
KDD-2013 Program Committee co-chairs Inderjit Dhillon (U. of Texas at Austin) and Yehuda Koren (Google) have compiled an interesting report on KDD-2013 papers, trends, and topics. Here is an excerpt.
KDD is characterized by a healthy mixture of fundamental topics while being in close touch with new applications.
Fundamental topics include classification, clustering, probabilistic methods, rule and pattern mining, and active and transfer learning.
New trends and applications include social networks, big data (including novel statistical techniques for it), social influence, viral marketing, social media, recommender systems, and security & privacy.
Here is a word cloud of 125 accepted papers.
The following table shows the acceptance ratio by topic, which I divided into hot (acceptance rate significantly above average), medium (close to average), and cold (below average).
|Topic|% Submissions|Acceptance Ratio|
|---|---|---|
|big data – scalable methods|2.42%|29.43%|
|social and information networks|6.68%|21.85%|
|security and privacy|1.86%|8.95%|
The topics most likely to be accepted were user modeling, big data – scalable methods, unsupervised learning, supervised learning, and recommender systems. Note the different odds for two classic topics: the acceptance rate for clustering was almost 2.5 times lower than for classification!
Note to future authors – don’t put “other” as a topic.
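To make the hot/cold split concrete, one can compare each topic's acceptance ratio with the overall rate of 125 out of 726. The sketch below does this for the three topics in the table above. The `lift` helper and the idea of expressing the comparison as a multiple of the overall rate are my own illustration, not something taken from the program chairs' report.

```python
# Overall acceptance rate from the article: 125 accepted out of 726 submitted.
OVERALL_RATE = 125 / 726

# Acceptance ratio per topic, copied from the table in the article.
topic_rates = {
    "big data - scalable methods": 0.2943,
    "social and information networks": 0.2185,
    "security and privacy": 0.0895,
}


def lift(topic_rate, overall_rate=OVERALL_RATE):
    """Return the topic's acceptance rate as a multiple of the overall rate."""
    return topic_rate / overall_rate


for name, rate in topic_rates.items():
    # A value above 1 means the topic did better than average, below 1 worse.
    print(f"{name}: {rate:.2%} accepted, {lift(rate):.2f}x the overall rate")
```

By this measure, big data – scalable methods was accepted at roughly 1.7 times the overall rate, while security and privacy came in at about half of it.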
The following table shows the acceptance ratio by the number of authors.
Having more authors gives an almost linear improvement in acceptance probability, up to 4 authors. However, the KDD-2013 best paper, “Simple and Deterministic Matrix Sketching”, was written by a single author: Edo Liberty of Yahoo! Labs.
Comparing with KDD-2005, which was also held in Chicago, we can see many new topics added since then.
Finally, the most trending topics were social networks, Twitter, and sampling.
|social networks|10% (from 22.8% to 32.8%)|25.90% (from 6.9% to 32.8%)|
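The leading figure in each cell of the row above appears to be the difference, in percentage points, between the two shares in parentheses. That reading is my assumption, since the column headers did not survive. A quick check of the arithmetic, with hypothetical helper names:

```python
def point_change(old_share, new_share):
    """Change between two shares, in percentage points (rounded to 1 decimal)."""
    return round(new_share - old_share, 1)


def growth_factor(old_share, new_share):
    """How many times larger the new share is than the old one."""
    return new_share / old_share


# (old share %, new share %) pairs for "social networks", taken from the row above.
social_networks = [(22.8, 32.8), (6.9, 32.8)]

for old, new in social_networks:
    print(
        f"from {old}% to {new}%: "
        f"{point_change(old, new)} points, "
        f"{growth_factor(old, new):.1f}x growth"
    )
```

The point changes come out to 10.0 and 25.9, matching the 10% and 25.90% shown in the row.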
Thanks to Yehuda Koren, KDD-2013 Program Committee co-chair, for providing the slides from which I extracted the information above.