KDD Cup 2018 Call for Proposals

We are looking for strong proposals that have a novel and motivated goal, a broad business impact, a rigid and fair setup, a challenging yet manageable task, and domain accessibility to the general public.

By Ron Bekkerman, Hang Zhang, and Jeong-Yoon Lee, KDD Cup 2018 Co-chairs.

This Call for Proposals invites industrial or academic institutions to submit their proposals for organizing the 2018 KDD Cup competition. Since 1997, KDD Cup has been the premier annual Data Mining competition held in conjunction with the ACM SIGKDD conference on Knowledge Discovery and Data Mining.

KDD-2018 will take place in London, UK, in August 2018. The KDD Cup competition is anticipated to last 2-4 months, and the winners will be notified by mid-June. The winners will be honored at the KDD conference opening ceremony and will present their solutions at the KDD Cup workshop during the conference. The winners are expected to receive monetary awards, with the first prize in the ballpark of ten thousand dollars.

We are looking for strong proposals that meet the following requirements: a novel and motivated goal, a broad business impact, a rigid and fair setup, a challenging yet manageable task, and domain accessibility to the general public.

1. A novel and motivated goal. Of particular interest are tasks that call for machine learning solutions different from the traditional KDD Cup setting (an ensemble of classifiers is learned on a given training set to obtain a high-quality classification result on a held-out test set). Examples of non-traditional setups would be incrementally arriving data and evaluation on the accumulated error; prediction given a limited amount of resources; learning with mostly unlabeled data; addressing cold-start issues in learning; learning over multiple types of data; applications of deep learning models, etc.

2. A broad business impact. We encourage organizers to consider a practical challenge whose solution has the potential to be deployed in a real-world application and appreciated by millions of customers.

3. A rigid and fair setup. The organizers should guarantee the availability of the data and the confidentiality of the test set (to prevent information leakage at any cost). The evaluation metrics should be both meaningful for the application at hand and statistically sound for objective comparison. A baseline should be established to show that non-trivial results can be achieved. An estimate of what constitutes a significant difference in performance will be much appreciated.

4. A challenging yet manageable task. The task should be challenging in the sense that there is enough room for improvement over basic solutions, and novel ideas are required to succeed in the competition. The task should be manageable in about three months, and the underlying infrastructure should be supported by the organizers so that the competitors can focus mainly on the core challenge.

5. Domain accessibility. The notions presented in the competition description should be accessible to the majority of machine learning and data mining practitioners, who might not have extensive domain knowledge or access to a powerful computational infrastructure.
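
As an illustration of the estimate requested under the third requirement, the sketch below shows one possible way to judge whether the gap between two submissions is significant: a paired percentile bootstrap over a shared test set. The function name, the per-example 0/1 correctness inputs, and the choice of accuracy as the metric are all assumptions made for this example; organizers would substitute their own metric and procedure.

```python
import random


def bootstrap_diff_interval(correct_a, correct_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy(A) - accuracy(B).

    correct_a and correct_b are hypothetical per-example scores (1 = correct,
    0 = wrong) for two submissions evaluated on the same test examples.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both submissions must be scored on the same test examples")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

Under these assumptions, an interval that excludes zero suggests that the difference between the two submissions is unlikely to be an artifact of the particular test sample.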

In addition to meeting the requirements above, good proposals might aim to address the following concerns raised over previous KDD Cup competitions:

1. Model complexity. Although increasing the complexity of the solutions (usually via an ensemble of multiple models) can improve the accuracy, it makes it harder to interpret or deploy the proposed solutions.

2. Static test data. Evaluating the models on static test data motivates participants to squeeze the maximum out of the test data (sometimes by exploiting data leaks), and the resulting solutions may overfit.
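
One possible alternative to a static test set, in the spirit of the "incrementally arriving data and evaluation on the accumulated error" setup mentioned under the first requirement, is test-then-train evaluation: each example is scored before the model is allowed to learn from it. The sketch below is only an illustration; the `RunningMean` model, the `predict`/`update` interface, and the use of absolute error are assumptions for this example, not a prescribed protocol.

```python
class RunningMean:
    """Toy incremental model: predicts the mean of the targets seen so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def predict(self, x):
        # Before any data has arrived, fall back to 0.0.
        return self.total / self.count if self.count else 0.0

    def update(self, x, y):
        self.total += y
        self.count += 1


def accumulated_error(model, stream):
    """Score each (x, y) pair before the model sees its label, then learn from it."""
    total_error = 0.0
    for x, y in stream:
        total_error += abs(model.predict(x) - y)  # evaluate first
        model.update(x, y)                        # then train
    return total_error
```

Because every example is used for evaluation exactly once, at the moment it arrives, there is no fixed test set for participants to tune against.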

We suggest that proposals answer the following questions:
  1. How does the proposed challenge meet the five requirements?
  2. How does the proposed challenge address the two concerns?
  3. Which competition infrastructure do you plan to use (e.g. Kaggle, or your own)? Is the competition platform you chose equally accessible to participants all over the world?
  4. What resources (including people, time, and award money) do you plan to invest?
  5. What is your time schedule for the competition?
  6. Are there any privacy concerns about the released data? Have you obtained the rights to release the data for the competition from your legal counsel?
  7. Do you require the winners to submit the source code of their winning solutions?
  8. How would you handle Q&A and possible revisions during the competition?
  9. What is your baseline solution?

Proposals should also include:
  • Names, affiliations, email addresses, phone numbers, and short biographies of the organizers.
  • An endorsement letter from the executive-level management of your organization.

Please keep the proposal concise and strictly confidential. Please send your proposals in PDF format to kddcup2018chairs@gmail.com by November 1, 2017.

Important dates:
  • September 1, 2017 - CFP release
  • November 1, 2017 - Proposal submission deadline
  • December 1, 2017 - Decision notification
  • March 1, 2018 - Tentative start of the competition
  • June 15, 2018 - Announcement of the KDD Cup Winner