KDD Cup 2013: Author-Paper Identification Challenge

One of the main challenges of searching academic literature is resolving author-name ambiguity: many authors have similar names, and some authors publish under different variations of their name. This problem is the topic of KDD Cup 2013: determine which papers in a Microsoft Academic Search author profile were truly written by that author. Submission deadline: June 12.

The ability to search the literature and to collect and aggregate metrics around publications is central to modern research. Both academic and industry researchers across hundreds of scientific disciplines, from astronomy to zoology, increasingly rely on search to understand what has been published and by whom.

Author-name ambiguity is at the core of this year's KDD Cup, which is sponsored and designed by a team from Microsoft, the Center for Web and Data Science at the University of Washington, Tacoma, and the Computational Web Intelligence team at Ghent University, Belgium.

About the challenge
Microsoft Academic Search is an open platform that provides a variety of metrics and experiences for the research community, in addition to literature search. It covers more than 50 million publications and over 19 million authors across a variety of domains, with updates added each week. One of the main challenges of providing this service is author-name ambiguity. On the one hand, many authors publish under several variations of their own name. On the other hand, different authors might share a similar or even the same name.

As a result, the profile of an author with an ambiguous name tends to contain noise in the form of papers that are incorrectly assigned to him or her. This KDD Cup task challenges participants to determine which papers in an author profile were truly written by a given author.
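
To make the two kinds of ambiguity concrete, here is a minimal Python sketch. It is only an illustration, not the organizers' method or the competition data: the names are invented, and name_key and group_by_key are hypothetical helpers that reduce a name to "first initial plus last name". A crude key like this merges the variants of one author's name, but it also merges different authors who share an initial and a surname, which is how incorrectly assigned papers can end up in a profile.

```python
import re
import unicodedata
from collections import defaultdict


def name_key(full_name: str) -> str:
    """Hypothetical helper: reduce a name to 'first initial + last name'."""
    # Strip accents so that "Gómez" and "Gomez" compare equal.
    ascii_name = (
        unicodedata.normalize("NFKD", full_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Turn "Last, First" into "First Last".
    if "," in ascii_name:
        last, first = [part.strip() for part in ascii_name.split(",", 1)]
        ascii_name = f"{first} {last}"
    tokens = re.findall(r"[a-z]+", ascii_name.lower())
    if not tokens:
        return ""
    return f"{tokens[0][0]} {tokens[-1]}"


def group_by_key(names):
    """Group raw name strings that share the same crude key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


# Invented names, for illustration only.
names = ["Maria Gómez", "M. Gomez", "Gomez, Maria", "Miguel Gomez"]
groups = group_by_key(names)
for key, members in groups.items():
    print(key, "->", members)
```

All four strings fall under the same key: the first three are plausibly one person writing her name in different ways, while the fourth is a different person. Telling such cases apart needs evidence beyond the name itself, which is what makes the task a prediction problem.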

How to participate
The KDD Cup challenge is hosted by Kaggle, the world's leading platform for predictive modeling competitions. Participants in the KDD Cup must create a Kaggle account and use it to download the data.

The deadline for submissions is 12:00 am UTC on Wednesday, 12 June 2013.

