Workshop participants will receive the Yandex click log challenge dataset, and monetary prizes will be awarded to the winners. Of particular interest is the relation between click data and editorial relevance judgments.
The WSCD 2012 chairs invite you to submit articles to the
Second Workshop on Web Search Click Data. This workshop will be held in
conjunction with WSDM 2012 on February 12, 2012, in Seattle, WA, USA.
Alongside the workshop, a Challenge will be organized using the same
dataset provided to all participants, and monetary prizes will be
awarded to the winners.
Web search click data has caught the interest of a growing
community of professionals during the last five years. It provides
a snapshot of the typical information access patterns of the user
population, unlike a group of relevance judges or the tags
provided on collaborative sites. Issued queries and clicked
document records have the potential to give an accurate view of
millions of people's daily interests, how these interests evolve,
and how these interests are related. Yet this information is
not easily extracted because, in spite of its abundance, it is
sparse: most queries are repeated only a few times, if at all, and
clicks cannot be interpreted directly as document
relevance. Moreover, user clicks are strongly biased by the search
engine ranking, which means that many documents are never seen in
spite of being relevant.
Of particular interest is the relation between click data and
editorial relevance judgments. A large body of literature exists
on how to use editorial judgments to evaluate and train
machine-learned ranking functions. This describes what has been and
continues to be a largely successful technology that serves
millions of users every day, yet relying on editorial judgments
as the gold standard against which to learn has its limits: 1)
the metrics used to evaluate rankings are heuristics and hard to
relate to user behavior, and 2) as search technologies are extended
to new areas, it is ever harder for editors to provide accurate
judgments. This applies to most verticals, like
"local search", and to newer research areas like "diversity", where the
question is how to introduce diversity in document ranking, or
"personalization", where the goal is to adapt the search results to
take into account what is known about the user who issued the query.
In this workshop, we will attempt to address these issues, but we
will also explore novel applications and uses of these data for
enhancing the user search experience.
Research on the incorporation of click data into information retrieval
systems, and for understanding user search, has been hampered by a
lack of shared datasets. This workshop provides a common click
dataset and a forum for presenting new results and analysis in the
area. The dataset includes user sessions extracted from Yandex logs,
with queries, URL rankings and clicks. Unlike previous click datasets,
it also includes relevance judgments for the ranked URLs, for the
purposes of training relevance prediction models. To allay privacy
concerns, the user data is fully anonymized: only meaningless
numeric IDs of queries, sessions, and URLs are released. The queries
are grouped only by session, and no user IDs are provided. More
details are available at
imat-relpred.yandex.ru/
Submissions should present original results and new ideas and can
be up to 8 pages in length, but shorter works are
encouraged. Papers should properly place the work within the
field, cite related work, and clearly indicate the innovative
aspects of the work and its contribution to the field, supported,
for instance, by proper evaluation methods. We strongly encourage
evaluations that are repeatable and make use of the provided
dataset.
Submissions should not be under review at, or already accepted by,
a journal or another conference.
All papers will be peer-reviewed by at least three reviewers from an
international program committee; the promising papers identified will then
be discussed in a meeting of the PC chairs, where the final selections
will be made. Accepted papers will appear in the conference online
proceedings published by the ACM Digital Library and the conference
web site. Authors of accepted papers will retain proprietary rights to
their work, but will be required to sign a copyright release form.
The submission site will come online one month
before the abstract due date.
Important Dates
- Start of Challenge: October 15, 2011
- Papers due: December 5, 2011
- End of Challenge: December 15, 2011
- Notification of Acceptance: January 10, 2012
- Camera-Ready: January 17, 2012
- Workshop: February 12, 2012
The workshop website is
research.microsoft.com/en-us/um/people/nickcr/wscd2012.
See
imat-relpred.yandex.ru/ for more information on the Challenge.