KDnuggets : News : 2001 : n20 : item2

From: David Page
Date: Mon, 1 Oct 2001 17:15:54 -0500 (CDT)
Subject: KDD Cup 2001 results and summary

Because of the importance and rapid growth of biological
data, KDD Cup 2001 focused on mining biological databases.
The competition featured three tasks, based on two databases.

The first task required accurate prediction of molecules
that bind to thrombin.  Each molecule was described by a
record of nearly 140,000 Boolean features.  The database
was graciously provided by DuPont Pharmaceuticals Research
Laboratories.  This task is representative of a large domain
of tasks from the field of drug design.  In each such task,
an accurate and comprehensible predictor of activity can
often be used to guide the design of new drugs.

The second and third tasks centered on a database with properties
of the yeast genome and proteome.  For each gene, the database
gave the chromosome on which the gene appears and properties of
organisms with mutations in this gene, including viability.
For each gene, the database also described properties of
the protein for which it codes, including its structural
class and complex, as well as other proteins with which it
interacts.  Also included were correlations in gene expression,
as measured by gene expression microarrays.  Finally, for the
"training" genes only, the database recorded protein functions
and localization (where in the cell the protein is typically
located).  The second and third tasks were to predict function
and localization, respectively.

The winner of Task 1 was Jie Cheng, of the Canadian Imperial
Bank of Commerce.  The winner of Task 2 was Mark-A. Krogel
of the University of Magdeburg.  The winner of Task 3 was the
team of Hisashi Hayashi, Jun Sese, and Shinichi Morishita of
the University of Tokyo.  Details of the winners' approaches
are available in their KDD presentations, online at the KDD
Cup 2001 web site:

www.cs.wisc.edu/~dpage/kddcup2001

Participation was outstanding, with 136 groups submitting a
total of 200 predictions for the three tasks.  Further details
on participation and techniques employed are also available at
the web site.


Copyright © 2001 KDnuggets.