News
From: Stefan Kramer skramer@informatik.uni-freiburg.de
Date: Wed, 08 Nov 2000 18:33:33 +0100
Subject: Predictive Toxicology Challenge (PTC) 2000-2001
*************************** ANNOUNCEMENT ****************************
* *
* THE PREDICTIVE TOXICOLOGY CHALLENGE 2000-2001 *
* *
* Scientific Discovery with Machine Learning *
* *
*********************************************************************
Prevention of environmentally-induced cancers is a health issue of
unquestionable importance. Almost every sphere of human activity in an
industrialised society faces potential chemical hazards of some form.
It is estimated that nearly 100,000 chemicals are commonly used, and a
further 500-1000 are added every year. Only a small fraction
of these chemicals have been evaluated for toxic effects such as
carcinogenicity. The US National Toxicology Program (NTP) contributes
to this enterprise by conducting standardised chemical bioassays --
exposure of rodents (mice and rats) to a range of chemicals -- to help
identify substances that may have carcinogenic effects on humans.
However, obtaining empirical evidence from such bioassays is expensive
and usually too slow to keep pace with the number of chemicals that can
cause adverse effects upon human exposure. This has resulted in an
urgent need for carcinogenicity models based on chemical structures and
properties. It is envisaged that such models would generate reliable
toxicity predictions for chemicals, enable low-cost identification of
hazardous chemicals, and refine and reduce the reliance on large
numbers of laboratory animals.
The bioassays conducted by the NTP have produced a large (by
toxicological standards) database of compounds classified as
carcinogens or otherwise. Predicting the outcome of these tests
using chemical structure (and related information) presents a
formidable test for techniques concerned with knowledge discovery
from databases.
In order to provide Data Mining and Machine Learning programs with the
opportunity to participate in this enterprise of great humanitarian
and scientific value, we have initiated the Predictive Toxicology
Challenge (PTC) 2000-2001. The goal of this competition is to predict
the rodent carcinogenicity of new compounds based on the experimental
results of the US National Toxicology Program (NTP). The PTC will be
one of the official challenges of the ECML/PKDD-2001 conferences in
September 2001 in Freiburg. It is the sequel to the Predictive
Toxicology Evaluation (PTE) challenge posed to the Machine Learning
community by A. Srinivasan, R.D. King, S.H. Muggleton, and M.J.E.
Sternberg at IJCAI-97 and summarized at IJCAI-99. In this year's
challenge, we have attempted to address some of the shortcomings of
the previous competition. In particular, the improvements include:
.) Requirements for sex/species specific models
.) Larger chemical databases than before
.) Opportunity for constructing new features
.) Cost-sensitive assessment of models
.) A large independent validation set is likely to
be available.
.) Special ECML/PKDD workshop expected
The PTC 2000-2001 consists of four phases:
In Phase I chemical structures and carcinogenicity evaluations from
the NTP will be provided by the organizers. Participants can submit new
chemical and structural descriptors for these molecules before March 01,
2001. We would like to encourage researchers in constructive induction,
feature construction and propositionalization to contribute in this
phase of the challenge. These submissions will be used by the model
developers in Phase II.
In Phase II predictive toxicology models based on descriptors from Phase
I may be submitted until June 01, 2001. Each submission must include
an estimate of the model's accuracy, obtained according to the rules
given on the website, and a translation of the model into a form that
toxicologists can understand.
In Phase III the submitted models will be evaluated on quantitative
(performance) and qualitative (toxicological relevance) scales. From all
submissions we will select a subset of "optimal" models according
to ROC analysis. From these we will identify (a) models that are
particularly relevant to toxicology and (b) descriptors that were
particularly useful for predicting carcinogenicity.
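
As an illustration only: one common way to select "optimal" models by
ROC analysis is to keep the models that lie on the upper convex hull
of ROC space, since only those can minimise expected cost for some
combination of class prior and misclassification costs. The sketch
below assumes this convex-hull approach; the announcement does not
say which procedure the organizers use, and the model names, rates,
and cost values are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_hull_models(models: Dict[str, Point]) -> List[str]:
    """Return the names of models on the upper convex hull of ROC space."""
    # Anchor the hull with the two trivial classifiers:
    # "always negative" at (0, 0) and "always positive" at (1, 1).
    labelled = [((0.0, 0.0), None), ((1.0, 1.0), None)]
    labelled += [(pt, name) for name, pt in models.items()]
    labelled.sort(key=lambda item: item[0])

    hull = []
    for pt, name in labelled:
        # Drop the last hull point while it lies on or below the
        # segment joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2][0], hull[-1][0], pt) >= 0:
            hull.pop()
        hull.append((pt, name))
    return [name for _, name in hull if name is not None]


def expected_cost(pt: Point, p_pos: float, c_fn: float, c_fp: float) -> float:
    """Expected misclassification cost per compound for one ROC point."""
    fpr, tpr = pt
    return p_pos * (1.0 - tpr) * c_fn + (1.0 - p_pos) * fpr * c_fp


# Hypothetical models and rates, for illustration only.
models = {
    "A": (0.1, 0.6),
    "B": (0.3, 0.9),
    "C": (0.4, 0.7),
    "D": (0.6, 0.5),
}
on_hull = roc_hull_models(models)


def cheapest(p_pos: float, c_fn: float, c_fp: float) -> str:
    """Name of the model with the lowest expected cost."""
    return min(models, key=lambda m: expected_cost(models[m], p_pos, c_fn, c_fp))
```

Under this assumption, whichever model `cheapest` returns for a given
cost setting is always among those in `on_hull`, which is why the hull
can serve as a shortlist before any particular costs are fixed.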
Finally, we expect that the results of the Predictive Toxicology
Challenge 2000-2001 will be presented at a workshop of the ECML/PKDD
conferences in September 2001 in Freiburg.
Detailed information about the Predictive Toxicology Challenge can be
obtained from the website
http://www.informatik.uni-freiburg.de/~ml/ptc/
Please accept our apologies for the late announcement on ML-related
mailing lists.
C. Helma, R.D. King, S. Kramer and A. Srinivasan
Copyright © 2000 KDnuggets.