KDnuggets News 03:04, item 39, CFP

CFP

From: Nathalie Japkowicz
Date: 11 Feb 2003
Subject: Workshop: Learning from Imbalanced Data Sets, deadline May 1

ICML-KDD'2003 Workshop:
Learning from Imbalanced Data Sets II
Thursday, August 21, 2003
Washington, DC

Organizers:

Nitesh Chawla, Business Analytic Solutions, CIBC (chawla@csee.usf.edu)
Nathalie Japkowicz, University of Ottawa (nat@site.uottawa.ca)
Aleksander Kolcz, America Online, Inc. (ark@pikespeak.uccs.edu)

Workshop Page:

www.site.uottawa.ca/~nat/Workshop2003/workshop2003.html

Workshop Description:

Overview:

Recent years have brought increased interest in applying machine learning techniques to difficult "real-world" problems, many of which are characterized by imbalanced learning data, where at least one class is under-represented relative to the others. Examples include (but are not limited to) fraud/intrusion detection, risk management, medical diagnosis/monitoring, bioinformatics, text categorization, and personalization of information. The problem of imbalanced data is often associated with asymmetric costs of misclassifying elements of different classes. Additionally, the distribution of the test data may differ from that of the learning sample, and the true misclassification costs may be unknown at learning time.

The AAAI-2000 Workshop on "Learning from Imbalanced Data Sets" provided the first venue where this important problem was explicitly addressed, and it was received with much interest. The related ICML-2000 Workshop on "Cost-Sensitive Learning" provided another venue for addressing the problem of asymmetric costs of different classes and features. Although much awareness of the issues related to data imbalance has been raised, many of the key problems remain open and are in fact encountered more often, especially when learning methods are applied to massive datasets. We believe that it would be of value to the machine learning community not only to examine the progress achieved in this area over the last three years but also to discuss the current school of thought on research in learning from imbalanced datasets. Based on our understanding of the class imbalance problem, we propose the following topics of discussion (the list is not exhaustive):

sampling (under-, over-, progressive, active)
post-processing of learned models
accounting for class imbalance via inductive bias
one-sided learning
handling uncertainty of target distribution and misclassification costs
handling varying amounts (class dependent) of label noise

Submission deadline: May 1, 2003

For additional information, see the web page.
