KDnuggets News 99:18, item 6, Publications

Date: Thu, 26 Aug 1999 14:18:15 -0700 (PDT)
From: Steve Minton <jairmail@ISI.EDU>
Subject: recent JAIR article, "Identifying Mislabeled Training Data"

Readers of this mailing list may be interested in the following
article which was recently published by JAIR.

Brodley, C.E. and Friedl, M.A. (1999)
  "Identifying Mislabeled Training Data", Volume 11, pages 131-167.

   Available in PDF, PostScript and compressed PostScript.
   For quick access via your WWW browser, use this URL:
http://www.jair.org/abstracts/brodley99a.html
   More detailed instructions are below.

   Abstract: This paper presents a new approach to identifying and
   eliminating mislabeled training instances for supervised learning. The
   goal of this approach is to improve classification accuracies produced
   by learning algorithms by improving the quality of the training data.
   Our approach uses a set of learning algorithms to create classifiers
   that serve as noise filters for the training data.  We evaluate single
   algorithm, majority vote and consensus filters on five datasets that
   are prone to labeling errors.  Our experiments illustrate that
   filtering significantly improves classification accuracy for noise
   levels up to 30 percent.  An analytical and empirical evaluation of
   the precision of our approach shows that consensus filters are
   conservative at throwing away good data at the expense of retaining
   bad data and that majority filters are better at detecting bad data at
   the expense of throwing away good data.  This suggests that for
   situations in which there is a paucity of data, consensus filters are
   preferable, whereas majority vote filters are preferable for
   situations with an abundance of data.
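
   The difference between the two voting rules in the abstract can be
   sketched in a few lines of Python. This is only an illustration, not
   the authors' implementation: the cross-validation scheme, the
   function names (knn_learner, centroid_learner, noise_filter), the
   three stand-in learners and the toy data are all assumptions made
   here for the sake of a small, runnable example.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_learner(k):
    """Build a fit function for a k-nearest-neighbour classifier (stand-in learner)."""
    def fit(X, y):
        def predict(x):
            nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i], x))[:k]
            return Counter(y[i] for i in nearest).most_common(1)[0][0]
        return predict
    return fit


def centroid_learner(X, y):
    """Fit a nearest-centroid classifier (stand-in learner)."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                 for c, rows in groups.items()}
    return lambda x: min(centroids, key=lambda c: sq_dist(centroids[c], x))


def noise_filter(X, y, learners, n_folds=3, mode="majority"):
    """Return indices flagged as mislabeled (assumed cross-validation scheme).

    Each instance is judged only by classifiers trained on the other folds.
    "consensus" flags an instance when every learner misclassifies it;
    "majority" flags it when more than half of the learners do.
    """
    n = len(X)
    errors = [0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        for fit in learners:
            predict = fit(X_tr, y_tr)
            for i in held_out:
                if predict(X[i]) != y[i]:
                    errors[i] += 1
    if mode == "consensus":
        return [i for i in range(n) if errors[i] == len(learners)]
    return [i for i in range(n) if errors[i] > len(learners) / 2]


# Toy data (hypothetical): two well-separated clusters; the point at
# index 8 belongs to the cluster near the origin but carries label 1.
X = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0),
     (11, 10), (1, 1), (11, 11), (0.5, 0.5), (10.5, 10.5)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 1, 1]

learners = [knn_learner(1), knn_learner(3), centroid_learner]
majority_flagged = noise_filter(X, y, learners, mode="majority")
consensus_flagged = noise_filter(X, y, learners, mode="consensus")
```

   On this toy data both filters flag only index 8. Note that the
   1-nearest-neighbour learner alone also misclassifies three clean
   points that sit next to the mislabeled one; neither vote discards
   them, because the other two learners disagree. By construction the
   consensus rule can never flag more instances than the majority rule,
   which is the trade-off the abstract describes.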

The article is available via:

 -- World Wide Web: The URL for our World Wide Web server is
http://www.jair.org/
    For direct access to this article and related files try:
http://www.jair.org/abstracts/brodley99a.html
