KDnuggets : News : 2007 : n15 : item29

Publications


Subject: Identifying and Overcoming Common Data Mining Mistakes

by Doug Wielenga, SAS Institute Inc., Cary, NC, in SAS Global Forum 2007

ABSTRACT: Due to the large amount of data typically involved, data mining analyses can exacerbate some common modeling problems and create a number of new ones. These problems can greatly increase the time that it takes to develop useful models and can hamper the development of potentially superior models.

This paper discusses how to identify and overcome several common modeling mistakes. The presentation begins by providing insights into common mistakes in data preparation; it then follows the data flow of a typical predictive modeling analysis through setting variable roles, creating and using data partitions, performing variable selection, replacing missing values, building different types of models, comparing resulting models, and scoring those models using SAS® Enterprise Miner™. The paper concludes with a discussion of common issues with cluster analysis and association/sequence analysis. Applying these techniques can greatly decrease the time it takes to build useful models and improve the quality of the models that are created.

Here is the "Identifying and Overcoming Common Data Mining Mistakes" paper (pdf).


Copyright © 2007 KDnuggets.