KDnuggets : News : 2009 : n07 : item37
Publications

From: Bruce Ratner
Date: Mon, 06 Apr 2009
Subject: Variable Selection Methods in Regression: Do they produce bad models?

Variable Selection Methods in Regression: Many Statisticians Know Them, But Few Know They Produce Poorly Performing Models

Variable selection in regression - identifying the best subset among many variables to include in a model - is arguably the hardest part of model building. Many variable selection methods exist. Many statisticians know them, but few know that they produce poorly performing models. These deficient variable selection methods are a miscarriage of statistics because they were developed by debasing sound statistical theory into a misguided pseudo-theoretical foundation.

The purpose of this article is three-fold:

1) To review five widely used variable selection methods, itemize some of their weaknesses, and explain why they are nevertheless used.
2) To present a well-defined enhanced variable selection method, which is a prominent by-product of the GenIQ Model, a machine-learning regression technique.
3) To introduce the GenIQ Model for building database marketing regression models, because a free-form GenIQ model is built concurrently during the enhanced variable selection. Such models seek to maximize cumulative lift (cum lift), a measure of how well a model identifies the top-performing individuals.

http://www.geniq.net/res/variable-selection-methods-produce-bad-models.html
Copyright © 2009 KDnuggets.