KDnuggets : News : 2007 : n08 : item12

Software


Subject: Reduced Error Logistic Regression is not your Grandfather's Logistic Regression

Rice Analytics, a SAS Alliance Partner, is exclusively offering Reduced Error Logistic Regression. The currently available software is a set of macros that requires SAS® version 9.1 with Base SAS and SAS/STAT. In partnership with SAS, orders are also being taken for GUI desktop or multi-user GUI network software for those who prefer a GUI version to SAS macros.

Reduced Error Logistic Regression was developed by Rice Analytics to overcome problems related to excessive error in other forms of Logistic Regression. The method is based upon the famous result by Nobel Laureate Daniel McFadden that the Logit formula arising in Logistic Regression necessarily implies that the unexplained model utility follows an Extreme Value Type I distribution. Reduced Error Logistic Regression explicitly models this unobserved portion of utility as Extreme Value Type I error.
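The link between the Logit formula and Extreme Value Type I error can be checked numerically. The sketch below is only an illustration of that link, not part of the Rice Analytics software: it assumes two alternatives with hypothetical observed utilities v1 and v0, adds independent standard Gumbel (Extreme Value Type I) draws as the unobserved utility, and compares the simulated choice share with the Logit formula. The function names and the utility values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_choice_share(v1, v0, n_draws=200_000):
    """Share of draws in which alternative 1 has the higher total utility."""
    # Unobserved utility: independent standard Gumbel (Extreme Value Type I) draws.
    e1 = rng.gumbel(loc=0.0, scale=1.0, size=n_draws)
    e0 = rng.gumbel(loc=0.0, scale=1.0, size=n_draws)
    return np.mean(v1 + e1 > v0 + e0)

def logit_probability(v1, v0):
    """Logit formula for choosing alternative 1 over alternative 0."""
    return 1.0 / (1.0 + np.exp(-(v1 - v0)))

# Hypothetical observed utilities, chosen only for illustration.
simulated = simulate_choice_share(1.2, 0.5)
closed_form = logit_probability(1.2, 0.5)
print(f"simulated share: {simulated:.4f}   logit formula: {closed_form:.4f}")
```

The two numbers should agree up to simulation noise, because the difference of two independent Gumbel draws follows a logistic distribution.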

Reduced Error Logistic Regression is a fully specified maximum likelihood model that includes parameters to handle both model and error components. The key assumption concerns "symmetrical error constraints", which force a solution in which the probabilities of positive and negative error are symmetrical across all cross-product sums that form the basis of maximum likelihood logistic regression. As the number of independent variables increases, it becomes more and more likely that this symmetry assumption holds. With these symmetrical constraints, the algebra shows that the resulting Logit error estimate is Extreme Value Type I. Because this error component can be reliably estimated and subtracted out with a large enough number of variables, the resulting model parameters are strikingly error-free and do not overfit the data.

Models based upon extremely small sample sizes and very large numbers of variables are possible. For example, it is possible to get reliable and valid Logit regression coefficients in a Reduced Error Logistic Regression model based upon a sample size of 100 observations and 200 independent variables. With this number of variables, it would take a sample size of well beyond 10,000 observations for Standard Logistic Regression to achieve similar accuracy. This is also a dramatic improvement over other forms such as Hierarchical Bayes Logistic Regression.
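To see why 100 observations and 200 variables are a problem for standard maximum likelihood logistic regression, consider the sketch below. It does not implement Reduced Error Logistic Regression; it only illustrates the difficulty described above, using simulated data that is an assumption of the example. With more variables than observations, a coefficient vector that classifies every observation correctly can be found even when the labels are pure noise, so the unpenalized likelihood has no finite maximizer.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 100, 200

# Simulated predictors and labels that are unrelated to them (pure noise).
X = rng.standard_normal((n_obs, n_vars))
y = rng.integers(0, 2, size=n_obs)
signs = 2 * y - 1  # recode labels 0/1 as -1/+1

# With n_vars > n_obs and X of full row rank, X @ w = signs has an exact
# solution; lstsq returns the minimum-norm one.
w, *_ = np.linalg.lstsq(X, signs, rcond=None)

# Every observation falls on the correct side of the hyperplane defined by w.
separated = bool(np.all(np.sign(X @ w) == signs))
print("perfectly separated:", separated)
```

Once the data are perfectly separated, scaling w upward keeps increasing the standard logistic likelihood, so the fitted coefficients say nothing reliable about the underlying relationship.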

Another striking feature of Reduced Error Logistic Regression is that it is basically immune to variable selection error problems. The Logit coefficients in a model based upon a subset of variables are almost perfectly correlated with the Logit coefficients for these same variables in a model based upon the full set of variables. This feature would only be expected if the error due to the unexplained portion of utility were largely removed from the solution. Due to this same feature, outlier error and "missing at random" imputation are handled automatically.

Due to computational efficiencies, this enterprise-level software can rapidly accommodate thousands of variables and automatically produce a most probable model based upon the subset of most important variables. Reduced Error Logistic Regression allows:

  • Simultaneous Categorical and Continuous Independent Variables
  • Modeling of Interactions to Whatever Order is Specified
  • Modeling of Non-Linear Components of Independent Variables (up to the 4th order polynomial)
  • Binomial and Multinomial Models for Categorical Dependent Variables
  • Ordered Logit Models for Ordered Dependent Variables or Small-Interval-Categorized Continuous Dependent Variables (point estimates can be derived)
  • Multilevel, Repeated Measures and Survival Designs including Individual Level Estimates

For many current Logistic or even Linear Regression applications, users can expect significantly better accuracy and speed of model building. Please visit www.RiceAnalytics.com to learn more.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc.


Copyright © 2007 KDnuggets.