KDnuggets Home » News » 2010 » Apr » Publications » Paradox of Overfitting  ( < Prev | 10:n09 | Next > )

The Paradox of Overfitting


 
  


Overfitting, a problem closely tied to model inaccuracy, is as old as model building itself; it is part and parcel of the modeling process. An overfitted model is one that approaches reproducing the training data on which it is built, by "capitalizing on the idiosyncrasies" of that data. The model absorbs those idiosyncrasies by including extra, unnecessary variables, interactions, and constructed variables, none of which are part of the sought-after predominant pattern in the data. As a result, a major characteristic of an overfitted model is that it has too many variables: it is too complex.
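The effect of extra, unnecessary variables can be illustrated with a small simulation. The sketch below is not taken from the article; it assumes a synthetic data set with one genuinely predictive variable and 25 pure-noise variables, fitted by ordinary least squares with NumPy. All names (`fit_and_score`, `make_data`, the sample sizes, and the coefficient 2.0) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_and_score(X_train, y_train, X_test, y_test):
    """Fit ordinary least squares; return (training MSE, holdout MSE)."""
    # Prepend an intercept column to both design matrices.
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    train_mse = float(np.mean((A_train @ coef - y_train) ** 2))
    test_mse = float(np.mean((A_test @ coef - y_test) ** 2))
    return train_mse, test_mse

# Assumed setup: small training sample, large holdout sample.
rng = np.random.default_rng(0)
n_train, n_test, n_noise = 30, 1000, 25

def make_data(n):
    """One real predictor, n_noise irrelevant predictors, noisy response."""
    signal = rng.normal(size=(n, 1))
    noise_vars = rng.normal(size=(n, n_noise))  # unrelated to y
    y = 2.0 * signal[:, 0] + rng.normal(size=n)
    return signal, noise_vars, y

sig_tr, noise_tr, y_tr = make_data(n_train)
sig_te, noise_te, y_te = make_data(n_test)

# Simple model: only the variable that carries the predominant pattern.
simple_train, simple_test = fit_and_score(sig_tr, y_tr, sig_te, y_te)

# Overfitted model: the same variable plus all the unnecessary ones.
over_train, over_test = fit_and_score(
    np.hstack([sig_tr, noise_tr]), y_tr,
    np.hstack([sig_te, noise_te]), y_te,
)

print(f"simple model:     train MSE={simple_train:.3f}  holdout MSE={simple_test:.3f}")
print(f"overfitted model: train MSE={over_train:.3f}  holdout MSE={over_test:.3f}")
```

Because the simple model is nested inside the larger one, the larger model's training error can never be higher: the extra variables let it move closer to reproducing the training data. Comparing the holdout errors shows whether that apparent gain carries over to new data or merely reflects idiosyncrasies of the training sample.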

Read the rest at

www.geniq.net/res/The-Paradox-of-Overfitting.html

