Golden Rules for Data Mining

Gregory Piatetsky-Shapiro answers:

  • Focus on what is actionable.
  • Prepare and clean the data carefully.
  • Verify data analysis steps.
  • Use multiple data mining and machine learning methods.
  • Beware of "false predictors" (also called "information leakers"): fields that appear to predict the outcome too well because they actually record events that happened after the outcome. Find and eliminate them.
  • If the results are too good to be true, you have probably found false predictors.
  • Examine the results carefully and repeat and refine the knowledge discovery process until you are confident.
  • Did I emphasize that you should beware of "false predictors"?
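
One rough way to screen for false predictors is to ask how well each field predicts the outcome entirely on its own. The sketch below is only an illustration, not a method given in the answer above: the churn records, the field names (`plan`, `region`, `cancel_form_filed`), the helper functions, and the 0.95 threshold are all hypothetical. It assumes categorical fields and scores each one by majority vote per value.

```python
from collections import Counter, defaultdict


def single_field_accuracy(rows, field, target):
    """Accuracy of predicting `target` from `field` alone.

    Each distinct value of `field` predicts the outcome it most often
    co-occurs with (majority vote), scored on the same rows.
    """
    outcomes_by_value = defaultdict(Counter)
    for row in rows:
        outcomes_by_value[row[field]][row[target]] += 1
    correct = sum(c.most_common(1)[0][1] for c in outcomes_by_value.values())
    return correct / len(rows)


def suspect_fields(rows, target, threshold=0.95):
    """Return {field: accuracy} for fields that predict `target` suspiciously well."""
    fields = [f for f in rows[0] if f != target]
    scores = {f: single_field_accuracy(rows, f, target) for f in fields}
    return {f: s for f, s in scores.items() if s >= threshold}


# Hypothetical customer records. `cancel_form_filed` is only filled in
# after a customer has already churned, so it leaks the outcome.
rows = [
    {"plan": "basic", "region": "east", "cancel_form_filed": "yes", "churned": 1},
    {"plan": "basic", "region": "west", "cancel_form_filed": "no", "churned": 0},
    {"plan": "pro", "region": "east", "cancel_form_filed": "no", "churned": 0},
    {"plan": "pro", "region": "west", "cancel_form_filed": "yes", "churned": 1},
    {"plan": "basic", "region": "east", "cancel_form_filed": "no", "churned": 0},
    {"plan": "pro", "region": "west", "cancel_form_filed": "yes", "churned": 1},
]

print(suspect_fields(rows, "churned"))
```

A flagged field is a candidate for review, not proof of leakage: the decisive check is still whether the field was recorded before or after the outcome. High-cardinality fields such as record IDs will also score near-perfectly under this in-sample test, so they need to be excluded or scored on held-out data.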