KDnuggets : News : 2005 : n10 : item3

Features

From: Gregory Piatetsky-Shapiro
Date: 16 May 2005
Subject: Measuring Quality of Data Mining Limericks

Thanks to the few brave souls who were not afraid to send their poetry to the KDnuggets Limerick contest!

For millennia, poets have not been able to come up with numeric quality measures for their work. This led to disagreements, fights, Aegean wars, and other unfortunate events.

However, data mining has a proud tradition of developing quality measures for rules, lift, data, etc. I relied on this research to develop what is probably the first objective and quantitative "limerick quality measure", which is:

  • one point for each data-mining-related word used (e.g. data, Bayes, regression).
  • five points for a data-mining-related idea (e.g. neural networks are generally better than logistic regression for predictive tasks) or pun (e.g. was Bayes naive?).
Perhaps this will start a completely new direction in both poetry and text mining. In the meantime, I used the "Data Mining Limerick Quality Measure" to rank the submitted limericks.
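
For readers who prefer their poetry executable, here is a minimal Python sketch of the measure. The names (Entry, limerick_score, WORD_POINTS, IDEA_POINTS) are illustrative assumptions, not part of the original contest. Counting the words and ideas is still done by hand by the judge; the code only adds up the points and sorts. The counts shown are the ones reported for the top four entries.

```python
from dataclasses import dataclass

WORD_POINTS = 1  # one point per data-mining-related word
IDEA_POINTS = 5  # five points per data-mining-related idea or pun


@dataclass
class Entry:
    author: str
    words: int  # data-mining-related words, counted by hand
    ideas: int  # data-mining-related ideas or puns, counted by hand


def limerick_score(entry: Entry) -> int:
    """Return the limerick quality measure for one entry."""
    return WORD_POINTS * entry.words + IDEA_POINTS * entry.ideas


# Counts as reported in the contest results for the top four entries.
entries = [
    Entry("Ross Bettinger", words=6, ideas=1),
    Entry("Adam Lynton", words=10, ideas=1),
    Entry("Vishwa Vinay", words=4, ideas=1),
    Entry("John Stultz", words=4, ideas=1),
]

# Highest score first; sorted() is stable, so ties keep their input order.
ranked = sorted(entries, key=limerick_score, reverse=True)
for e in ranked:
    print(f"{e.author}: {limerick_score(e)} points")
```

Because the sort is stable, tied entries stay in the order they were listed, which matches the shared 3rd place.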

The winner is

There once was a data miner
Who was quite the model designer
He used Naive Bayes with gumption
And even with the independence assumption
Managed to predict an attriter!

    Adam Lynton, Australia, 15 points (10 words and the idea that Naive Bayes works well for attrition prediction despite the independence assumption (:-) ).


2nd place goes to


There once was a data miner
Who claimed, "I'm a Forty-Niner."
His main obsession
Was logistic regression
But neural networks predicted finer.

    Ross Bettinger, USA, 11 points (6 words and the idea that neural networks are better than logistic regression for predictive tasks).



3rd place is shared by

There once was a data miner
Who had an encounter with Bayes Sire
The answer they sought
Was he naive or not
And decided to settle the bet on a fiver

    Vishwa Vinay, UK, 9 points (4 words and 5 points for the pun on the Naive Bayes).




There once was a data miner,
Who clamored for data finer,
Her results were awash
Though her techniques were so posh,
"Give me," said she, "a table wider!"

    John Stultz, USA, 9 points (4 words and the idea that a wider table, i.e. more columns, is better for predictive modeling).



Honorable mention to

There once was a data miner
Who had moved on from 'geek' to 'high flier'
Clustered, classified it all
In kernel spaces, big and small
Bending probability with a good prior

    Vishwa Vinay, UK, 8 points.



There once was a data miner
Who lived his life like an outlier
He didn't have a single friend
Because his work day would never end
And as for the wife, he didn't mind her

    Vishwa Vinay, UK, 3 points.




Data mining haiku results will be announced in the next issue.


Copyright © 2005 KDnuggets.