KDnuggets : News : 2002 : n05 : item1

Features


From: Gregory Piatetsky-Shapiro

Date: Mar 4, 2002

Subject: Poll Results: How do you usually store metadata?

Metadata is information about a dataset, e.g. field types and lengths, whether a field is ordered or categorical, and missing values. The previous KDnuggets poll asked: How do you usually store dataset metadata?

Based on 118 votes, the results were:

in a header line                            21%
in a relational DB                          39%
in a commercial data mining tool format     15%
in PMML                                      2%
c45 .names files                             9%
other                                       14%

I was alarmed by a comment from one reader, who wrote:
"Who has time for metadata?"

While metadata maintenance does require time, I view it as absolutely essential for automating steps in the knowledge discovery process and making sure you have correct and reproducible results.

Metadata maintenance is as essential as car maintenance -- if you forget to fill up your car and don't inflate your tires, very soon your car will run out of gas or have an accident. Likewise, data mining without the right metadata is likely to give sub-optimal or even dead wrong results.
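
As a small illustration of how missing metadata can produce wrong results, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the FieldMeta record, the missing_code field, the "-999" convention, and the age values are hypothetical and do not come from the poll or from any particular tool.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class FieldMeta:
    """Hypothetical metadata record for one field of a dataset."""
    name: str
    kind: str                            # "numeric", "ordered", or "categorical"
    missing_code: Optional[str] = None   # raw token that marks a missing value


def numeric_values(raw, meta):
    """Convert raw strings to floats, skipping the declared missing code."""
    return [float(v) for v in raw if v != meta.missing_code]


# Made-up column of ages in which "-999" stands for "unknown".
raw_ages = ["30", "40", "-999", "50"]
age_meta = FieldMeta(name="age", kind="numeric", missing_code="-999")

# Without metadata, -999 is treated as a real age.
naive_mean = mean(float(v) for v in raw_ages)
# With metadata, the missing code is dropped before averaging.
informed_mean = mean(numeric_values(raw_ages, age_meta))

print(naive_mean)     # -219.75
print(informed_mean)  # 40.0
```

The only difference between the two averages is whether the missing-value code was recorded somewhere the program could read it.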

See other comments and full poll results at www.kdnuggets.com/polls/2002/metadata_format.htm



Copyright © 2002 KDnuggets.