KDnuggets : News : 2003 : n10 : item16

Publications

From: Joseph Mullat
Date: 13 May 2003
Subject: An Approach to Data Mining Using Monotone Systems

Following Ramakrishnan and J. Gehrke (2nd ed., 2000), "Data mining consists of finding interesting trends or patterns in large datasets, in order to guide decisions about future activities." Sometimes a trend or pattern in a dataset can be observed on a much smaller data subset, one that in some sense distinguishes itself better than any other conceivable data subset. Unfortunately, this simple idea leads to non-scalable algorithms: in general, the running time needed to find the best data subset grows in proportion to the number of all conceivable data subsets, which is exponential in the size of the dataset, as is typical of NP-hard problems. Monotonic Systems (the MS-framework) is a technique that ensures the algorithms' scalability at the expense of a monotonicity constraint on the data subsets' "goodness criteria"; hence the technique's name.
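To make the role of the monotonicity constraint concrete, here is a minimal Python sketch. It is not taken from the announcement or the linked papers. It assumes one common formulation: each element has a weight within a subset, removing any element never increases the weights of the others, and the "goodness" of a subset is the smallest weight in it. Under that assumption, repeatedly discarding the weakest element visits only n nested subsets instead of all 2^n, and one of them attains the best goodness. The function name `greedy_peel`, the weight function `degree_within` and the toy graph are all hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

Element = Hashable
WeightFn = Callable[[Element, Set[Element]], float]


def greedy_peel(elements: Iterable[Element], weight: WeightFn) -> Tuple[Set[Element], float]:
    """Return a subset maximising the minimum weight, with that minimum.

    Assumes `weight` is monotone: removing an element from the subset
    never increases the weight of any remaining element.
    """
    current = set(elements)
    best_subset: Set[Element] = set(current)
    best_score = float("-inf")
    while current:
        # Find the weakest element (sorted() makes tie-breaking deterministic).
        weakest = min(sorted(current), key=lambda e: weight(e, current))
        score = weight(weakest, current)
        # Strict '>' keeps the largest subset among those with the top score.
        if score > best_score:
            best_score = score
            best_subset = set(current)
        current.remove(weakest)
    return best_subset, best_score


# Hypothetical toy data: a 4-clique {a, b, c, d} with a chain a-e-f attached.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("a", "e"), ("e", "f")]


def degree_within(node: Element, subset: Set[Element]) -> float:
    """Number of neighbours of `node` inside `subset` (a monotone weight)."""
    return sum(
        1 for u, v in EDGES
        if (u == node and v in subset) or (v == node and u in subset)
    )


kernel, score = greedy_peel("abcdef", degree_within)
print(sorted(kernel), score)
```

On this toy graph the sketch peels off `f` and `e` first and returns the clique `a, b, c, d` with a minimum degree of 3. It calls the weight function on the order of n^2 times, so it runs in polynomial time as long as each weight evaluation does.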

An example of applying the MS-framework to survey datasets, which we hope is accessible to a general audience, may be downloaded at

http://www.datalaundering.com/download/cleaning.pdf.

For those interested in mathematical rigor, a series of articles at a much higher level of abstraction is also available for download:

http://www.datalaundering.com/monotone.htm or http://www.datalaundering.com/furtherd.htm .



Copyright © 2003 KDnuggets.