KDnuggets : Newsletter : 1999 Issues : 99:10 Contents :

KDnuggets 99:10, item 6, Publications:

Date: Tue, 20 Apr 1999 22:50:36 -0700 (PDT)
From: Hillol Kargupta <hillol@eecs.wsu.edu>
Subject: Distributed Data Mining Paper

The following distributed data mining paper is currently available from 
http://www.eecs.wsu.edu/~hillol/pubs.html


Title:   Distributed Multivariate Regression Using Wavelet-based
	 Collective Data Mining

Authors: Daryl E. Hershberger and Hillol Kargupta
	 School of EECS, Washington State University 
	 Technical Report EECS 99-002

Abstract:
This paper presents a method for distributed multivariate regression
using wavelet-based Collective Data Mining (CDM). The method
seamlessly blends machine learning and information theory with the
statistical methods employed in multivariate regression to provide an
effective data mining technique for use in a distributed data and
computation environment. Evaluation of the method in terms of model
accuracy as a function of appropriateness of the selected wavelet
function, relative number of non-linear cross-terms, and sample size
demonstrates that accurate multivariate regression models can be
generated from distributed, heterogeneous data sets with minimal data
communication overhead compared to that required to aggregate a
centralized data set. Application of this method to Linear
Discriminant Analysis, which is closely related to multivariate
regression, produced classification results on the Iris data set that
are comparable to those obtained with centralized data analysis.

Hillol Kargupta
School of EECS, Washington State University
http://www.eecs.wsu.edu/~hillol

Copyright © 1999 KDnuggets