KDnuggets Home » News » 2010 » Jan » Events » SFBayACM Feb 22: Algorithmic/Statistical Large-Scale Data Analysis  ( < Prev | 10:n02 | Next > )

SFBayACM Feb 22: Algorithmic and Statistical Perspectives on Large-Scale Data Analysis


 
  
In recent years, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are also useful in practice for solving large-scale scientific and Internet data analysis problems.


Feb 22, SFBayACM.org, Data Mining SIG: Algorithmic and Statistical Perspectives on Large-Scale Data Analysis, Mountain View, CA, at LinkedIn. Presented by Michael Mahoney of Stanford University.

Cost: Free and open to all who wish to attend, though membership is only $20/year. Anyone may join our mailing list at no charge and receive announcements of upcoming events.

Speaker: Michael W. Mahoney, Stanford University

TITLE: "Algorithmic and Statistical Perspectives on Large-Scale Data Analysis"

DESCRIPTION:
Computer scientists and statisticians have historically adopted quite different views on data and thus on data analysis. In recent years, however, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are also useful in practice for solving large-scale scientific and Internet data analysis problems.

After reviewing these two complementary perspectives on data, I will describe two recent examples of improved algorithms that used ideas from both areas in novel ways. The first example concerns improved methods for structure identification from large-scale DNA SNP data, a problem that can be viewed as trying to find good columns or features from a large data matrix. The second example concerns selecting good clusters or communities from a data graph, or demonstrating that there are none, a problem that has wide application in the analysis of social and information networks. Understanding how statistical ideas are useful for obtaining improved algorithms in these two applications may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale scientific and Internet data analysis problems more generally.

