KDnuggets : News : 2009 : n07 : item38 < PREVIOUS | NEXT >

Publications


Subject: The Data Mining Renaissance

GigaOM, Gary Orenstein | Friday, April 10, 2009

We are in the midst of a data mining renaissance.

Traditionally, data warehousing implementations were large, complex and expensive, meaning only the top-ranking companies could afford them. Teradata pioneered the initial market for corporate data warehousing solutions and still maintains a segment lead, something HP’s CEO Mark Hurd knows all too well. More recent entrants into the data warehousing and intelligence market, like Netezza, have emerged with cost-effective, appliance-based approaches. Others in this arena include Greenplum, recent Microsoft acquisition DATAllegro and, of course, IBM, Oracle and SAP.

But the web changed the way we radiate and consume information and, in doing so, created a new opportunity to measure and monetize it. Faced with more user data, logging information, and web content than anyone thought one system could handle, the major web companies developed highly scaled data warehousing solutions themselves. Armed with these tools, they improved customer resonance by building better recommendation engines, more targeted advertising networks and more intricate campaigns.

The preferred architectural model for this web-derived data warehouse - a combination of low-cost server hardware, distributed systems and open-source software - set off an innovation path that outpaced the commercial market. What once was a very expensive proposition - to amass the computing power, storage capacity and network bandwidth to run a high-end data warehouse and analytics engine - is now readily available on-site or in the cloud.

The current software favorite for large-scale data processing and analytics is Hadoop, an open-source implementation of MapReduce.

Read more.



Copyright © 2009 KDnuggets.