KDnuggets Home » News » 2010 » Jul » Publications » Hadoop: Tools for Petabyte Analytics

Enterprise-Grade Hadoop: Tools for Petabyte Analytics


 
  
Hadoop is the cornerstone of the next-generation technologies known as NoSQL. Hadoop, like many NoSQL technologies, is still primarily an open-source community and has not yet made the critical transition into a mature enterprise-grade analytics market segment.



Commercializing Enterprise-Grade Hadoop: Tools for Harnessing Petabyte Analytics

James Kobielus, Information Management Blogs, July 6, 2010

Hadoop is riding the hype wave right now. You'll find many IT professionals who know just enough about Hadoop to be dangerous in a cocktail party setting, but not enough to respond comfortably to grilling from the chief technology officer or the geekier business executives.

...

So what exactly is Hadoop, and, just as important, why should enterprise analytics professionals care? As I discussed in my Forrester report late last year on in-database analytics, Hadoop is primarily for advanced analytics in cloud environments. At heart, the Apache Hadoop project defines an analytic processing pushdown workflow model and a distributed analytic file store for analyzing unstructured information sets. But the range of subprojects goes well beyond that to encompass a distributed columnar database, data warehousing infrastructure, MapReduce interface, ad-hoc query language, data collection capability, and other utilities for development and management of Hadoop clusters.
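For readers who have not seen the processing model behind that description, here is a minimal sketch of the map, shuffle, and reduce steps. It is a toy that runs in a single Python process and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample log lines are assumptions made purely for illustration. On a real cluster the framework would run the map and reduce steps in parallel on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence over an iterable of records."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical unstructured input: a few free-text log lines.
    logs = ["error disk full", "warning disk slow", "error network down"]
    print(run_job(logs))
```

The point of the split is that `map_phase` and `reduce_phase` each see only one record or one key at a time, which is what lets a framework distribute them across many machines.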

Consider Hadoop the cornerstone of the sprawling collection of next-generation technologies known as NoSQL. Hadoop, like many NoSQL technologies, is still primarily an open-source community and has not yet made the critical transition into a mature enterprise-grade analytics market segment. However, in recent months Forrester has seen increasing incorporation of Hadoop technologies and interfaces into the solution portfolios of enterprise data warehousing (EDW) vendors (most notably Teradata, IBM, Aster Data, Greenplum, and Vertica) and business intelligence (BI) vendors such as Pentaho and MicroStrategy.

We see most startup activity in the Hadoop space, from companies such as Cloudera and Datameer, focusing on solutions that are completely open: open source, open standards, open architectures, and open to service-oriented integration into any SaaS/cloud-based analytics environment. If you want to check out the range and maturity of Hadoop adoptions built on open-source codebases, visit the Hadoop "powered by" wiki.


