Hadoop Maturity Survey: The Tipping Point

AtScale's first global Hadoop maturity survey finds that Hadoop's value greatly increases with the number of nodes deployed; its use for ETL is frequently a transitional stage on the way to higher-value Data Science applications.

AtScale teamed up with Cloudera, MapR and Tableau to release the first global Apache Hadoop maturity survey. Industry adoption of Hadoop offers insight into how the new paradigm of leveraging data to inform business processes is progressing, since Hadoop is one of the most popular tools for scalable, distributed processing of large datasets across clusters of commodity computers.

Early Stages but Increasing Engagement

So far the largest of its kind, the survey saw 2,200 respondents from North and Latin America, Asia and Europe take part. Over 70% of those surveyed currently use Hadoop, and this figure is set to rise, with the remainder of respondents planning to incorporate the platform into their development cycles over the next year. Most current users are still in the early stages of development: over half of respondents (56%) have deployed 25 nodes or fewer. Again, increased future use is predicted, with 76% of all current users planning to increase their engagement over the next quarter.

Vectors of Value

The survey cites several factors as key in realising tangible value from the use of Hadoop. Increased node deployment, self-service access, maturity of use, deployment to production and the driver of adoption all affect the realisation of value. The presence of a direct executive mandate alone was reported to increase value realised by 20%. Deploying the software to production had an even greater effect, as this was found to double reported value.

From ETL to BI

Interestingly, current users of Hadoop primarily employ its functionality for ETL, whereas those planning to adopt the distributed processing system intend to leverage it for business intelligence. This trend perhaps indicates that ETL is a transitional stage, ultimately leading to deploying data science applications and performing analyses in order to gain actionable insights for business processes.

The most popular business intelligence tools differed between companies that have adopted Hadoop and those that have not. Among Hadoop users Tableau was found to be most prevalent, whereas among those yet to adopt the platform Excel won out. Tableau is a very powerful instrument for visualisation, perhaps suggesting that Hadoop users can more easily investigate data trends, as visualisation is a cornerstone of any insight distillation process.

Looking by industry, Online, Telco, and Manufacturing companies are the most mature in their Hadoop usage.

Finally

The survey is one of the first of its kind and provides tremendous insights into the use of Hadoop across different verticals, along with a granular understanding of how the platform is being adopted for distributed data processing globally. However, as there are many different components of Hadoop (MapReduce, HDFS, YARN, etc.), further insights could be gained from understanding whether the companies surveyed leveraged individual components of the environment in isolation or adopted the platform as a whole. Get your own copy here.

For further information, Wayne Eckerson - renowned for his work on analytics and business intelligence - will be hosting a webinar on Tuesday, Oct 6 in which he will break down the survey and discuss the findings in detail.