Interview: Anil Gadre, MapR on What it takes to Automate Data-to-Action?

We discuss how analytics can impact the business “as-it-happens”, merging business analytics with production operations, transition challenges, and the recently announced partnership with Teradata.

Twitter Handle: @hey_anmol

Anil Gadre is the SVP of Product Management at MapR. Prior to MapR, Anil was the EVP of Product Management at Silver Spring Networks, responsible for product strategy, planning and marketing of networking and software products focused on the Smart Grid for the energy industry.

Before that, Anil was with Sun Microsystems, a Fortune 200 technology leader, where he served as EVP of the Application Platform Software organization. He had previously been its Chief Marketing Officer, leading global branding, demand creation and an extensive developer ecosystem program.

He has a BSEE from Stanford University, and an MM degree from the Kellogg School at Northwestern University.

Here is my interview with him:

Anmol Rajpurohit: Q1. Analytics impacting the business "as-it-happens" definitely sounds tempting, but only a few companies have been able to do it successfully. What factors separate winners from losers in the quest for a real-time data-to-action cycle?

Anil Gadre: First of all, this is a continuum and not a specific end state. What that means is that everyone will move towards a more “as-it-happens” or real-time process in their companies. The technology will certainly enable it, but the much bigger issue is the related business processes and decision making that has to go faster now that there is more insight in real time. Finally, different businesses have very different needs for what “as-it-happens” means. What might be very slow for some might be just fine in other businesses.

AR: Q2. Automating data-to-action through merging business analytics and production operations is no trivial endeavor. What approach do you recommend towards achieving this goal? What are the key milestones that can be referred to for tracking progress against this goal?

AG: There is no one single answer to this issue because it is very dependent on the processes inside a business for acquiring, ingesting, processing, and using the data. It is also possible to significantly speed up the data-to-action cycle without needing to have a single integrated cluster. Having said that, we are increasingly seeing customers who are pushing the envelope, planning to do both analytics and operational work in the same cluster because of the time savings resulting from not having to copy and move data around. The data is just “there” for both analytics and operations. The milestones for such an initiative should be based on and driven by the business needs and desired outcomes, and not by the technology.

AR: Q3. The journey towards real-time analytics implies major architectural changes to the underlying data infrastructure. While the long-term benefits are clear, there is no denying that this change comes with various risks such as potential business downtime, uncertain short-term ROI (for money, time and effort), change management, etc. What steps can an organization take to minimize these risks and have a smooth transition?

AG: Actually, there is less major architectural change than people may be worried about. This is one reason why our customers are able to get value quickly and not be stuck in long-term projects. Making sure the transition goes smoothly comes down to the well-known basics of project planning: clear scope, enough detail in the tasks, and of course, a great project manager!

AR: Q4. MapR recently announced the integration of its Hadoop distribution with Teradata QueryGrid. What were the key motivations behind this partnership? How do you compare this solution against the option of DIY (do-it-yourself) deploying a lambda architecture within an organization (for real-time and batch-oriented needs)?

AG: MapR is to Hadoop what Teradata is to data warehousing: best in class in terms of performance, reliability, and production success. Therefore, we share a lot of the same customers with Teradata across major industries who wanted to have better integration between their Hadoop and data warehousing environments. Teradata calls this the “Unified Data Architecture” and technologies like QueryGrid give analysts fast SQL access to any data – whether it is stored in Hadoop, Teradata Aster, or the Teradata data warehouse – on the fly, at run-time, so it’s really fast.

In a way, the MapR integration with Teradata using QueryGrid enables a multi-tiered “Lambda Architecture” in terms of supporting streaming, interactive, and batch within a hybrid and well-orchestrated environment, which reduces data duplication and optimizes performance for various workloads. For example, we share a joint customer in financial services who supports a real-time contextual call center application. Real-time information from the website streams through Apache Storm for transformation within the MapR Distribution including Hadoop, and then it’s integrated with customer information (e.g., most recent transactions, customer lifetime value, propensity to churn scores) coming from Teradata and served up in a customer 360 application for the call service agent.
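
The flow in that call center example (transform a web event as it streams in, then join it with warehouse-side customer attributes) can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not the customer's implementation: the names `CustomerProfile`, `transform`, and `enrich` are hypothetical, a plain list stands in for the Apache Storm stream, and an in-memory dictionary stands in for the lookup against Teradata.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class CustomerProfile:
    # Hypothetical fields standing in for warehouse-side attributes.
    recent_transactions: List[str]
    lifetime_value: float
    churn_score: float


def transform(raw_event: dict) -> dict:
    """Streaming-side cleanup: normalize the customer id and page path."""
    return {
        "customer_id": str(raw_event["customer_id"]).strip(),
        "page": raw_event.get("page", "").lower(),
    }


def enrich(events: Iterable[dict],
           profiles: Dict[str, CustomerProfile]) -> Iterator[dict]:
    """Join each transformed web event with the customer's profile, if any."""
    for raw in events:
        event = transform(raw)
        # Assumption: a dict lookup replaces the real warehouse query.
        profile = profiles.get(event["customer_id"])
        yield {**event, "profile": profile}  # profile is None if unknown


if __name__ == "__main__":
    profiles = {"42": CustomerProfile(["wire transfer"], 1250.0, 0.8)}
    stream = [{"customer_id": " 42 ", "page": "/Fees"}, {"customer_id": "7"}]
    for view in enrich(stream, profiles):
        print(view)
```

Each record yielded by `enrich` is roughly what a customer 360 screen would need for one event: the cleaned-up web activity plus whatever profile data is known for that customer.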

The second part of the interview will be published soon.

Anmol Rajpurohit is a Software Development Intern at Salesforce. He is a data science researcher with extensive interest in innovative applications of analytics, machine learning, and information retrieval. He is a former MDP Fellow and graduate mentor at UCI-Calit2. His novel analytics solution for online education was the runner-up at UCLA Developers' Contest 2014. He has presented his research work at various conferences including IEEE Big Data 2013. He is currently a graduate student (MS, Computer Science) at UC Irvine.