Interview: Joseph Sirosh, Microsoft on Azure ML and the Emerging Data Science Economy

We discuss what distinguishes Azure ML from its intense competition, the online machine learning university, current maturity level of Big Data solutions, important skills for data scientists and more.

Joseph Sirosh recently joined Microsoft from Amazon where he was the VP for the Global Inventory Platform and CTO of the core retail business. In this role he had responsibility for the science and software behind Amazon’s supply chain and order fulfillment systems, as well as the central Machine Learning group which he built and led.

During his 9 years at Amazon, he managed a variety of teams including forecasting, inventory, supply chain and fulfillment, fraud prevention systems, data warehouse and a novel data-driven seller lending business. Prior to Amazon, Joseph worked for Fair Isaac Corporation as VP of R&D. Joseph is passionate about Machine Learning and its applications and has been active in the field since 1990.

First part of interview.

Here is the second and last part of my interview with him:

Anmol Rajpurohit: Q5. How do you differentiate Microsoft Azure Machine Learning from other competitive cloud-based analytics services such as those offered by Amazon and IBM?

Joseph Sirosh: We are really differentiated in our ability to build production-ready machine learning APIs with a few clicks. No other solution provides the ability to use drag-and-drop visual tools and open source code like R along with sophisticated machine learning algorithms and build reliable cloud-hosted APIs that are ready to be hooked into production applications. Democratization of complex technology is something Microsoft is known for and machine learning is an area that really benefits from this expertise.

In a broader sense, our differentiation is that Azure Machine Learning is part of a broad, connected data platform that includes services for every stage of the data lifecycle. This includes not only machine learning, but also Hadoop services such as Azure HDInsight, data orchestration and ETL services such as Azure Data Factory, cloud-based complex event processing such as Azure Stream Analytics, and business intelligence tools such as Power BI. This combination enables advanced analytics at an entirely new level of ease and sophistication.

AR: Q6. At the Worldwide Partner Conference (WPC) 2014, Microsoft launched the new online Machine Learning University. What are the key short-term and long-term goals of that initiative? 

JS: The goals of the initiative are to make machine learning more accessible and user friendly so that any developer anywhere in the world can use it as another tool in application building. I've been in this space a long time and haven’t seen advances in ease of use in generations. In the short term, we are empowering our partners and early customers of Azure Machine Learning with Machine Learning University. In the long term, we expect developers, data scientists, and students to get practical training for Azure Machine Learning through this avenue.

AR: Q7. You have been involved with Machine Learning for a long time. In your view, what are the key reasons that we are currently witnessing an immense interest in the field of Machine Learning (which is more than half a century old)?

JS: The explosion of data has a lot to do with it. And the growth in the number of scenarios that benefit from machine learning – from fraud detection to search ranking and ad targeting online. Software like R has also helped empower a lot of data scientists.

While machine learning has been around for a long time, usage was primarily restricted to people with deep skills and deep pockets. The cloud changes this dynamic completely.

Now, you can run compute for pennies on the dollar on the cloud and connect to systems and services that were previously stand-alone. The explosion of online data opens up opportunities for our customers to glean insights and make good choices to improve their business.

AR: Q8. With the focus of Big Data conversations shifting from "Promising Potential" to "Delivered Value", what are your thoughts on the current maturity level of Big Data solutions? What major obstacles have been conquered and what are the key challenges (or opportunities) in near future?

JS: I believe we are in the early phase of creating the infrastructure to handle big data. We are currently in the phase where we are analyzing current data; the next phase will be about predicting future outcomes, which is where solutions such as Azure Machine Learning come in. I recently gave a keynote at the Strata conference in New York about the emerging data science economy. My hypothesis is that we will eventually have millions of analytic APIs in the cloud, just as there is an application economy today with millions of apps. It will be ultimately up to the data scientists and advanced analytic developers themselves to embrace easy cloud tools like Azure Machine Learning and pursue this possibility to make it real.

AR: Q9. You are aggressively hiring talented scientists and engineers for Machine Learning and Data Science. Besides technical acumen, what are the key skills that you are looking for? What characteristics of an individual help you identify whether the person would be a right fit for your team?

JS: The key skills are a real understanding of data science, a great ability to develop services in the cloud, and facility with tools such as R, Python and Hadoop. I also think a key to success on our team, and really for any emerging market like this, is customer focus and a bias for action. I want everyone on my team, not just the data scientists, but sales, marketing, everyone, to be asking, “What is the customer problem we are really solving?” Then, have a strong bias for action, for building prototypes and iterating with the customer towards an end-solution in an agile manner.

AR: Q10. On a personal note, what are the books that you have been reading lately and would like to recommend?

JS: A book I really liked recently is “Running Lean: Iterate from Plan A to a Plan That Works” by Ash Maurya. It’s a great book about an iterative and lean approach to building a successful startup. The agile methodologies mentioned in that book are a great guide to any product innovator.