Predictive Analytics Marketplaces: Viable Solution to the Data Scientist Shortage Crisis?
We examine strengths and weaknesses of several approaches for alleviating the shortage of data scientists.
The current shortage of data scientists is a widely discussed and well-documented topic. It was thrust into the limelight by the May 2011 McKinsey Global Institute report, which claimed that “by 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”.
These estimates may yet turn out to be on the higher side (read the recent KDnuggets article “How Many Data Scientists are out there?” for an in-depth analysis), but the shortage is real. Sensing an opportunity, the industry has responded with two approaches:
Approach One: Create More Data Scientists
This is the more straightforward approach. Many universities and training schools have begun to offer degree and certificate programs in Data Science.
See KDnuggets Analytics, Data Mining, Data Science Certificates for a comprehensive list. Further, the catalogs of companies offering massive open online courses (MOOCs), such as Udemy, Pluralsight, Coursera and Udacity, now include courses in Data Science.
Approach Two: Predictive Analytics Tools for Non-Data Scientists
Many companies (startups as well as established ones) offer tools that aim to empower semi-technical business users to do predictive analytics on their own, i.e., unassisted by professional data scientists. These tools are targeted at users who do not have formal training or expertise in areas such as statistical analysis, data mining, predictive analytics and machine learning, but are reasonably proficient with Microsoft Excel-like solutions.
These companies often state their mission as “democratizing predictive analytics” and refer to their offerings variously as the “last mile of business analytics” or “Excel for data science” or “self-service predictive analytics”. They all share the same fundamental approach to the data scientist shortage: swell the population of those capable of doing data science by converting semi-technical business users into data scientists through tools that have a shorter learning curve.
While the two approaches listed above are indeed doing their part to address the data scientist shortage, they are not sufficient by themselves. We need to explore additional approaches.
Approach Three: Make Data Scientists More Productive with Predictive Analytics Marketplaces
Marketplaces enable data scientists to promote, showcase and sell pre-built, proven predictive analytics solutions to a global community of buyers. At the risk of oversimplification, one can think of these marketplaces as ‘Apple iTunes’ or ‘Google Play’, specifically optimized and engineered for predictive analytics solutions. This KDnuggets article lists some of the marketplaces currently operating.
Build Once, Reuse Many Times
Today, data scientists typically build one-off, custom models from scratch every time. This is especially true of consultants who build custom solutions for their customers. A data scientist will often build from scratch even if he or she has previously built a similar model, for a similar business problem, for another customer. As a result, over the course of a career, much of a data scientist's work is duplicative, which lowers overall productivity.
A predictive analytics marketplace enables a data scientist to create a model just once and then customize it repeatedly, rather than start from scratch each time. With an efficient marketplace, a data scientist can, over the same period, build many more distinct models for different customers than would be possible by repeatedly recreating the same models from scratch.
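To make the "build once, customize many times" idea concrete, here is a minimal Python sketch. It is an illustration only and does not describe how any particular marketplace works: the `ModelTemplate` class, its field names, the feature names and the weights are all hypothetical. It assumes a pre-built model can be reduced to a weighted score over named features, and that customizing it for a new customer means remapping column names and adjusting a decision threshold.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelTemplate:
    """Hypothetical pre-built model: a weighted score over named features."""
    weights: Dict[str, float]
    threshold: float = 0.5
    # Maps a template feature name to the column name used in a customer's data.
    column_map: Dict[str, str] = field(default_factory=dict)

    def customize(self,
                  column_map: Optional[Dict[str, str]] = None,
                  threshold: Optional[float] = None) -> "ModelTemplate":
        """Return a customer-specific copy; the original template is unchanged."""
        return replace(
            self,
            column_map=self.column_map if column_map is None else dict(column_map),
            threshold=self.threshold if threshold is None else threshold,
        )

    def score(self, row: Dict[str, float]) -> float:
        """Weighted sum of features; a column absent from the row counts as 0."""
        total = 0.0
        for feature, weight in self.weights.items():
            column = self.column_map.get(feature, feature)
            total += weight * row.get(column, 0.0)
        return total

    def predict(self, row: Dict[str, float]) -> bool:
        """True when the score reaches the decision threshold."""
        return self.score(row) >= self.threshold


# Built once (illustrative weights, not taken from any real model).
base = ModelTemplate(weights={"support_calls": 0.25, "months_inactive": 0.5})

# Customized for a customer whose data calls the first feature "tickets".
customer_a = base.customize(column_map={"support_calls": "tickets"}, threshold=1.0)

row = {"tickets": 2, "months_inactive": 1}
print(customer_a.score(row), customer_a.predict(row))
```

The template is kept immutable so that each customization yields a separate model and the original stays available for the next customer. A real solution would of course involve retraining or recalibration on the customer's data, which this sketch leaves out.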
We also expect that a marketplace will enable data scientists to monetize their intellectual property more easily, which we hope will encourage more data scientists to join the marketplace and produce more innovative predictive analytics solutions.
Reduce Risk of Failure
The efforts of data scientists are largely wasted when their prediction models fail to deliver the desired business value. According to some experts, almost half of all predictive analytics projects suffer this fate. Predictive analytics marketplaces can reduce this risk by enabling data scientists to start with a proven, pre-built model that has fewer unknowns than a model built entirely from scratch.
Predictive Analytics Marketplaces: Can They Really Make a Difference?
At this early stage it is difficult to know how much of an impact predictive analytics marketplaces can have in addressing the data scientist shortage issue. A lot depends on whether:
- Data scientists can build reusable prediction models, i.e., models that are tolerant of deviations in the datasets and can therefore be used with datasets that differ slightly from each other, without requiring extensive customization.
- Marketplaces can develop sustainable business models.
- Marketplaces are accepted by the global community of data scientists, which depends partly on marketing, evangelism and execution.
It is early days and we have a way to go, but the journey promises to be exciting and certainly worth the effort.
Madhu M. Reddy is the CEO & Founder of Snap Analytx. Previously, he held senior positions at Microsoft, Oracle, and HP.