Survey: Machine Learning Projects Still Routinely Fail to Deploy

Eric Siegel highlights the chronic under-deployment of ML projects in his industry survey and book "The AI Playbook": only 22% of data scientists say their revolutionary initiatives usually deploy, with a lack of stakeholder visibility and detailed planning as key issues.

How often do machine learning projects reach successful deployment? Not often enough. There's plenty of industry research showing that ML projects commonly fail to deliver returns, but precious few studies have gauged the ratio of failure to success from the perspective of data scientists – the folks who develop the very models these projects are meant to deploy.

Following up on a data scientist survey that I conducted with KDnuggets last year, this year's industry-leading Data Science Survey, run by ML consultancy Rexer Analytics, addressed the question – in part because Karl Rexer, the company’s founder and president, allowed yours truly to participate, driving the inclusion of questions about deployment success (part of my work during a one-year analytics professorship I held at UVA Darden).

The news isn't great. Only 22% of data scientists say their "revolutionary" initiatives – models developed to enable a new process or capability – usually deploy. 43% say that 80% or more fail to deploy.

Across all kinds of ML projects – including refreshing models for existing deployments – only 32% say that their models usually deploy.

Here are the detailed results of that part of the survey, as presented by Rexer Analytics, breaking down deployment rates across three kinds of ML initiatives:

[Chart: deployment rates for existing, new, and revolutionary ML initiatives]

Key:

  • Existing initiatives: Models developed to update/refresh an existing model that's already been successfully deployed
  • New initiatives: Models developed to enhance an existing process for which no model was already deployed
  • Revolutionary initiatives: Models developed to enable a new process or capability

The Problem: Stakeholders Lack Visibility and Deployment Isn't Fully Planned For

In my view, this struggle to deploy stems from two main contributing factors: endemic under-planning and business stakeholders lacking concrete visibility. Many data professionals and business leaders haven’t come to recognize that ML’s intended operationalization must be planned in great detail and pursued aggressively from the inception of every ML project.

In fact, I've written a new book about just that: The AI Playbook: Mastering the Rare Art of Machine Learning Deployment. In this book, I introduce a deployment-focused, six-step practice for ushering machine learning projects from conception to deployment that I call bizML.

An ML project’s key stakeholder – the person in charge of the operational effectiveness targeted for improvement, such as a line-of-business manager – needs visibility into precisely how ML will improve their operations and how much value the improvement is expected to deliver. They need this visibility to ultimately greenlight a model's deployment and, before that, to weigh in on the project's execution throughout the pre-deployment stages.

But ML's performance often isn't measured! When the Rexer survey asked, "How often does your company / organization measure the performance of analytic projects?" only 48% of data scientists said "Always" or "Most of the time." That's pretty wild. It ought to be more like 99% or 100%.

And when performance is measured, it's in terms of technical metrics that are arcane and mostly irrelevant to business stakeholders. Data scientists know better, but generally don’t follow through – in part because ML tools generally only serve up technical metrics. According to the survey, data scientists rank business KPIs like ROI and revenue as the most important metrics, yet they list technical metrics like lift and AUC as the ones most commonly measured.

Technical performance metrics are “fundamentally useless to and disconnected from business stakeholders,” according to Harvard Data Science Review. Here’s why: They only tell you the relative performance of a model, such as how it compares to guessing or another baseline. Business metrics tell you the absolute business value the model is expected to deliver – or, when evaluating after deployment, that it has proven to deliver. Such metrics are essential for deployment-focused ML projects.
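
To make the relative-versus-absolute distinction concrete, here is a minimal sketch in Python, using scikit-learn, synthetic data, and hypothetical campaign economics of my own choosing ($50 profit per response, $2 cost per contact – illustrative figures, not from the survey). Lift and AUC only say how much better the model scores than chance; the profit figure is a dollar amount a business stakeholder can actually weigh.

import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic stand-in data: 10,000 prospects, 1 = responded to the campaign.
rng = np.random.default_rng(seed=0)
y_true = (rng.random(10_000) < 0.05).astype(int)            # ~5% base response rate
y_score = np.clip(0.3 * y_true + rng.random(10_000), 0, 1)  # imperfect model scores

# Technical (relative) metrics: how the model compares to chance.
auc = roc_auc_score(y_true, y_score)
top_decile = np.argsort(y_score)[::-1][: len(y_score) // 10]
lift_at_10 = y_true[top_decile].mean() / y_true.mean()      # lift in the top decile

# Business (absolute) metric: expected profit of contacting only the top decile,
# under the assumed (hypothetical) economics above.
PROFIT_PER_RESPONSE = 50.0   # dollars earned per responder
COST_PER_CONTACT = 2.0       # dollars spent per contact
profit = (y_true[top_decile].sum() * PROFIT_PER_RESPONSE
          - len(top_decile) * COST_PER_CONTACT)

print(f"AUC: {auc:.2f} | Lift@10%: {lift_at_10:.1f}x | Expected profit: ${profit:,.0f}")

All three numbers come from the same model, but only the last one speaks the stakeholder's language.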

 

The Semi-Technical Understanding Business Stakeholders Need

Beyond access to business metrics, business stakeholders also need to ramp up. When the Rexer survey asked, "Are the managers and decision-makers at your organization who must approve model deployment generally knowledgeable enough to make such decisions in a well-informed manner?" only 49% of respondents answered "Most of the time" or "Always."

Here's what I believe is happening. The data scientist's "client," the business stakeholder, often gets cold feet when it comes down to authorizing deployment, since it would mean making a significant operational change to the company's bread and butter, its largest-scale processes. Lacking the contextual framework, they wonder, for example, "How am I to understand how much this model, which performs far shy of crystal-ball perfection, will actually help?" Thus the project dies. Then, creatively putting some kind of positive spin on the "insights gained" serves to neatly sweep the failure under the rug. AI hype remains intact even while the potential value, the purpose of the project, is lost.

On this topic – ramping up stakeholders – I'll plug my new book, The AI Playbook, just one more time. While covering the bizML practice, the book also upskills business professionals by delivering a vital yet friendly dose of semi-technical background knowledge that all stakeholders need in order to lead or participate in machine learning projects, end to end. This puts business and data professionals on the same page so that they can collaborate deeply, jointly establishing precisely what machine learning is called upon to predict, how well it predicts, and how its predictions are acted upon to improve operations. These essentials make or break each initiative – getting them right paves the way for machine learning’s value-driven deployment.

It’s safe to say that it’s rocky out there, especially for new, first-try ML initiatives. As the sheer force of AI hype loses its ability to compensate for realized value that falls short of what was promised, there'll be more and more pressure to prove ML's operational value. So I say, get out ahead of this now – start instilling a more effective culture of cross-enterprise collaboration and deployment-oriented project leadership!

For more detailed results from the 2023 Rexer Analytics Data Science Survey, click here. It is the largest survey of data science and analytics professionals in the industry, consisting of approximately 35 multiple-choice and open-ended questions that cover much more than deployment success rates, spanning seven general areas of data mining science and practice: (1) Field and goals, (2) Algorithms, (3) Models, (4) Tools (software packages used), (5) Technology, (6) Challenges, and (7) Future. It is conducted as a service (without corporate sponsorship) to the data science community, and the results are usually announced at the Machine Learning Week conference and shared via freely available summary reports.

This article is a product of the author’s work while he held a one-year position as the Bodily Bicentennial Professor in Analytics at the UVA Darden School of Business, which ultimately culminated with the publication of The AI Playbook: Mastering the Rare Art of Machine Learning Deployment.

Eric Siegel, Ph.D. is a leading consultant and former Columbia University professor who helps companies deploy machine learning. He is the founder of the long-running Machine Learning Week conference series, the instructor of the acclaimed online course 'Machine Learning Leadership and Practice – End-to-End Mastery,' executive editor of The Machine Learning Times, and a frequent keynote speaker. He wrote the bestselling Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, which has been used in courses at hundreds of universities, as well as The AI Playbook: Mastering the Rare Art of Machine Learning Deployment. Eric’s interdisciplinary work bridges the stubborn technology/business gap. At Columbia, he won the Distinguished Faculty award when teaching the graduate computer science courses in ML and AI. Later, he served as a business school professor at UVA Darden. Eric also publishes op-eds on analytics and social justice.