How to Detect and Overcome Model Drift in MLOps
This article looks at model drift and how to detect and overcome it in production MLOps.
Machine learning (ML) is widely regarded as the cornerstone of digital transformation, yet ML models are among the technologies most susceptible to the changing dynamics of a digital landscape. ML models are defined and optimized by the variables and parameters available at the time they are created.
Let us look at the case of an ML model created to track spam emails based on a generalized template of spam emails that may have been proliferating at the time. With this baseline in place, the ML model is able to identify and stop these sorts of emails, thus preventing potential phishing attacks. However, as the threat landscape changes and cybercriminals become smarter, more sophisticated and realistic emails have replaced the old ones. When faced with these newer hacking attempts, ML detection systems operating on variables from prior years will be ill-equipped to classify these new threats properly. This is just one example of Model Drift.
Model drift (or model decay) is the degradation of an ML model’s predictive ability. It is caused by changes in the digital environment and the resulting changes in concepts and data, and it is common in ML models simply because of how machine learning works: a model is fitted to the data that was available when it was trained.
The assumption that all future variables will remain the same as those that were prevalent when the ML model was created is a fertile breeding ground for model drift in MLOps.
For instance, if a model runs in a static environment using static data, its performance shouldn’t degrade, because the data it predicts on comes from the same distribution that was used during training. However, if the model operates in a constantly changing, dynamic environment involving many variables, its performance is expected to change as well.
Types of Model Drift
Model drift can be broadly classified into two main types, depending on whether the change occurs in the target variable or in the predictors: concept drift and data drift, respectively.
1. Concept Drift – Concept drift occurs when the statistical properties of the target variable change. Simply put, if the very nature of what the model is trying to predict changes, the model cannot function as intended.
2. Data Drift – The most common type of model drift, data drift occurs when the statistical properties of the predictors change. As these input variables change, the model becomes prone to failure. A model that works during one time period might not be as effective when applied to a different environment, simply because it was trained on data that no longer reflects the changed variables.
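One way to make the data drift described above concrete is to compare a predictor's distribution at training time with its recent production values. The sketch below is only an illustration, not a method the article prescribes: it computes the two-sample Kolmogorov-Smirnov statistic in plain Python, and the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right

def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so evaluating the gap at every observed value is sufficient.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap

# Hypothetical values of one predictor at training time and in production.
training_values = [1, 2, 3, 4]
production_values = [3, 4, 5, 6]
print(ks_statistic(training_values, production_values))
```

A value near 0 suggests the predictor still looks like the training data, while a value near 1 suggests its distribution has shifted. What counts as "too large" is a judgment call that depends on sample size and the use case.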
Alongside concept drift and data drift, upstream data changes also play a prominent role in model drift. As data moves through a data pipeline, features that are no longer generated and changes in units (such as measurements) can lead to missing or mis-scaled values, which will hamper ML model operations.
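A simple guard against the upstream changes described above is to compare how often each feature is missing in a reference batch and in a recent batch. This is a minimal sketch under stated assumptions: rows are dictionaries, batches are non-empty, and the function names, feature names, and the 5% tolerance are all hypothetical.

```python
def missing_rates(rows, features):
    """Fraction of rows in which each feature is absent or None.
    Assumes `rows` is a non-empty list of dicts."""
    total = len(rows)
    return {
        f: sum(1 for row in rows if row.get(f) is None) / total
        for f in features
    }

def features_with_new_gaps(reference_rows, current_rows, features, tolerance=0.05):
    """Features whose missing rate rose by more than `tolerance`
    between the reference batch and the current batch."""
    ref = missing_rates(reference_rows, features)
    cur = missing_rates(current_rows, features)
    return [f for f in features if cur[f] - ref[f] > tolerance]

# Hypothetical batches: "country" stops being generated upstream.
reference = [{"amount": 1.0, "country": "US"}, {"amount": 2.0, "country": "DE"}]
current = [{"amount": 1.5}, {"amount": 2.5, "country": None}]
print(features_with_new_gaps(reference, current, ["amount", "country"]))
```

A check like this does not catch unit changes, which leave values present but on a different scale. Those are better caught by comparing distributions of the affected feature.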
Addressing Model Drift
Early detection of model drift is critical for maintaining model accuracy, because accuracy decreases over time as the predicted values deviate further from the actual ones. The further this process goes, the more irreparable damage is done to the model as a whole, so catching the problem early is essential. The F1 score, which combines the model’s precision and recall into a single number (their harmonic mean), is a quick way to detect if anything is awry.
Depending on the model’s purpose, a variety of other metrics will matter as well. An ML model designed for medical usage will require a different set of metrics than an ML model designed for business operations. However, the end result is the same: whenever a specified metric drops below a set threshold, there’s a high chance that model drift is happening.
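The threshold check described above can be sketched in a few lines. The example below computes the F1 score for a binary classifier from actual and predicted labels and flags possible drift when it falls below a threshold. The function names and the 0.8 threshold are hypothetical choices for illustration, not values from the article.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def drift_suspected(y_true, y_pred, threshold=0.8):
    """True when the F1 score on recent labelled data is below the threshold."""
    return f1_score(y_true, y_pred) < threshold

# Hypothetical recent labels (1 = spam) and the model's predictions.
actual = [1, 1, 1, 0, 0]
predicted = [1, 0, 0, 0, 1]
print(f1_score(actual, predicted), drift_suspected(actual, predicted))
```

In this made-up batch the model finds one of three spam emails and raises one false alarm, giving an F1 score of about 0.4, well below the assumed threshold.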
However, in many cases measuring the accuracy of a model isn’t possible, particularly when it is difficult to obtain the actual outcomes to compare against the predictions, which remains one of the major challenges in scaling ML models. In this case, past experience of refitting models can help to create a predictive timeline for when drift might occur in a model. With this in mind, the models can be redeveloped at regular intervals to deal with impending model drift.
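One possible way to turn past experience into such a timeline is to fit a straight line to the metric history of earlier model versions and extrapolate when it would cross an acceptable floor. This is only a sketch under a strong assumption (roughly linear decay); the function name and all numbers are hypothetical.

```python
def weeks_until_threshold(weeks, scores, threshold):
    """Fit scores = intercept + slope * week by ordinary least squares and
    return the week at which the fitted line reaches `threshold`.
    Returns None if the fitted scores are not declining.
    Assumes at least two distinct week values."""
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_w
    if slope >= 0:
        return None
    return (threshold - intercept) / slope

# Hypothetical history from a previous model: score by weeks since deployment.
history_weeks = [0, 1, 2, 3]
history_scores = [0.90, 0.88, 0.86, 0.84]
print(weeks_until_threshold(history_weeks, history_scores, threshold=0.80))
```

With this made-up history the score drops by about 0.02 per week, so the line reaches 0.80 at roughly week 5. Scheduling retraining somewhat before that point would be a cautious reading of the estimate.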
The original model can also be kept intact and used as a baseline, from which new models are created that improve on and correct its predictions.
However, when the data changes with time, weighting data based on how recent it is can be important. By ensuring that the models give more weight to recent data and less weight to older data, the ML model will become more robust and better able to manage potential future drift-related changes.
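Recency weighting can be sketched with an exponential decay, where each sample loses half its weight after a fixed number of days. The decay scheme, the function names, and the 30-day half-life are assumptions made for illustration; the article does not specify how weights should be chosen.

```python
def recency_weights(ages_in_days, half_life_days=30.0):
    """Exponential-decay weights: a sample's weight halves
    every `half_life_days` days of age."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]

def weighted_mean(values, weights):
    """Mean of `values` where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical observations that are 0, 30, and 60 days old.
weights = recency_weights([0, 30, 60])
print(weights)  # [1.0, 0.5, 0.25]
print(weighted_mean([10.0, 20.0, 30.0], weights))
```

Weights like these can be passed to training routines that accept per-sample weights, so that recent observations influence the fitted model more than old ones.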
Creating Sustainable Machine Learning Models
There is no catch-all method to ensure that model drift is detected and addressed in a timely manner. Whether it is done through scheduled model retraining or through real-time machine learning, creating a sustainable machine learning model is a challenge in and of itself.
However, the advent of MLOps has simplified the process of retraining models more often and at shorter intervals. It has enabled data teams to automate model retraining, and the most reliable way to trigger the process is by scheduling. With automated retraining, companies can refresh the existing data pipeline with new data within a specific time frame. The good thing is that this doesn’t require any code changes or rebuilding of the pipeline. If a company, however, discovers a new feature or algorithm that was not available during the original model training, then including it when deploying the retrained model can significantly enhance model accuracy.
When deciding how frequently models need to be retrained, there are several variables to consider. Sometimes waiting for the problem to show itself is the only real option (particularly if there is no history to work with). In other situations, models should be retrained based on patterns tied to seasonal changes in variables. What remains constant in this sea of change, however, is the importance of monitoring. Regardless of the schedule or the business domain, monitoring at regular intervals is and always will be the best way to detect model drift.
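The two triggers discussed in this section, a fixed schedule and ongoing monitoring, can be combined in a single decision function. The sketch below is a hypothetical illustration: the function name, the returned reason strings, and the weekly interval are assumptions, and a real pipeline would call its own retraining job where this example only prints.

```python
from datetime import datetime, timedelta

def should_retrain(last_trained, now, interval, latest_score=None, threshold=None):
    """Return the reason retraining is due, or None if it is not.
    The schedule is checked first, then the monitored metric (if provided)."""
    if now - last_trained >= interval:
        return "scheduled"
    if latest_score is not None and threshold is not None and latest_score < threshold:
        return "metric_below_threshold"
    return None

# Hypothetical state: trained 3 days ago, weekly schedule, monitored F1 of 0.72.
now = datetime(2024, 1, 10)
reason = should_retrain(
    last_trained=now - timedelta(days=3),
    now=now,
    interval=timedelta(days=7),
    latest_score=0.72,
    threshold=0.80,
)
print(reason)  # metric_below_threshold
```

Keeping the metric check alongside the schedule means a model that decays faster than expected is retrained before its next scheduled slot.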
While the challenges of managing, detecting, and addressing model drift across thousands of machine learning models may seem daunting, Machine Learning Operationalization Solutions from service providers such as Sigmoid can give you the edge you need to face these issues head-on. Sigmoid’s MLOps practice provides the right mix of data science, data engineering, and DataOps expertise, required to operationalize and scale machine learning to deliver business value, and build an effective AI strategy.
Bio: Bhaskar Ammu is a Senior Data Scientist at Sigmoid. He specializes in designing data science solutions for clients, building database architectures, and managing projects and teams.
Original. Reposted with permission.