Software Interfaces for Machine Learning Deployment
While building a machine learning model might be the fun part, it won't do much for anyone else unless it can be deployed into a production environment. How to implement machine learning deployments is a special challenge with differences from traditional software engineering, and this post examines a fundamental first step -- how to create software interfaces so you can develop deployments that are automated and repeatable.
By Luigi Patruno, Data Scientist and the founder of ML in Production.
In our previous post on machine learning deployment, we introduced what it means to deploy a machine learning model. We learned that in order to make the predictions from a trained model available to users and other software systems we need to consider a number of factors including how frequently predictions should be generated and whether predictions should be generated on a single sample of data or batch of samples at a time. In this post, we’ll begin to examine how to implement the deployment process.
Whereas many blog posts rush directly to implementing Flask APIs or using workflow schedulers, we’re going to start at a more fundamental level. We’ll begin by discussing software interfaces, which can be thought of as the boundaries between pieces of software. An analogy is that a piece of software is a puzzle piece, and an entire software system is the completed puzzle. When properly designed, interfaces allow you to connect many different software components, leading to large and complex projects.
In terms of ML deployment, well-constructed interfaces facilitate reproducible, automated, plug-and-play deployments. A good interface lets you easily roll out model updates, version control the models you deploy, and more.
Let’s get started!
What’s an Interface?
Imagine a manager who assigns an employee the task of creating a report. A good manager might say: "I need you to produce a report with the following charts and figures. To produce that report, use customer transaction data." The manager has explicitly defined the desired outcome (the report) and hinted at a methodology (use of the customer transaction data).
In contrast, a bad manager might do any of the following:
- Not specify the input – Ask for the report but not specify which data to use or hint at whom the employee should speak with to discover appropriate datasets.
- Not make the deliverable clear – Give the employee a bunch of data but not tell the employee what should be produced.
- Micromanage – Tell the employee what tools to use to produce the report, what steps to follow, and promise that any deviation from this plan will be met with swift and firm punishment.
Software interfaces are like managers. A good interface explicitly states the necessary inputs and the output it produces. For example, an interface implemented as a function will list all required arguments and what’s returned by the function. Interfaces can be thought of as the "boundaries" between separate chunks of software that define how different pieces of software communicate with one another. When interfaces are constructed well, different software, even software written by developers working on separate teams or companies, can communicate and work in tandem.
Software engineers are taught to focus on the interfaces they develop rather than how the functions are implemented. The implementation is important, but it can always be updated. It's significantly harder to update an interface after it's released, especially if your interface is external-facing. Therefore, time invested in defining an interface is time well spent.
A Basic Interface for Machine Learning Models
How would a software engineer think about what a machine learning model actually does? Abstractly speaking, a model accepts data, acts on that data in some way, and then returns a result. It’s really that simple. How the model is acting on the data could be incredibly involved, like a forward pass of a convolutional neural network applying convolutions to tensors of image data, but these are implementation details.
The boundary of a machine learning model is made up of the inputs to the model, i.e. the features, and the output(s) the model predicts. Therefore a well-constructed interface must be built with both the input features and predicted outputs in mind. To illustrate, let’s define this interface with a simple function:
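A minimal sketch of such a function in Python might look like the following. The docstring wording and the choice to raise NotImplementedError are assumptions made for illustration; what matters is the signature and the documented shapes.

```python
def predict(model, input_features):
    """Return a single prediction for a single sample.

    Args:
        model: A trained model object.
        input_features: numpy array of shape (1, n), where n is the
            dimension of the feature vector.

    Returns:
        The model's prediction for the given feature vector.
    """
    # Deliberately unimplemented: only the contract is defined here.
    raise NotImplementedError
```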
This function takes as its input a model and a set of input_features and returns a prediction. Notice that we haven’t implemented the function, i.e., we haven’t written how the function combines the model and the features to generate the prediction. We’ve simply created a contract or a promise – we guarantee the function will return a prediction if the caller provides a model and input_features.
Multiple Interfaces for Machine Learning Models
The predict() function we defined accepts a single feature vector and returns a single prediction. How do we know this? The documentation states that input_features is a NumPy array of shape (1, n), where n is the dimension of the feature vector. This is great if your model only needs to predict a single instance at a time, but not so great if the model is also expected to predict on batches of samples. You could work around this by writing for-loops, but a loop that predicts one sample at a time is unlikely to be efficient. Instead, we should define another method that directly handles the batch case. Let's call it predict_batch:
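Here is one way the batch contract could be written down, again as an unimplemented sketch. The shape (m, n) for a batch of m samples is an assumption that extends the (1, n) convention described above.

```python
def predict_batch(model, input_features):
    """Return one prediction per sample in a batch.

    Args:
        model: A trained model object.
        input_features: numpy array of shape (m, n), where m is the
            number of samples and n is the dimension of each feature
            vector.

    Returns:
        A sequence of m predictions, one per row of input_features.
    """
    # Deliberately unimplemented: only the contract is defined here.
    raise NotImplementedError
```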
This method defines a contract whereby it promises to return a batch of predictions if a model and a batch of input features are provided. Again, we haven't implemented the method – that's left to the developer of the method. The developer may choose to use a loop and call predict over and over. Or the developer may do something else. This is irrelevant for the purposes of deployment. What does matter is that we have two interfaces: one that predicts a single sample and another that predicts a batch of samples.
Machine Learning Object Oriented Programming – MLOOP
So far we’ve ignored the model parameter required by both the predict and predict_batch methods. Let me explain why this is problematic for machine learning.
Most engineers developing machine learning models today want to use the best tool available. If the engineer is building a classic model, like logistic regression or random forest, the engineer might choose to use scikit-learn. But for deep learning, that engineer might choose to use TensorFlow or PyTorch. Even within classical ML, the engineer may opt for the xgboost implementation of gradient boosted trees. The model objects from each library have slightly different APIs, and we can't predict what APIs future ML libraries will implement. This would make the implementations of our interfaces very messy. For instance, we DO NOT want our implementation to look like this:
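The sketch below shows the kind of branching we mean. The helper _defining_libraries and the idea of dispatching on the package name are hypothetical, chosen only so the example runs without every library installed; a real version might use isinstance checks instead, with the same maintenance problem.

```python
def _defining_libraries(model):
    """Top-level packages of every class in the model's inheritance chain."""
    return {cls.__module__.split(".")[0] for cls in type(model).__mro__}


def predict(model, input_features):
    """Anti-pattern: one function that must know about every ML library."""
    libraries = _defining_libraries(model)

    if "sklearn" in libraries:
        # scikit-learn style estimators take a 2D array and return an array.
        return model.predict(input_features)[0]
    elif "torch" in libraries:
        import torch  # imported lazily so the other branches do not need it

        # PyTorch modules are called directly on a tensor.
        with torch.no_grad():
            output = model(torch.as_tensor(input_features, dtype=torch.float32))
        return output[0].tolist()
    # ...and one more branch for every library the team adopts.
    raise TypeError(f"Unsupported model type: {type(model)!r}")
```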
This implementation would be hard to maintain and would make it difficult to debug runtime errors. Also, imagine what would happen if we wanted to pass additional parameters to predict when using one model but not another. For instance, what if we wished to pass additional parameters only when predicting with an sklearn model? The number of arguments to the function would grow, but these parameters would be useless for non-sklearn models. How would we describe that in the documentation? These are just a few reasons why object-oriented programming, that is, creating classes and objects, is preferred.
Our interface is composed of two methods: predict and predict_batch. Let’s define a base class with these two methods:
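A minimal version of such a base class could look like this. The class name Model comes from the discussion that follows; the docstrings and the use of NotImplementedError (rather than, say, the abc module) are assumptions of this sketch.

```python
class Model:
    """Base class that defines the prediction interface."""

    def predict(self, input_features):
        """Return a single prediction for a single feature vector."""
        raise NotImplementedError

    def predict_batch(self, input_features):
        """Return one prediction per row of a batch of feature vectors."""
        raise NotImplementedError
```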
This base class acts as a template for our data science team. If a data scientist wants to use scikit-learn models, they just need to subclass the Model class and implement the necessary methods. If another data scientist wants to use TensorFlow, no problem, just create a TensorFlow subclass! To illustrate, let's create the sklearn subclass:
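One possible shape for that subclass is sketched below. The name SklearnModel and the constructor that wraps an already fitted estimator are assumptions; the base class is repeated so the example runs on its own.

```python
import numpy as np


class Model:
    """Base class that defines the prediction interface."""

    def predict(self, input_features):
        raise NotImplementedError

    def predict_batch(self, input_features):
        raise NotImplementedError


class SklearnModel(Model):
    """Adapts a fitted scikit-learn estimator to the Model interface."""

    def __init__(self, model):
        self.model = model  # assumed to be an already fitted estimator

    def predict(self, input_features):
        # scikit-learn expects 2D input, so reshape one sample to (1, n).
        row = np.asarray(input_features).reshape(1, -1)
        return self.model.predict(row)[0]

    def predict_batch(self, input_features):
        # A batch is already 2D with shape (m, n), so pass it straight through.
        return self.model.predict(np.asarray(input_features))
```

Any object with a scikit-learn style predict method can be wrapped this way, for example SklearnModel(fitted_estimator).predict(feature_vector).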
Since sklearn Predictors expect 2D input, we reshaped the input_features argument in the predict method. This is a key benefit of the object-oriented approach. We can define the interface that is relevant for the types of problems we’re solving AND take advantage of excellent 3rd party machine learning libraries!
And the benefits don’t stop there. We can add additional methods that simplify our ML workflows. For example, once a model has been trained, we typically need a way to serialize the model and then deserialize it at inference time. Hence, we can add two methods, serialize() and deserialize(), to our interface. We can even provide default implementations of these methods in the base Model class and create library-specific implementations in the subclasses.
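As a rough sketch, default implementations could rely on Python's pickle module, with subclasses overriding them where a library has its own save format. The constructor that stores the wrapped model on self.model and the path argument are assumptions made for this example.

```python
import pickle


class Model:
    """Base class with default serialization based on pickle."""

    def __init__(self, model):
        self.model = model  # the underlying library-specific model object

    def serialize(self, path):
        """Default: pickle the wrapped model object to `path`."""
        with open(path, "wb") as f:
            pickle.dump(self.model, f)

    @classmethod
    def deserialize(cls, path):
        """Default: load a pickled model object from `path` and wrap it."""
        # Only unpickle files from sources you trust.
        with open(path, "rb") as f:
            return cls(pickle.load(f))
```

Because deserialize is a classmethod, calling it on a subclass returns an instance of that subclass, so library-specific wrappers get the default behavior for free.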
Additional examples of useful interface methods include moving serialized models from a local filesystem to some model store or remote filesystem like S3. There’s no limit to the methods you can add.
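To make this concrete, here is a small sketch that publishes a serialized model into a directory acting as a model store. The function name publish_model and the name/version directory layout are hypothetical; with S3, the copy step would be replaced by an upload through a client library such as boto3.

```python
import shutil
from pathlib import Path


def publish_model(local_path, store_dir, name, version):
    """Copy a serialized model into store_dir/name/version/ and return its path."""
    destination = Path(store_dir) / name / version / Path(local_path).name
    # Create the versioned directory if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(local_path, destination)
    return destination
```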
Creating good interfaces upfront will save your machine learning team A LOT of time as you take on additional projects by making deployments automated and repeatable.
Original. Reposted with permission.
Bio: Luigi Patruno is a data scientist and machine learning consultant. He is currently the Director of Data Science at 2U, where he leads a team of data scientists responsible for building machine learning models and infrastructure. As a consultant, Luigi helps companies generate value by applying modern data science methods to strategic business and product initiatives. He founded MLinProduction.com to collect and share best practices for operationalizing machine learning and he's taught graduate courses in statistics, data analysis, and big data engineering.