The Absolute Basics of MLOps




Image by Author

 

This article is for people who don’t know a thing about MLOps or want to refresh their memory. You’ve probably come across MLOps while scrolling through LinkedIn, reading blogs, or browsing AI conference programs.

Let’s get started.

 

What is MLOps?

 

MLOps stands for Machine Learning Operations and is a combination of Machine Learning, DevOps, and Data Engineering. For the purposes of this article, I will define each.

Machine Learning allows models to learn and improve from past experience by exploring data and identifying patterns with little human intervention.

DevOps combines software development and IT operations practices to increase the efficiency, speed, and security of software delivery.

Data Engineering focuses on designing and building pipelines that transform and transport data into a format that other tech experts, such as Data Scientists, and other end users can work with.

These three disciplines come together to deploy and maintain machine learning systems in a reliable and efficient way.

 

What is the Aim of MLOps?

 

Working with models can get messy. Training runs can be time-consuming, coordinating with other members of your team can be difficult, and miscommunication can cause major problems. MLOps aims to reduce this friction in the following ways.

 

Brings Data Scientists into the Picture

 

Data Scientists sometimes don’t work with other tech experts on the team. Their roles and responsibilities differ, and they sometimes never need to communicate. However, as model development has matured, collaboration with Data Scientists has become imperative: they are responsible for curating and cleaning the datasets and gathering insight from them, which is then used to build AI models.

 

Collaboration

 

When all the tech experts on a team collaborate on a project, you will naturally see model development improve. Deployment gets faster, and model management and validation get better, all because different skills were brought together.

 

Manage your ML Lifecycle

 

With the right MLOps architecture in place, you and other experts on the team can track, version, re-use, and audit every asset in your machine learning model lifecycle. This not only makes the lifecycle more reliable and efficient but also provides transferable knowledge that can be applied to future projects.
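
To make tracking and versioning a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory `RunLog` class and `fingerprint` helper, both invented for this illustration and not taken from any MLOps tool, and the metric values are placeholders. The idea is simply that every training run is recorded together with a content hash of its data and parameters, so you can later audit which inputs produced which model.

```python
import hashlib
import json


def fingerprint(asset) -> str:
    """Return a short, deterministic content hash for a JSON-serializable asset."""
    payload = json.dumps(asset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


class RunLog:
    """Hypothetical in-memory log of training runs (illustration only)."""

    def __init__(self):
        self.runs = []

    def record(self, data, params, metrics):
        # Store versions (hashes) rather than the raw data itself.
        run = {
            "data_version": fingerprint(data),
            "params_version": fingerprint(params),
            "params": params,
            "metrics": metrics,
        }
        self.runs.append(run)
        return run


log = RunLog()
dataset = [1, 2, 3]  # placeholder data
# Metric values below are placeholders, not real results.
first = log.record(dataset, {"learning_rate": 0.1}, {"score": 0.0})
second = log.record(dataset, {"learning_rate": 0.01}, {"score": 0.0})

# Same data, different parameters: the hashes make that visible.
print(first["data_version"] == second["data_version"])      # True
print(first["params_version"] == second["params_version"])  # False
```

In a real project you would usually rely on a dedicated experiment tracking or data versioning tool rather than rolling your own, but the principle is the same.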

 

Phases of MLOps

 

The MLOps process includes three broad phases:

  1. Designing the ML-powered application
  2. ML Experimentation and Development
  3. ML Operations

 

Designing the ML-powered Application

 

This phase is the start of every project: understanding the problem you are trying to solve. During this phase, you build a better understanding of the business, then move on to understanding the data, and finally determine how to design the ML-powered application.

The key components for this phase of MLOps are:

  • Gathering your data
  • Data analysis
  • Data preparation
  • Model development 
  • Model training

During the design phase, you will look at what data you have available, its limitations, and the functions your ML model needs to perform. These act as the building blocks for the architecture of the ML application and bring you one step closer to solving your problem.

 

ML Experimentation and Development

 

Now we move on to the next phase, which focuses purely on verifying the validity of our ML application. It is advisable to build a Proof-of-Concept ML model during this stage. A Proof-of-Concept helps with the validation process by further examining scalability, technical abilities, limitations, etc.

The key components for this phase of MLOps are:

  • Model validation 
  • Model serving 
  • Model monitoring 
  • Model re-training

These components will help us identify which machine learning algorithm is most suitable for the problem at hand and stable enough to run smoothly in production.
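
As a rough illustration of model validation, the sketch below compares two candidate models on held-out data and keeps the one with the lower error. Everything here is an assumption made for the example: the toy data (generated from y = 2x + 1), the two candidates (a mean baseline and a least-squares line), and the helper names. It uses plain Python so it runs without any ML library.

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training labels."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Second candidate: ordinary least-squares line through the training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data that follows y = 2x + 1 (an assumption for this example).
xs = list(range(10))
ys = [2 * x + 1 for x in xs]

# Hold out the last three points for validation.
train_x, val_x = xs[:7], xs[7:]
train_y, val_y = ys[:7], ys[7:]

candidates = {"mean_baseline": fit_mean, "linear": fit_line}
scores = {
    name: mean_squared_error(fit(train_x, train_y), val_x, val_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
print(best)  # linear
```

The important part is not the models themselves but the habit: every candidate is scored on data it has not seen during training before anything moves toward production.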

 

ML Operations

 

The last phase is all about delivering the machine learning model into production. To do this, certain DevOps practices need to be established.

The key components for this phase of MLOps are:

  • Testing
  • Versioning
  • Monitoring
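
To give a feel for what monitoring can look like, here is a small sketch of a drift check. It flags a feature when the mean of recent production values moves too far away from the mean seen at training time. The function name `drift_detected`, the threshold of three standard errors, and the sample values are all assumptions made for this illustration; real monitoring setups typically use more robust statistical tests.

```python
from statistics import mean, stdev


def drift_detected(reference, live, threshold=3.0):
    """Return True when the live mean is more than `threshold` standard
    errors away from the reference (training-time) mean."""
    ref_mean = mean(reference)
    standard_error = stdev(reference) / len(live) ** 0.5
    return abs(mean(live) - ref_mean) > threshold * standard_error


# Illustrative values only.
training_values = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
similar_batch = [10, 9, 11, 10]
shifted_batch = [15, 16, 14, 15]

print(drift_detected(training_values, similar_batch))  # False
print(drift_detected(training_values, shifted_batch))  # True
```

A check like this could run on a schedule against fresh production data and raise an alert, or kick off re-training, when it returns True.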

 

Implementing MLOps

 

There are three levels at which you can implement MLOps:

  1. Manual process
  2. ML pipeline automation
  3. CI/CD pipeline automation

 

Manual Process

 

This process is entirely Data Scientist driven, and therefore manual. If your models are rarely changed or retrained, or you are just starting to implement ML, this process is probably for you.

It is a very experimental and iterative process due to its manual nature: every step, such as data transformation, validation, model testing, and training, is done by hand. The most typical tool used in manual processes is the Jupyter Notebook.

 

ML Pipeline Automation

 

At this level, the model is trained automatically rather than by manual execution. When new data becomes available, the pipeline detects it and triggers retraining of the model. This is known as continuous training.

This level suits operations that reside in a constantly changing environment and need their models to keep pace with those changes.
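
Continuous training can be sketched in a few lines. The example below assumes a hypothetical `ContinuousTrainer` class, invented for this illustration, that retrains whenever enough new rows have arrived since the last training run. The "model" is just the mean of the data, standing in for a real training job, and the row threshold is an arbitrary choice.

```python
class ContinuousTrainer:
    """Hypothetical trigger that retrains when enough new data has arrived."""

    def __init__(self, train_fn, min_new_rows=1):
        self.train_fn = train_fn
        self.min_new_rows = min_new_rows
        self.rows_at_last_training = 0
        self.model = None
        self.training_runs = 0

    def on_data_update(self, dataset):
        """Retrain if enough rows were added; return True when training ran."""
        new_rows = len(dataset) - self.rows_at_last_training
        if new_rows < self.min_new_rows:
            return False
        self.model = self.train_fn(dataset)
        self.rows_at_last_training = len(dataset)
        self.training_runs += 1
        return True


def train_mean_model(dataset):
    # Stand-in for a real training job: the "model" is the mean of the data.
    return sum(dataset) / len(dataset)


trainer = ContinuousTrainer(train_mean_model, min_new_rows=3)
data = [1.0, 2.0, 3.0]

trainer.on_data_update(data)  # 3 new rows: trains
data.append(4.0)
trainer.on_data_update(data)  # only 1 new row: skipped
data.extend([5.0, 6.0])
trainer.on_data_update(data)  # 3 new rows since last training: retrains

print(trainer.training_runs)  # 2
print(trainer.model)          # 3.5
```

In practice the trigger might be a scheduler, a new file landing in storage, or a monitoring alert, but the pattern of "detect new data, then retrain without a human in the loop" is the same.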

 

CI/CD Pipeline Automation

 

This level provides the fastest and most reliable ML model deployments, and it requires a robust automated CI/CD system. With this CI/CD system in place, your data scientists and other tech experts are free to dig deeper and explore new ideas around feature engineering, model architecture, and hyperparameters.

The difference between this level and the previous one is that the data, the machine learning model, and all of its training pipeline components are built, tested, and deployed automatically.
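
The build, test, and deploy sequence can be pictured as a chain of stages where a failure stops everything that follows. The sketch below is a toy version of that idea in plain Python. The stage functions, the `max_error` quality gate, and the sample labels are all hypothetical; a real setup would run these steps inside a CI/CD system rather than a single script.

```python
def run_pipeline(stages, candidate):
    """Run stages in order and stop at the first failure."""
    completed = []
    for name, stage in stages:
        ok, candidate = stage(candidate)
        if not ok:
            return {"deployed": False, "failed_at": name, "completed": completed}
        completed.append(name)
    return {"deployed": True, "failed_at": None, "completed": completed}


def build(candidate):
    # Stand-in for training: the "model" predicts the mean label.
    labels = candidate["labels"]
    model = {"prediction": sum(labels) / len(labels)}
    return True, {**candidate, "model": model}


def quality_gate(candidate):
    # Automated test: reject the model if its worst error is too large.
    prediction = candidate["model"]["prediction"]
    worst_error = max(abs(prediction - y) for y in candidate["labels"])
    return worst_error <= candidate["max_error"], candidate


def deploy(candidate):
    # Stand-in for a real deployment step.
    return True, {**candidate, "live": True}


STAGES = [("build", build), ("test", quality_gate), ("deploy", deploy)]

good = run_pipeline(STAGES, {"labels": [1.0, 1.5, 0.5], "max_error": 1.0})
bad = run_pipeline(STAGES, {"labels": [0.0, 10.0], "max_error": 1.0})

print(good["deployed"])  # True
print(bad["failed_at"])  # test
```

Because the gate runs automatically on every change, a model that fails its tests never reaches the deploy stage.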

 

Conclusion

 

This article aimed to help you understand the absolute basics of MLOps: what it is, why it is used, its key concepts, and how you can implement it. I hope this was a good and easy breakdown!

 
 
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice, tutorials, and theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence is benefiting, or can benefit, the longevity of human life. She is a keen learner, seeking to broaden her tech knowledge and writing skills while helping guide others.