Published: May 28, 2023

What is MLOps?

An intro to Machine Learning Operations


Just like software engineering and DevOps, building machine learning systems is a process. Once you progress past trivial or academic uses of ML models and into real production use, deploying, updating, maintaining, and monitoring these models becomes necessary. The people, processes, and technology involved in those steps are referred to as Machine Learning Operations, or MLOps. If “deploying, updating, maintaining, and monitoring” sounds similar to DevOps, that's because it is! In fact, Microsoft defines MLOps as:

the application of DevOps principles to AI-infused applications

Purpose

The purpose of an operational process (like a Software Development Lifecycle, or DevOps) is to make creating and updating software (or in this case ML models) easier, faster, more resilient, and more secure. Software is rarely created once and never touched again, and ML models should not be deployed once and never updated either. In each case, environments change, businesses change, data changes, and our systems must change as well. In most cases, these changes are made by a team rather than an individual, so having a process in place to manage change and produce high-quality results is key.

Like software engineering, creating ML systems is complex:

  • Data needs to be gathered, analyzed, and prepared.
  • Models need to be developed, trained, and evaluated.
  • Models need to be deployed, monitored, and updated.
  • Software systems need to USE the models.

There are:

  • Data Engineers
  • Data Scientists
  • ML Engineers
  • Software Engineers

all working at different points in different processes. Operational consideration NEEDS to be applied to keep that work coordinated.

Process

Just like in DevOps, automated processes, or pipelines, are used to reduce manual effort and produce repeatable results. These pipelines are often some combination of cloud resources, configurations, and code. In MLOps, pipelines can process data, extract features, tune hyperparameters, evaluate models, deploy the model upon completion, and monitor/alert throughout.
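
To make the idea of a pipeline concrete, here is a minimal sketch in plain Python. It is not how any particular cloud service works. The stage names, the Context class, the toy threshold "model", and the MIN_ACCURACY deployment gate are all hypothetical, chosen only to mirror the steps listed above.

```python
from dataclasses import dataclass, field

MIN_ACCURACY = 0.9  # hypothetical quality gate for deployment


@dataclass
class Context:
    """State handed from stage to stage (all fields are illustrative)."""
    raw: list = field(default_factory=list)
    features: list = field(default_factory=list)
    labels: list = field(default_factory=list)
    threshold: float = 0.0
    accuracy: float = 0.0
    deployed: bool = False
    log: list = field(default_factory=list)


def score(threshold, features, labels):
    """Accuracy of the toy model: predict 1 when the feature >= threshold."""
    hits = sum((x >= threshold) == bool(y) for x, y in zip(features, labels))
    return hits / len(labels)


def process_data(ctx):
    # Drop records whose input value is missing.
    ctx.raw = [row for row in ctx.raw if row[0] is not None]
    return ctx


def extract_features(ctx):
    ctx.features = [float(x) for x, _ in ctx.raw]
    ctx.labels = [y for _, y in ctx.raw]
    return ctx


def tune_hyperparameters(ctx):
    # Try every observed value as the threshold and keep the best one.
    candidates = sorted(set(ctx.features))
    ctx.threshold = max(
        candidates, key=lambda t: score(t, ctx.features, ctx.labels)
    )
    return ctx


def evaluate(ctx):
    # A real pipeline would score on held-out data, not the training data.
    ctx.accuracy = score(ctx.threshold, ctx.features, ctx.labels)
    return ctx


def deploy(ctx):
    # Only "deploy" when the model clears the quality gate.
    ctx.deployed = ctx.accuracy >= MIN_ACCURACY
    return ctx


STAGES = [process_data, extract_features, tune_hyperparameters, evaluate, deploy]


def run_pipeline(raw):
    ctx = Context(raw=list(raw))
    for stage in STAGES:
        ctx = stage(ctx)
        ctx.log.append(stage.__name__)  # stand-in for monitoring/alerting
    return ctx


if __name__ == "__main__":
    result = run_pipeline([(1, 0), (2, 0), (None, 1), (3, 1), (4, 1)])
    print(result.threshold, result.accuracy, result.deployed)
```

Because every stage takes and returns the same object, stages can be added, removed, or reordered without touching the runner, and the same input always produces the same result.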

In future blog posts, I will cover pipelines and the Python code to build them, cloud resources like SageMaker and Azure Machine Learning, and even IaC tools to create the resources.

Outcome

DevOps is employed to increase the [security, resiliency, speed, scale, …] of software. With good MLOps, models can be effectively created, efficiently scaled, and securely used, resulting in better application of ML in software.