Setting up an MLOps pipeline with Microsoft Azure

How to build an end-to-end CI/CD pipeline for your ML workflows by leveraging Microsoft’s Azure Machine Learning platform

Jamshed Khan
Heartbeat


With ML as a Service (MLaaS) platforms increasingly becoming commonplace rather than the novelty they once were, it has become easier for smaller enterprises and individual developers to build and deploy productive ML solutions without dedicating a team of data scientists or machine learning engineers to the task.

In this article, let’s take a look at Microsoft’s Azure Machine Learning service and its wide range of integrated tools and solutions¹ that can help you or your team develop and deploy the perfect ML workflows.

Introduction

MLOps is the collection of development practices that aim to deploy and maintain machine learning models in a production setting efficiently and reliably. Just as DevOps aims to shorten a software system’s development life cycle and provide continuous delivery with high software quality, MLOps is the parallel of DevOps in the ML world², meaning that the software system here has one or more machine-learning-oriented components.

An efficiently implemented DevOps cycle enables a team of developers to deploy new features, and thus finish projects, faster while eliminating drawbacks such as manual tasks, inability to test, and risky deployments. DevOps integrates development and operations by using CI/CD (continuous integration/continuous delivery) workflows that facilitate fast deployment times by continuously delivering code into production, ensuring an ongoing flow of new features and bug fixes via efficient delivery methods.

In the CI part, incremental code changes are made frequently and reliably, while automated build-and-test workflows ensure that code changes being merged into a repo are sound. The production code is then delivered quickly and seamlessly as part of the CD process. A CI/CD workflow can be something as simple as a code review done between teams to enable collaboration³.
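As an illustration, a minimal CI workflow of this kind might be sketched in Azure Pipelines YAML along these lines (the trigger branch, build agent image, and script steps here are illustrative assumptions, not a prescribed configuration):

```yaml
# Illustrative CI sketch: run on every push to main,
# install dependencies, then run the test suite.
trigger:
  - main

pool:
  vmImage: ubuntu-latest

steps:
  - script: pip install -r requirements.txt
    displayName: Install dependencies
  - script: pytest tests/
    displayName: Run unit tests
```

If the tests pass, the CD half of the workflow can pick up the build and release it to production.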


In contrast, an MLOps pipeline usually integrates additional steps that are concerned with the training and testing of a machine learning model such as:

· Data collection

· Feature engineering

· Model training and optimization

· Endpoint deployment and monitoring

However, the basic traits of a DevOps pipeline can be applied to any machine learning workflow. One subtle difference is that validation checks the performance of the model, in terms of metrics such as accuracy, rather than checking the software for bugs.

There are more moving parts for developers to keep track of, and they are mostly associated with the underlying data. The architecture of an MLOps system usually includes data science tools or analytical engines that perform ML computations⁴, orchestrating the movement of machine learning models, data, and outcomes between systems.


“While code is carefully crafted in a controlled development environment, data comes from that unending entropy source known as ‘the real world.’” ⁵

Although it’s easy to say that MLOps is essentially a specific implementation of DevOps oriented towards machine learning workflows, it is still important to recognize the vast gap between the two. Unlike a traditional DevOps pipeline, which deals mainly with carefully structured and curated code, in an ML-infused world we also have the data component to deal with, and we all know how messy that can get.

Azure Machine Learning

Azure Machine Learning, or Azure ML, is a cloud-based service by Microsoft that gives developers the tools and resources for creating and managing machine learning solutions⁶. Touted as an “enterprise-grade machine learning (ML) service for the end-to-end ML lifecycle,” it forms a significant part of Microsoft’s Azure cloud, offering data scientists and ML engineers a highly integrated development environment with a variety of tools to build, deploy, and manage ML models.

Some of the special capabilities you can leverage in Azure are:

Integrated notebooks — The latest release brings features such as compute and kernel switching, offline notebook editing, and instant Git source control.

Drag-and-drop machine learning — An easy-to-use UI helps with activities such as data transformation, model training, and evaluation for easy creation and publishing of machine learning pipelines.

Automation of ML workflows — Use model interpretability to understand how a model was built and reverse engineer the ML lifecycle if required.

Scalable reinforcement learning — Scale reinforcement learning to powerful compute clusters, support multi-agent scenarios, and access open-source reinforcement learning algorithms, frameworks, and environments.

Git and GitHub — Use Git integration to track work and GitHub Actions support to implement ML workflows.

Autoscaling compute — Use scalable compute to distribute training and to rapidly test, validate, and deploy models, while sharing CPU and GPU clusters across a workspace to automatically scale to your machine learning needs.

Hybrid and multicloud support — Run machine learning on existing Kubernetes clusters on-premises, in multicloud environments, and at the edge with tools such as Azure Arc.

Enterprise-grade security — Build and deploy models more securely with network isolation and end-to-end private IP capabilities, role-based access control for resources and actions, custom roles, and managed identity for compute resources.

Source: Microsoft

MLOps in Azure Machine Learning

Azure Machine Learning makes use of multiple ML pipelines to stitch together all the steps involved in your model training process. An ML pipeline can contain any number of steps from data preparation to feature extraction to hyperparameter tuning to model evaluation. Azure Machine Learning also supports reusable software environments that allow you to track and reproduce your projects’ software dependencies as they evolve. These also ensure that builds are reproducible without manual software configurations.
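The idea of composing a pipeline from named, reusable steps can be sketched in plain Python (a framework-agnostic toy, not the Azure ML SDK; the step functions and the shared context dictionary are invented for illustration):

```python
from typing import Callable, Dict, List

def prepare_data(ctx: Dict) -> Dict:
    # Normalize raw values to the [0, 1] range.
    raw = ctx["raw"]
    lo, hi = min(raw), max(raw)
    ctx["features"] = [(x - lo) / (hi - lo) for x in raw]
    return ctx

def train_model(ctx: Dict) -> Dict:
    # Stand-in "training": the model is just the mean of the features.
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])
    return ctx

def run_pipeline(steps: List[Callable[[Dict], Dict]], ctx: Dict) -> Dict:
    # Each step receives the shared context and passes its outputs along.
    for step in steps:
        ctx = step(ctx)
    return ctx

result = run_pipeline([prepare_data, train_model], {"raw": [2.0, 4.0, 6.0]})
print(result["model"])  # 0.5
```

In Azure ML, each such step would instead run as a tracked, independently reusable unit on managed compute, which is what makes the pipeline repeatable across runs.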

Azure ML introduces model registration and allows you to store and version your models in the Azure cloud, in your workspace. A registered model is a logical container for one or more files that make up your model. For example, if you have a model that is stored in multiple files, you can register them as a single model in your Azure Machine Learning workspace. After registration, you can then download or deploy the registered model and receive all the files that were registered.

Azure also supports model profiling that helps you to predetermine the CPU and memory requirements of the service that will be created when you deploy your model. Profiling tests the service that runs your model and returns information such as the CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage.
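The spirit of that recommendation step can be imagined with a small sketch (hypothetical logic; the actual profiler is a managed Azure service, and the headroom factor here is an invented assumption):

```python
def recommend_resources(cpu_samples, mem_samples_mb, headroom=1.2):
    """Recommend CPU cores and memory (MB) from observed usage samples,
    adding a safety headroom above the peak observation."""
    peak_cpu = max(cpu_samples)
    peak_mem = max(mem_samples_mb)
    return {
        "cpu_cores": round(peak_cpu * headroom, 2),
        "memory_mb": int(peak_mem * headroom),
    }

# Samples gathered while test requests ran against the model service.
rec = recommend_resources([0.4, 0.55, 0.5], [600, 720, 680])
print(rec)  # {'cpu_cores': 0.66, 'memory_mb': 864}
```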

Let’s take a look at some standard capabilities that Azure provides for MLOps:

Reusable ML pipelines — Machine learning pipelines allow you to define repeatable and reusable steps for processes such as data preparation, training, and scoring.

Reusable software environments — Track and reproduce the software dependencies used for training and deploying models.

Model portability — You can register, package, and deploy models from anywhere, i.e., device- and platform-independent ML. You can also track the associated metadata required to use the model.

Notify and alert on events in the ML lifecycle — For example, experiment completion, model registration, model deployment, and data drift detection.

Monitor ML applications for operational and ML-related issues — You can compare model inputs between training and inference, explore model-specific metrics, and provide monitoring and alerts on your ML infrastructure.
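Comparing model inputs between training and inference can be as simple as tracking how far live feature statistics drift from the training baseline. A toy sketch (the mean-shift ratio and the threshold are illustrative stand-ins for the drift metrics a real monitoring service computes):

```python
def mean_shift_ratio(train_values, live_values):
    """Relative shift of the live mean away from the training mean."""
    train_mean = sum(train_values) / len(train_values)
    live_mean = sum(live_values) / len(live_values)
    return abs(live_mean - train_mean) / abs(train_mean)

def drifted(train_values, live_values, threshold=0.25):
    # Flag the feature when its live distribution moves too far.
    return mean_shift_ratio(train_values, live_values) > threshold

print(drifted([10, 12, 11, 13], [11, 12, 12]))  # False: inputs look similar
print(drifted([10, 12, 11, 13], [18, 20, 19]))  # True: inputs have shifted
```

A monitoring service would run checks like this continuously and raise an alert, or trigger retraining, when they fire.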

Automate the end-to-end ML lifecycle with Azure Machine Learning and Azure Pipelines — Using pipelines allows you to frequently update models, test new models, and continuously roll out new ML models alongside your other applications and services.

The full-scale MLOps cycle (Source: Microsoft)

Building your own E2E pipeline with CI/CD using Azure ML

Let’s go over a simple tutorial where you can learn how to use a CI/CD workflow to build your very own ML model⁷. This tutorial will cover two basic parts of an MLOps cycle:

  • Creating an Azure Machine Learning workspace in Azure
  • Setting up an end-to-end demo pipeline in Azure DevOps

To get started, head over to the Azure ML GitHub repository and fork it into your own account. This will allow you to experiment and make changes.


Step 1: Creating an Azure Machine Learning workspace

A workspace can be thought of as a container for Azure ML assets such as compute, storage, data, models, etc. If you already have an Azure account, you can create a new workspace by heading over to the creation page in the Azure Portal. Feel free to pick a region and workspace edition of your choice.

Step 2: Create and set up a new project in Azure DevOps

Navigate to Azure DevOps and sign in. If it’s your first time, you might be asked to create a new organization. Once that’s done, create a new project and give it a name of your choice:

Next, you need to connect Azure DevOps to your Azure subscription, or more precisely, to your Machine Learning workspace’s resource group. To do this, go to Project Settings and select Service Connections:

From there, create a new Service Connection of type Azure Resource Manager:

Lastly, give it a name (the repo uses azmldemows as default), and enter the details of your resource:

Step 3: Importing YAML pipeline to Azure DevOps

Next, select Pipelines and create a new pipeline. Select that you want to access code from GitHub:

Select the repository you forked earlier (it should show as <your github username>/pipelines-azureml). Next, point the new pipeline to an existing pipeline YAML in your GitHub repo by selecting where the pipeline definition should come from. In this example, it points to pipelines/diabetes-train-and-deploy.yml:

Final Step: Review the pipeline

Take a look at the variables (Service connection, Machine Learning workspace, etc.) and make sure they are named correctly. Once the configuration looks right, give it a test run. By clicking on the pipeline job, you can retrieve more details:

And that’s it! Your first model should be training by now. Congratulations on building and running your ML pipeline with Azure.

MLOps Best Practices

Just like with DevOps, it is a good idea to follow certain best practices when using MLOps to deploy into a production environment. Some of these are much the same as in a standard DevOps workflow, whereas others are more machine learning and data specific:

  • Data scientists should work in branches pulled from a master branch.
  • When code is pushed to the Git repo, trigger a CI (continuous integration) pipeline.
  • On the first run: provision infrastructure as code (ML workspace, compute targets, datastores).
  • For new code: every time new code is committed to the repo, run unit tests, perform data quality checks, and train the model.
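A data quality check in such a CI run can be a small gate that fails the build when the data no longer looks the way training expects. A minimal sketch (the column names, sample rows, and null threshold are invented for illustration):

```python
def check_data_quality(rows, required_cols, max_null_frac=0.1):
    """Lightweight data quality gate for CI: verify that the expected
    columns exist and that nulls stay below a threshold."""
    problems = []
    for col in required_cols:
        if not all(col in row for row in rows):
            problems.append(f"missing column: {col}")
            continue
        nulls = sum(1 for row in rows if row[col] is None)
        if nulls / len(rows) > max_null_frac:
            problems.append(f"too many nulls in: {col}")
    return problems

rows = [
    {"age": 54, "glucose": 148},
    {"age": 31, "glucose": None},
    {"age": 47, "glucose": 90},
]
print(check_data_quality(rows, ["age", "glucose"]))  # ['too many nulls in: glucose']
```

In the CI pipeline, a non-empty problem list would fail the run before any training compute is spent.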

Some ML-specific steps in your CI process should be:

  • Training a model — run training code / algorithm and output a model file which is stored in the run history.
  • Evaluating a model — compare the performance of the newly trained model with the model in production. If the new model performs better than the production model, a new pipeline can be deployed. If not, this step can be skipped.
  • Registering a model — take the best model and register it with the Azure ML Model registry. This allows us to version control it.
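The evaluate-and-register gate above can be sketched in a few lines of Python (a toy stand-in: the dict-based registry and the accuracy comparison are illustrative, not the Azure ML Model registry API):

```python
# A plain dict standing in for a real model registry.
registry = {}

def register_model(name, model, accuracy, registry=registry):
    # Each registration appends a new, auto-incremented version.
    versions = registry.setdefault(name, [])
    versions.append({"version": len(versions) + 1,
                     "model": model, "accuracy": accuracy})
    return versions[-1]["version"]

def promote_if_better(name, new_model, new_acc, prod_acc):
    # Only register (and later deploy) models that beat production.
    if new_acc > prod_acc:
        return register_model(name, new_model, new_acc)
    return None  # keep the production model; skip deployment

v = promote_if_better("diabetes-clf", "model.pkl", new_acc=0.87, prod_acc=0.83)
print(v)  # 1 (first registered version)
print(promote_if_better("diabetes-clf", "model.pkl", new_acc=0.80, prod_acc=0.87))  # None
```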

Conclusion

MLOps empowers data scientists and ML engineers to bring models into production, and enables you to track, version, audit, refine, and reuse every piece of your ML lifecycle while using tools and technologies that streamline the process. MLOps takes the concept of a DevOps pipeline and applies it to the machine learning lifecycle to make the process more productive.

You can also get started with Comet’s MLOps platform for free! With our recently launched Model Production Monitoring, Comet Artifacts, and Panels, you can monitor, reproduce, and collaborate on models at all stages of production.

References for this post:

1. Azure Machine-learning operations (MLOps): https://azure.microsoft.com/en-ca/services/machine-learning/mlops/

2. Exploring DevOps And Its Parallels To MLOps: https://datainsightgroup.ca/history-of-devops-and-its-parallels-to-mlops/

3. Enabling CI/CD for Machine Learning project with Azure Pipelines: https://www.azuredevopslabs.com/labs/vstsextend/aml/

4. MLRun: Introduction to MLOps framework: https://www.analyticsvidhya.com/blog/2021/07/mlrun-introduction-to-mlops-framework/

5. ML Ops: Machine Learning as an Engineering Discipline: https://laptrinhx.com/ml-ops-machine-learning-as-an-engineering-discipline-1542524926/

6. Compare the machine learning products and technologies from Microsoft: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/data-science-and-machine-learning

7. MLOps: Model management, deployment, lineage, and monitoring with Azure Machine Learning: https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletters (Deep Learning Weekly and the Comet Newsletter), join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.
