Artificial Intelligence Engineering – The Microsoft Azure Solution (Part 2 of 3)
In part 1 of this series, we covered the challenges our client in the entertainment industry was facing for automating machine learning models. We designed an AI (Artificial Intelligence) engineering solution for them on Microsoft Azure to reduce the time to value of their ML (Machine Learning) models.
Having worked on operationalizing machine learning on various platforms we were well aware that AI Engineering or MLOps (Machine Learning Operations) is still young. Designing a complete automation architecture will pose challenges. We knew very well that the design process would not be straight forward and we would need to tap into various other Azure service offerings.
We started with sorting out the priorities for automation and at the most basic level we divided the approach into three parts
- The data processing pipelines
- The build and deployment process
- The training process
Our focus was to reduce the compounding inefficiencies imposed at each stage due to changes in real-world data, the rudimentary manual deployment methods and incompatibilities of the process, from data curation to development, deployment and lastly training.
For data processing pipelines. The team was already testing Microsoft Azure synapse pipelines and we noticed some minor shortcomings like lack of integration with GitHub or other repositories, template as in Azure Data Factory, Azure monitor integration and lack of data lineage. We recommended the use of Azure Synapse if the data pipeline processing was to be carried out separately from ML pipelines otherwise Azure Data factory was a better option for complete operationalization.
The build and release pipeline was predominantly developed done using Azure machine learning services and the benefits were obvious.
The team liked the use of pre-built machine learning algorithms to create baseline models, Hyperdrive usage for hyperparameter tuning and the option of comparing the performance of different models and experiment runs.
The flexibility Azure machine learning afforded for carrying out training over Azure Compute Instances with sample datasets created from Azure Synapse saved a lot of time.
We noticed a few shortcomings in the service, some of which were:
- Azure ML can just trigger its pipelines and is unable to provide a full cycle automation
- It can only trigger training with data drifts
To alleviate these issues we explored the option of using Microsoft Azure DevOps in conjunction with Microsoft Azure Machine Learning.
Integration of Azure DevOps services with Azure machine learning yielded some good results.
Azure DevOps could trigger data processing and machine learning pipelines successively order and could also trigger training upon code commits and data shifts.
We were able to configure alerts and approvals at each stage of the cycle so production release could be controlled. Also, updated models could be only be released if they outperformed the original production models.
It also proved to be easy to validate the attached training AKS cluster consumption and memory levels during heavy workload runs.
The team loved this approach because it provided them with the flexibility to work in collaboration over the entire machine learning iteration cycle. The data engineer was able to prepare data easily for the data scientist without disrupting the current production ETL processes.
The data scientist got a collaborative environment for development with the ability to track experiments and was able to validate models with real data in their own containerized environments. The operationalization streamlined deployment and expanded testing with greater visibility into the artifacts and logs resulting in repeatable processes and shorter deployment cycles.
Our take on Azure Machine Learning
Azure machine learning services offer the box services for a complete model pipeline creation, the experiment runs and training cluster allocation reducing the development cycle time for baseline models. It offers only basic machine learning models and needs to work on adding deep learning or advanced models into Azure ML. Currently, users can develop advanced machine learning models by using R or python which is integrated with Azure machine learning.
It may still be far from being the seamless, end-to-end machine learning platform but these shortcomings are compensated with Azure DevOps services integration. Only that integrating Azure DevOps could be an added complexity layer on top of Azure ML in case DevOps services are non-existing within organizations.
The future of Azure machine learning looks very promising, and Microsoft continuously keeps enhancing Azure machine learning services. We can’t wait to see a prefabricated environment with complete DevOps features built right into Azure ML. Another feature that we would love to have in Azure machine learning is the flexibility to trigger training with code commits and approvals, the addition of pre-built advanced machine learning models and the ability to provision and monitor hybrid clusters within Azure Machine learning.
In part 3 we will be discussing ML (Machine Learning) Automation solution on Google Cloud.