Realizing Machine Learning anywhere with Azure Kubernetes Service and Arc-enabled Machine Learning – Microsoft


We are thrilled to announce the general availability of Azure Machine Learning (Azure ML) Kubernetes compute, including support of seamless Azure Kubernetes Service (AKS) integration and Azure Arc-enabled Machine Learning.


With a simple cluster extension deployment on AKS or Azure Arc-enabled Kubernetes (Arc Kubernetes) cluster, Kubernetes cluster is seamlessly supported in Azure ML to run training or inference workload. In addition, Azure ML service capabilities for streamlining full ML lifecycle and automation with MLOps become instantly available to enterprise teams of professionals. Azure ML Kubernetes compute empowers enterprises ML operationalization at scale across different infrastructures and addresses different needs with seamless experience of Azure ML CLI v2, Python SDK v2 (preview), and Studio UI. Here are some of the capabilities that customers can benefit

  • Deploy ML workload on customer managed AKS cluster and gain more security and controls to meet compliance requirements.
  • Run Azure ML workload on Arc Kubernetes cluster right where data lives and meets data residency, security, and privacy compliance, or harness existing IT investment.
  • Use Arc Kubernetes cluster to deploy ML workload or aspect of ML lifecycle across multiple public clouds.
  • Fully automated hybrid workload in cloud and on-premises to leverage different infrastructure advantages and IT investments.



The IT-operations team and data-science team are both integral parts of the broader ML team. By letting the IT-operations team manage Kubernetes compute setup, Azure ML creates a seamless compute experience for data-science team who does not need to learn or use Kubernetes directly. The design for Azure ML Kubernetes compute also helps IT-operations team leverage native Kubernetes concepts such as namespace, node selector, and resource requests/limits for ML compute utilization and optimization. Data-science team now can focus on models and work with productivity tools such as Azure ML CLI v2, Python SDK v2, Studio UI, and Jupyter notebook.


It is easy to enable and use an existing Kubernetes cluster for Azure ML workload with the following simple steps:



IT-operation team. The IT-operation team is responsible for the first 3 steps above: prepare an AKS or Arc Kubernetes cluster, deploy Azure ML cluster extension, and attach Kubernetes cluster to Azure ML workspace. In addition to these essential compute setup steps, IT-operation team also uses familiar tools such as Azure CLI or kubectl to take care of the following tasks for the data-science team:

  • Network and security configurations, such as outbound proxy server connection or Azure firewall configuration, Azure ML inference router (azureml-fe) setup, SSL/TLS termination, and no-public IP with VNET.
  • Create and manage instance types for different ML workload scenarios and gain efficient compute resource utilization.
  • Trouble shooting workload issues related to Kubernetes cluster.


Data-science team. Once the IT-operations team finishes compute setup and compute target(s) creation, data-science team can discover list of available compute targets and instance types in Azure ML workspace to be used for training or inference workload. Data science specifies compute target name and instance type name using their preferred tools or APIs such as Azure ML CLI v2, Python SDK v2, or Studio UI.




Separation of responsibilities between the IT-operations team and data-science team. As we mentioned above, managing …….


Posted on

Leave a Reply

Your email address will not be published. Required fields are marked *