
Why use Azure Kubernetes Service with Prefect?

Prefect is a data automation tool for building, running, and monitoring dataflows (data pipelines), for example ETL jobs or machine learning model training. Prefect can run these flows on many different compute targets, such as Kubernetes, GCP Vertex AI, and AWS ECS. This article takes a deep dive into some of the benefits of running them on Azure Kubernetes Service (AKS) specifically.

Node pools: GPUs and high memory

Some workloads that you run with Prefect might have special hardware requirements. These tasks could include large in-memory data manipulations (such as manipulating large Pandas dataframes) or training machine learning models with GPUs. By default, an AKS cluster has one node pool in which every node is of the same VM type. However, you can add more node pools with different VM types to meet the varying hardware requirements of your tasks. You can apply taints to the node pools, and in the Job YAML for the KubernetesRun config (KubernetesJob in Prefect 2.0) you can add matching tolerations. A taint keeps pods without a matching toleration off the pool, and a toleration permits a flow's pod to be scheduled there; pairing the toleration with a node selector pins the flow to that pool. Together, these features allow you to specify which node pool a flow should run on, so you can configure hardware requirements on a flow-by-flow basis.
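
As a minimal sketch of the idea, the Python below builds a Job manifest as a plain dictionary and adds a toleration and a node selector to it. The taint `sku=gpu:NoSchedule`, the pool name `gpupool`, and the container image are hypothetical placeholders. The sketch also assumes that your nodes carry an `agentpool` label holding the pool name, which you can check with `kubectl get nodes --show-labels`.

```python
import copy
import json

# Hypothetical toleration: it must match the taint you put on your own node pool
# (here assumed to be sku=gpu:NoSchedule).
GPU_TOLERATION = {
    "key": "sku",
    "operator": "Equal",
    "value": "gpu",
    "effect": "NoSchedule",
}

# A bare-bones Kubernetes Job; the image name is a placeholder.
BASE_JOB = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "flow", "image": "prefecthq/prefect:latest"}],
            }
        }
    },
}


def target_node_pool(job, pool_name, toleration):
    """Return a copy of `job` that tolerates the pool's taint and selects the pool."""
    pinned = copy.deepcopy(job)
    pod_spec = pinned["spec"]["template"]["spec"]
    # The toleration lets the pod onto tainted nodes.
    pod_spec.setdefault("tolerations", []).append(toleration)
    # The node selector restricts the pod to nodes with this label
    # (assumption: nodes are labelled agentpool=<pool name>).
    pod_spec.setdefault("nodeSelector", {})["agentpool"] = pool_name
    return pinned


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be used as a job template file.
    print(json.dumps(target_node_pool(BASE_JOB, "gpupool", GPU_TOLERATION), indent=2))
```

The resulting manifest can then be supplied as the job template of the Kubernetes run configuration; how you pass it depends on your Prefect version.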

Node pools: Auto-scaling and scale-to-zero

Another benefit of node pools is that each pool can have its own autoscaling settings, so that only the VMs you actually need are running. AKS has two types of node pools: System and User.

System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and metrics-server. User node pools serve the primary purpose of hosting your application pods. However, application pods can be scheduled on system node pools if you wish to only have one pool in your AKS cluster. Every AKS cluster must contain at least one system node pool with at least one node.

I suggest that you use User node pools for your Prefect workloads because, unlike system node pools, they can be configured to autoscale down to zero nodes when there is no work to do. This lets you have very large VMs available in your cluster without paying for them when you don't need them.
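
To make this concrete, here is a small sketch that assembles an `az aks nodepool add` command for a User node pool with the cluster autoscaler enabled and a minimum of zero nodes. All resource names, the VM size, and the taint are hypothetical, and the initial node count is left to the CLI default. Treat the flag set as a starting point and check it against `az aks nodepool add --help` for your CLI version.

```python
import shlex


def nodepool_add_command(resource_group, cluster, pool, vm_size, max_nodes, taint=None):
    """Build the argument list for adding an autoscaling User node pool."""
    args = [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", pool,
        "--mode", "User",                # only User pools can scale to zero
        "--node-vm-size", vm_size,
        "--enable-cluster-autoscaler",
        "--min-count", "0",              # allow the pool to shrink to no nodes
        "--max-count", str(max_nodes),
    ]
    if taint:
        # Keep pods without a matching toleration off the expensive nodes.
        args += ["--node-taints", taint]
    return args


if __name__ == "__main__":
    # Hypothetical names; print the command instead of running it.
    cmd = nodepool_add_command(
        "my-resource-group", "my-aks-cluster", "gpupool",
        "Standard_NC6s_v3", 3, taint="sku=gpu:NoSchedule",
    )
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run; pass the list to `subprocess.run` once you are happy with it.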

Pod Identity with Managed Identity

Often, you will have to connect to cloud resources from your Prefect flows. If these are Azure-based resources that support Azure Active Directory, you will need an identity to connect with. Usually, you would use a service principal, but AKS has a concept called pod-managed identity that assigns a predefined managed identity to a pod at start-up. The benefit of this feature is that you don't need to supply any passwords, so there is no service principal password to store or rotate. You simply specify which identity to use in the Job YAML, and then give this identity the necessary permissions on the Azure resource to which you are connecting.
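
A minimal sketch of the Job YAML side, assuming the cluster uses AAD pod-managed identity, where a pod is matched to an identity binding through the `aadpodidbinding` label on the pod template. The selector value `prefect-flows` and the image are hypothetical, and the identity and its binding must already exist in the cluster.

```python
import copy

# A bare-bones Kubernetes Job; the image name is a placeholder.
JOB = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "flow", "image": "prefecthq/prefect:latest"}],
            }
        }
    },
}


def with_pod_identity(job, selector):
    """Return a copy of `job` whose pods carry the identity binding label."""
    labelled = copy.deepcopy(job)
    template = labelled["spec"]["template"]
    labels = template.setdefault("metadata", {}).setdefault("labels", {})
    # Assumption: `selector` equals the selector of an existing identity binding.
    labels["aadpodidbinding"] = selector
    return labelled


if __name__ == "__main__":
    job = with_pod_identity(JOB, "prefect-flows")
    print(job["spec"]["template"]["metadata"]["labels"])
```

Note that the label goes on the pod template, not on the Job itself, since it is the pod that receives the identity.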

Azure Key Vault

If you are using Prefect Cloud, you can take advantage of the built-in key store for secrets such as API keys. However, if you want extra functionality, such as giving access to non-Prefect users or hardening security with network restrictions, you can use Azure Key Vault. Using Azure Key Vault combines nicely with the pod identity so you can have password-less authentication when retrieving the secrets from the vault.
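
Inside a flow, password-less retrieval could look roughly like the sketch below, which uses the `azure-identity` and `azure-keyvault-secrets` packages. The vault URL and secret name are hypothetical, and it assumes the pod's identity has been granted permission to read secrets in that vault. `DefaultAzureCredential` tries several credential sources, managed identity among them, so no password appears in the code.

```python
def read_secret(vault_url, name, client=None):
    """Fetch one secret value from Azure Key Vault.

    `client` is optional so that a stand-in can be passed in tests;
    by default a real SecretClient is created.
    """
    if client is None:
        # Imported lazily so the function can be tested without the Azure SDK.
        from azure.identity import DefaultAzureCredential
        from azure.keyvault.secrets import SecretClient

        client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    return client.get_secret(name).value


if __name__ == "__main__":
    # Hypothetical vault and secret names; requires network access and permissions.
    api_key = read_secret("https://my-vault.vault.azure.net", "my-api-key")
    print("secret retrieved:", api_key is not None)
```

Because the same code falls back to other credential sources, it can also run on a developer machine that is logged in with the Azure CLI.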
