If you followed this series, you have learned how to scale your applications using the Horizontal Pod Autoscaler (HPA) or KEDA (Kubernetes Event-driven Autoscaling). Both approaches can automatically scale your application out and in, but they share the same shortcoming: when the existing nodes have no resources left, no new pods can be scheduled.
Today, I would like to show you how to automatically scale your Azure Kubernetes Service (AKS) cluster by adding or removing nodes with the cluster autoscaler.
This post is part of “Microservice Series - From Zero to Hero”.
Working without Automatic Cluster Scaling
You can find the code of the demo on GitHub.
In Azure Kubernetes Service - Getting Started, I created a new Azure Kubernetes Service cluster with a single node. This cluster works fine if you ignore that a single node does not provide high availability, but since creating the cluster, I have added more and more applications. When I scale one of my applications to several pods, the node runs out of resources and Kubernetes can't schedule the new pods.
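A minimal sketch of how to reproduce this, assuming a deployment called kedademoapi in the kedademoapi-test namespace (both names are placeholders; use your own):

```bash
# Scale the demo deployment to more replicas than a single node can hold.
# The deployment name and namespace are placeholders.
kubectl scale deployment kedademoapi --replicas=10 -n kedademoapi-test
```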
To get the error message explaining why a pod can't be started, use the following command:
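```bash
# Describe the pending pod; the Events section at the bottom of the
# output shows why it could not be scheduled. The pod name and the
# kedademoapi-test namespace are examples.
kubectl describe pod kedademoapi-68b66664cb-jjhvg -n kedademoapi-test
```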
Replace kedademoapi-68b66664cb-jjhvg with the name of one of your pods that cannot be started and use the namespace where your pods are running. You will see the error message at the bottom of the output.
Verify your Worker Nodes
If you are using Azure Kubernetes Service, you have two options to verify how many worker nodes are running. First, open your AKS cluster in the Azure portal and navigate to the Node pools pane to see how many nodes are running at the moment. As you can see, my cluster has only one node:
The second option to verify the number of nodes is using the command line. Use the following command to display all your nodes:
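```bash
# List all worker nodes of the cluster
kubectl get nodes
```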
This command will display one worker node.
Configure AKS Cluster Autoscaler
In an earlier post, I created a YAML pipeline in Azure DevOps that creates my AKS cluster using the Azure CLI. The relevant step looks roughly as follows (the resource group and cluster names are placeholders):
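```bash
# Create a single-node AKS cluster with the cluster autoscaler enabled.
# MicroserviceDemo and microservice-aks are placeholder names.
az aks create \
  --resource-group MicroserviceDemo \
  --name microservice-aks \
  --node-count 1 \
  --generate-ssh-keys \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3
```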
The cluster autoscaler can be easily enabled using the --enable-cluster-autoscaler flag, and the minimum and maximum node count can be set with --min-count and --max-count.
The cluster autoscaler has a wide range of settings that can be configured using the --cluster-autoscaler-profile flag. For a full list of all attributes and their default values, see the official documentation. The default values are usually good, except that I would like to scale down faster. Therefore, I change two settings of the cluster autoscaler profile, as shown below.
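A sketch of the corresponding command, assuming the two settings are scale-down-delay-after-add and scale-down-unneeded-time (both default to 10 minutes); the resource group and cluster names are the same placeholders as above:

```bash
# Remove unneeded nodes after one minute instead of the default ten.
az aks update \
  --resource-group MicroserviceDemo \
  --name microservice-aks \
  --cluster-autoscaler-profile scale-down-delay-after-add=1m scale-down-unneeded-time=1m
```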
Test the Cluster Autoscaler
The cluster autoscaler detects the pods that cannot be scheduled and adds a new node to the cluster. Open the Node pools pane of your AKS cluster in the Azure portal and you will see that your cluster is now running two nodes.
The CLI also shows that your cluster now has two nodes:
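```bash
# The node list should now contain the newly added worker node
kubectl get nodes
```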
Use the following command to see that all pods got scheduled on one of the two nodes (replace kedademoapi-test with your K8s namespace):
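```bash
# -o wide adds a NODE column showing where each pod runs
kubectl get pods -n kedademoapi-test -o wide
```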
This command displays all your pods in the given namespace and shows on which node they are running.
Conclusion
Modern applications must react quickly to traffic spikes and scale out accordingly. This can be easily achieved using the Kubernetes Horizontal Pod Autoscaler or KEDA. These approaches only schedule more pods, though, and your cluster can easily run out of capacity on its worker nodes. The cluster autoscaler in Azure Kubernetes Service helps when the cluster runs out of resources by automatically adding new worker nodes to your cluster. Additionally, the cluster autoscaler removes underutilized nodes and therefore helps you keep costs to a minimum.
You can find the code of the demo on GitHub.
This post is part of “Microservice Series - From Zero to Hero”.