Before you begin
You’ll need the following to use HPA in your Talos cluster:
- A running Kubernetes cluster on Talos: If you don’t have one yet, see the Getting Started or Production Cluster guides to create a cluster.
- Metrics Server deployed: Make sure the Metrics Server is running in your cluster. See the Deploy Metrics Server guide for instructions.
Step 1: Deploy the workload
Start by deploying the sample workload that you’ll use for scaling, the example-workload Deployment.
Step 2: Deploy the Horizontal Pod Autoscaler
Next, create and apply the HPA configuration. If you’d like a deeper look at how HPAs work, check out the Kubernetes guide on Horizontal Pod Autoscaling.
This HPA scales the example-workload Deployment based on CPU usage.
It monitors the average CPU utilization across all pods and tries to keep it around 50%.
If CPU usage goes above that target, the HPA increases the number of replicas (up to 10).
If CPU usage drops below the target, the HPA scales down (but never below 1 replica).
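The two steps above can be sketched as a pair of manifests. This is a minimal illustration rather than the guide's exact configuration: the container image, labels, CPU request, and HPA object name are assumptions, while the target Deployment name, the 50% CPU target, and the 1 to 10 replica range come from the description above. The Deployment sets a CPU request because utilization is calculated as a percentage of the requested CPU.

```yaml
# Sketch of the Step 1 workload. The image, labels, and CPU request are
# assumptions for illustration, not values taken from the guide.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-workload
spec:
  selector:
    matchLabels:
      app: example-workload
  template:
    metadata:
      labels:
        app: example-workload
    spec:
      containers:
        - name: app
          image: nginx:stable   # placeholder image (assumption)
          resources:
            requests:
              cpu: 100m         # utilization is measured against this request
---
# Sketch of the Step 2 autoscaler, using the values described above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-workload        # hypothetical object name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-workload      # the Deployment to scale
  minReplicas: 1                # never scale below 1 replica
  maxReplicas: 10               # never scale above 10 replicas
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # average CPU target across all pods
```

Saved to a file such as hpa-example.yaml (a hypothetical name), both objects can be created with `kubectl apply -f hpa-example.yaml`, and `kubectl get hpa` then shows current versus target utilization. Kubernetes documents the replica calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), so 2 replicas averaging 100% CPU against the 50% target would scale to ceil(2 × 100 / 50) = 4 replicas.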