# Autoscale Your Cluster with Cluster Autoscaler in AWS

> Configure Cluster Autoscaler for Talos Linux clusters running on AWS using Omni

This guide shows you how to enable automatic scaling for your Talos Linux cluster on AWS using Cluster Autoscaler and Omni.

## Prerequisites

Before you begin, you must have:

* AWS CLI configured
* `kubectl`, `talosctl`, `omnictl`, `helm`, and `jq` installed

## Step 1: Create IAM role for Cluster Autoscaler

Cluster Autoscaler uses the IAM role attached to the EC2 instances where it runs.

In this guide, Cluster Autoscaler is configured to run on the control plane nodes, so the IAM role must be attached to the control plane once it's created.

To create the IAM role and attach it to your control plane machines, you need:

* An IAM policy that defines the permissions required by Cluster Autoscaler
* An IAM role that uses the policy
* An instance profile that allows EC2 instances to assume the IAM role

### 1.1: Define environment variables

First, define the variables used throughout the IAM setup:

```bash
CLUSTER_NAME=cluster-autoscaler-aws
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
AUTOSCALER_ROLE_NAME="${CLUSTER_NAME}-autoscaler-role"
AUTOSCALER_POLICY_NAME="${CLUSTER_NAME}-ClusterAutoscalerPolicy"
AUTOSCALER_INSTANCE_PROFILE_NAME="${CLUSTER_NAME}-autoscaler-instance-profile"
```

### 1.2: Create IAM policy

Next, create an IAM policy that grants Cluster Autoscaler permission to:

* Adjust Auto Scaling Group capacity
* Discover tagged node groups
* Describe EC2 and ASG resources

The policy is scoped using AWS resource tags so it only manages Auto Scaling Groups associated with this cluster.
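A minimal sketch of such a policy, assuming the permission set commonly recommended for Cluster Autoscaler on AWS and the tag keys that this guide applies to the Auto Scaling Group in Step 6. The file name `cluster-autoscaler-policy.json` and the `Sid` values are illustrative assumptions, and the action list may need adjusting for your Cluster Autoscaler version:

```bash
# Assumption: CLUSTER_NAME is set as in step 1.1; the fallback mirrors that value.
CLUSTER_NAME="${CLUSTER_NAME:-cluster-autoscaler-aws}"

# Write the policy document. The heredoc expands ${CLUSTER_NAME} so the
# scale-up/scale-down actions only apply to Auto Scaling Groups tagged for this cluster.
cat <<EOF > cluster-autoscaler-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Describe",
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "autoscaling:DescribeTags",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:GetInstanceTypesFromInstanceRequirements"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ScaleTaggedGroups",
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/k8s.io/cluster-autoscaler/enabled": "true",
          "aws:ResourceTag/k8s.io/cluster-autoscaler/${CLUSTER_NAME}": "true"
        }
      }
    }
  ]
}
EOF
```

After reviewing the file, it can be registered under the name used later in this guide with `aws iam create-policy --policy-name "$AUTOSCALER_POLICY_NAME" --policy-document file://cluster-autoscaler-policy.json`.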
Create a trust policy that allows EC2 instances to assume the role:

```bash
cat <<EOF > trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
```

Create the IAM role using the trust policy:

```bash
aws iam create-role \
  --role-name $AUTOSCALER_ROLE_NAME \
  --assume-role-policy-document file://trust-policy.json
```

Attach the Cluster Autoscaler policy to the IAM role:

```bash
aws iam attach-role-policy \
  --role-name $AUTOSCALER_ROLE_NAME \
  --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/$AUTOSCALER_POLICY_NAME
```

Now create an instance profile so the role can be associated with EC2 instances:

```bash
aws iam create-instance-profile \
  --instance-profile-name $AUTOSCALER_INSTANCE_PROFILE_NAME

aws iam add-role-to-instance-profile \
  --instance-profile-name $AUTOSCALER_INSTANCE_PROFILE_NAME \
  --role-name $AUTOSCALER_ROLE_NAME

echo "Waiting for IAM instance profile propagation..."
sleep 20
```

## Step 2: Launch control plane

With IAM configured, you can now launch the control plane machines.

These control plane instances are not managed by an Auto Scaling Group. They are created manually and will run Cluster Autoscaler.

### 2.1: Define environment variables

Start by defining the AWS region, Talos version, architecture, instance type, and the number of control plane machines to create.

For high availability, we recommend creating three control plane machines.

```bash
AWS_REGION=$(aws configure get region)
TALOS_VERSION=v1.12.4
ARCH=amd64
INSTANCE_TYPE=t3.small
CONTROL_PLANE_NO=3
```

### 2.2: Retrieve the official Talos AMI

Fetch the Talos AWS AMI for your region and architecture from the official Talos release metadata.

If you need to customize your AMI, for example by adding custom labels or extensions, you must create your own AMI and bake those customizations into it.
For more information, refer to the [Register AWS Machines in Omni](../../omni-cluster-setup/registering-machines/how-to-register-an-aws-ec2-instance) documentation.

```bash
AMI=$(curl -sL https://github.com/siderolabs/talos/releases/download/${TALOS_VERSION}/cloud-images.json \
  | jq -r '.[] | select(.region == "'"$AWS_REGION"'") | select(.arch == "'"$ARCH"'") | .id')

echo "Using AMI: $AMI"
```

### 2.3: Generate control plane join configuration

Generate the join configuration that registers the Talos nodes with Omni on boot, then encode it for use as EC2 user data:

```bash
USER_DATA=$(omnictl jointoken machine-config)
# Strip newlines: GNU base64 wraps its output, which would break the launch template JSON in Step 6.
USER_DATA_B64=$(echo "$USER_DATA" | base64 | tr -d '\n')
```

### 2.4: Launch three control plane instances

Launch the control plane EC2 instances using:

* The Talos AMI
* The IAM instance profile created in **Step 1**
* The join configuration as user data

```bash
# The AWS CLI base64-encodes --user-data for run-instances, so pass the raw configuration here.
aws ec2 run-instances \
  --region $AWS_REGION \
  --image-id $AMI \
  --instance-type $INSTANCE_TYPE \
  --count $CONTROL_PLANE_NO \
  --iam-instance-profile Name=$AUTOSCALER_INSTANCE_PROFILE_NAME \
  --user-data "$USER_DATA" \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=role,Value=autoscaler-controlplane-machine}]'
```

After the instances are launched, they will appear under Machines in the Omni dashboard. From there, you can assign them to a cluster.

We do not recommend horizontally autoscaling control plane machines. If your control plane needs more capacity, scale vertically instead.

## Step 3: Create Machine Classes

A Machine Class defines a pool of infrastructure that Omni can use when creating cluster nodes.

In this step, you’ll create separate Machine Classes for the control plane and worker nodes.

### 3.1: Create the control plane Machine Class

To define a Machine Class for your control plane nodes:

1. Create the control plane machine class definition:

   ```bash
   cat <<EOF > controlplane-machine-class.yaml
   metadata:
     namespace: default
     type: MachineClasses.omni.sidero.dev
     id: cluster-autoscaler-controlplane
   spec:
     matchlabels:
       - omni.sidero.dev/platform = aws # Change the label to match your machine
   EOF
   ```

   This command creates a Machine Class named `cluster-autoscaler-controlplane` that matches machines labeled `omni.sidero.dev/platform = aws`.

   If you are using custom labels, or prefer to create a Machine Class based on a different machine label, replace `omni.sidero.dev/platform = aws` with your preferred label.

   The label you specify must already exist on the machines you want this Machine Class to match. In this example, the label is the default platform label that is automatically applied to machines created in AWS.

2. Apply the definition:

   ```bash
   omnictl apply -f controlplane-machine-class.yaml
   ```

3. Verify that it was created:

   ```bash
   omnictl get machineclasses
   ```

### 3.2: Create the worker Machine Class

Next, repeat the process for the worker nodes:

1. Create the worker machine class definition:

   ```bash
   cat <<EOF > worker-machine-class.yaml
   metadata:
     namespace: default
     type: MachineClasses.omni.sidero.dev
     id: cluster-autoscaler-worker
   spec:
     matchlabels:
       - omni.sidero.dev/platform = aws # Change the label to match your machine
   EOF
   ```

2. Apply the definition:

   ```bash
   omnictl apply -f worker-machine-class.yaml
   ```

3. Verify that it was created:

   ```bash
   omnictl get machineclasses
   ```

## Step 4: Create the cluster

Next, create a cluster that uses the Machine Classes you defined in Step 3.

To create a cluster:

1. Run this command to create a cluster template:

   ```bash
   cat <<EOF > cluster-template.yaml
   kind: Cluster
   name: $CLUSTER_NAME
   kubernetes:
     version: v1.34.1
   talos:
     version: ${TALOS_VERSION}
   ---
   kind: ControlPlane
   machineClass:
     name: cluster-autoscaler-controlplane
     size: 3
   ---
   kind: Workers
   machineClass:
     name: cluster-autoscaler-worker
     size: unlimited
   EOF
   ```

2. Apply the template:

   ```bash
   omnictl cluster template sync -f cluster-template.yaml
   ```

3. Download the cluster's `kubeconfig` once the cluster becomes healthy:

   ```bash
   omnictl kubeconfig -c $CLUSTER_NAME
   ```

4. Monitor your cluster status from your Omni dashboard or by running:

   ```bash
   kubectl get nodes --watch
   ```

## Step 5: Enable KubeSpan (required for hybrid or on-prem autoscaling)

If your autoscaled worker nodes are not launched in the same private AWS network as your control plane nodes (for example, in hybrid cloud or on-prem environments), you must enable KubeSpan.

KubeSpan creates an encrypted WireGuard mesh between cluster nodes. This allows nodes running in different networks to securely discover and communicate with each other.
To enable KubeSpan, add the following patch to the `Cluster` document section of your cluster template:

```yaml
patches:
  - name: kubespan-enabled
    inline:
      machine:
        network:
          kubespan:
            enabled: true
      cluster:
        discovery:
          enabled: true
```

Your cluster template should now look similar to this:

```yaml
kind: Cluster
name: $CLUSTER_NAME
kubernetes:
  version: v1.34.1
talos:
  version: ${TALOS_VERSION}
patches:
  - name: kubespan-enabled
    inline:
      machine:
        network:
          kubespan:
            enabled: true
      cluster:
        discovery:
          enabled: true
---
kind: ControlPlane
machineClass:
  name: cluster-autoscaler-controlplane
  size: 3
---
kind: Workers
machineClass:
  name: cluster-autoscaler-worker
  size: unlimited
```

Re-apply the template:

```bash
omnictl cluster template sync -f cluster-template.yaml
```

## Step 6: Create Launch Template and Auto Scaling Group (workers)

Cluster Autoscaler scales worker machines by adjusting the size of an AWS Auto Scaling Group (ASG).

To enable this, you need to create:

* A Launch Template, which defines how worker nodes are configured and launched
* An Auto Scaling Group, which uses the Launch Template to create and terminate worker nodes
* Tags, which allow Cluster Autoscaler to automatically discover and manage the Auto Scaling Group

The commands in this section use your Talos worker AMI and AWS networking configuration to create these resources.
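Because these commands reuse shell variables defined in earlier steps, a quick pre-flight check can catch an empty value before any AWS resource is created. The following is only a sketch: `require_vars` is a hypothetical helper written for this example, not part of the AWS CLI or `omnictl`.

```bash
# require_vars is a hypothetical helper for this guide.
# It reports every listed variable that is unset or empty on stderr
# and returns non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup that also works in plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Variables that the Launch Template in this step depends on.
if require_vars AMI INSTANCE_TYPE USER_DATA_B64 AUTOSCALER_INSTANCE_PROFILE_NAME; then
  echo "all launch template inputs are set"
else
  echo "define the missing variables before continuing" >&2
fi
```

If the check reports a missing name, re-run the step that defines it (for example, Step 2.2 for `AMI`) in the same shell session before continuing.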
### 6.1: Create Launch Template

The Launch Template defines which AMI and instance type your worker machines will use:

```bash
LAUNCH_TEMPLATE_NAME="talos-ca-launch-template"
AUTO_SCALING_GROUP_NAME="talos-ca-asg"

aws ec2 create-launch-template \
  --launch-template-name $LAUNCH_TEMPLATE_NAME \
  --launch-template-data "{
    \"ImageId\":\"$AMI\",
    \"InstanceType\":\"$INSTANCE_TYPE\",
    \"IamInstanceProfile\": { \"Name\": \"$AUTOSCALER_INSTANCE_PROFILE_NAME\" },
    \"UserData\":\"$USER_DATA_B64\"
  }"
```

### 6.2: Create Auto Scaling Group

Run these commands to create an Auto Scaling Group in the same VPC as the control plane:

```bash
# Read the VPC from the first matching control plane instance so VPC_ID holds a single value.
VPC_ID=$(aws ec2 describe-instances \
  --filters Name=tag:role,Values=autoscaler-controlplane-machine \
  --query "Reservations[0].Instances[0].VpcId" \
  --output text)

SUBNET_IDS=$(aws ec2 describe-subnets \
  --filters Name=vpc-id,Values=$VPC_ID \
  --query 'Subnets[*].SubnetId' \
  --output text | tr '\t' ',')

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name $AUTO_SCALING_GROUP_NAME \
  --launch-template LaunchTemplateName=$LAUNCH_TEMPLATE_NAME \
  --min-size 1 \
  --max-size 5 \
  --desired-capacity 1 \
  --vpc-zone-identifier "$SUBNET_IDS"
```

### 6.3: Tag the Auto Scaling Group for Cluster Autoscaler

These tags allow Cluster Autoscaler to discover and manage the node group:

```bash
aws autoscaling create-or-update-tags \
  --tags \
  ResourceId=$AUTO_SCALING_GROUP_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true \
  ResourceId=$AUTO_SCALING_GROUP_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/$CLUSTER_NAME,Value=true,PropagateAtLaunch=true
```

### 6.4: Verify the Auto Scaling Group created a worker node

Once the Auto Scaling Group is created, it automatically launches one worker machine to match its desired capacity.

To confirm that AWS created an instance:

```bash
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names $AUTO_SCALING_GROUP_NAME \
  --query 'AutoScalingGroups[0].Instances[*].InstanceId' \
  --output table
```

Then verify that the node joins your Kubernetes cluster:

```bash
kubectl get nodes --watch
```

## Step 7: Install Cluster Autoscaler

Cluster Autoscaler runs as a Kubernetes Deployment inside your cluster. It continuously watches for pods that cannot be scheduled and increases your Auto Scaling Group capacity when additional nodes are required.

Install Cluster Autoscaler using Helm and configure it to automatically discover and manage your AWS Auto Scaling Groups:

```bash
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update

helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  -n kube-system \
  --set cloudProvider=aws \
  --set awsRegion=$AWS_REGION \
  --set autoDiscovery.clusterName=$CLUSTER_NAME \
  --set rbac.create=true \
  --set nodeSelector."node-role\.kubernetes\.io/control-plane"="" \
  --set "tolerations[0].key=node-role.kubernetes.io/control-plane" \
  --set "tolerations[0].operator=Exists" \
  --set "tolerations[0].effect=NoSchedule"
```

## Step 8: Verify Cluster Autoscaler is working

Confirm that the Cluster Autoscaler pod is running:

```bash
kubectl -n kube-system get pods \
  -l "app.kubernetes.io/instance=cluster-autoscaler"
```

## Step 9: Test automatic scaling

Deploy a workload that requires additional capacity:

```yaml
cat <