amdgpu driver at boot.
To make those GPUs available to Kubernetes workloads, you can deploy the ROCm GPU Operator.
This guide shows how to enable AMD GPU support on your Talos nodes, apply any tuning your hardware might need, and install ROCm inside your cluster.
Before You Begin
You’ll need:
- A Talos Linux cluster running v1.11 or later.
- At least one node with an AMD GPU.
- Basic familiarity with editing and applying Talos machine configuration.
- The following Talos system extensions:
  - siderolabs/amdgpu
  - siderolabs/amd-ucode
amdgpu driver included with Talos.
Some newer GPUs may require additional tuning, which we’ll cover later in this guide.
Enable AMD GPU support
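In practice, enabling support means getting the two extensions from the prerequisites onto the node's Talos image. The following is a minimal sketch, assuming the image is built from a Talos Image Factory schematic; the file name is hypothetical and the exact file used by this guide is not shown, so treat the layout as an assumption and adapt it to however you build images.

```yaml
# Hypothetical file name: amdgpu-schematic.yaml
# Assumption: the node image is built through Talos Image Factory.
customization:
  systemExtensions:
    officialExtensions:
      - siderolabs/amdgpu     # AMD GPU driver and firmware support
      - siderolabs/amd-ucode  # AMD CPU microcode updates
```

Extensions requested this way take effect once a node is installed or upgraded with an image built from the schematic.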
Enable AMD GPU firmware and driver support by patching your worker node configuration with the siderolabs/amdgpu and siderolabs/amd-ucode system extensions.
Optional: GPU tuning
Some hardware may require additional kernel arguments or memory tuning, particularly newly released or high-performance GPUs. On an AMD Ryzen AI Max+ 395 (Strix Halo) system, for example, the following kernel arguments are relevant:
- amd_iommu=off: Disables AMD IOMMU initialization. Useful when passthrough or PCI initialization causes issues.
- amdgpu.gttsize: Increases the GPU GTT memory size for workloads that allocate large buffers.
- ttm.pages_limit: Raises the TTM memory limit for large model workloads.
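As a rough sketch of how the three arguments above might be set together, the fragment below uses the extraKernelArgs field of a Talos Image Factory schematic. The numeric values are illustrative assumptions, not values from this guide: they assume a machine with 128 GiB of unified memory on which about 96 GiB should be addressable by the GPU. amdgpu.gttsize is expressed in MiB and ttm.pages_limit in 4 KiB pages, so both are derived from the same 96 GiB target.

```yaml
# Illustrative values only; size them to your own hardware.
# Assumption: kernel arguments are supplied through an Image Factory schematic.
customization:
  extraKernelArgs:
    - amd_iommu=off             # skip AMD IOMMU initialization
    - amdgpu.gttsize=98304      # GTT size in MiB: 96 * 1024
    - ttm.pages_limit=25165824  # limit in 4 KiB pages: 96 GiB / 4 KiB
```

Keeping the two memory values derived from one target avoids the case where the GTT size is raised but the TTM page limit still caps allocations below it.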
Deploy the ROCm GPU Operator
With GPU support enabled at the OS level, you can deploy the ROCm GPU Operator to surface GPU resources to Kubernetes workloads. Add the ROCm Helm repository and install the operator. In any command that targets a specific node, replace the <node-name> placeholder with the name of your node.
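Once the operator is running, one way to confirm that GPU resources really are available to workloads is to schedule a Pod that requests one. The manifest below is a hypothetical smoke test, not part of this guide: the Pod name, the container image, and the rocminfo command are assumptions, and it relies on the GPU being advertised under the amd.com/gpu resource name used by the AMD device plugin.

```yaml
# Hypothetical smoke-test Pod; name, image, and command are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: rocm-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: rocminfo
      image: rocm/rocm-terminal  # assumed ROCm image that ships rocminfo
      command: ["rocminfo"]
      resources:
        limits:
          amd.com/gpu: 1         # extended resource advertised for AMD GPUs
```

If the Pod stays in Pending with an insufficient amd.com/gpu message, the node is not advertising the resource yet, which points back to the operator or the OS-level setup.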
Troubleshooting
Issues can show up at different layers depending on your hardware, kernel version, or virtualization platform. The following sections outline the most common problems and how to diagnose them.
GPU not detected
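If the GPU does not appear:
- Confirm that the system extensions are installed (for example, with talosctl get extensions).
- Review the kernel logs for amdgpu messages (for example, with talosctl dmesg).
- Check that the GPU is visible on the PCI bus (for example, with talosctl get pcidevices).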
ROCm operator issues
If the operator fails to initialize, inspect the operator logs, then confirm that:
- System extensions are active.
- The GPU is visible in Talos.
- GPU firmware matches ROCm expectations.