The Talos Linux discovery service allows nodes in a cluster to automatically find and identify each other. Without discovery, nodes have no built-in way to learn about other cluster members, including their IP addresses and connection endpoints. When discovery is enabled, this information is shared and kept up to date across all nodes. This allows Talos to form a cluster and, when enabled, establish encrypted KubeSpan tunnels and support KubePrism peer endpoint discovery. Discovery works through a registry, a backend that nodes publish their connection information to and read peer information from. Talos supports two registry types:
- Service registry: Nodes publish to and read from an external discovery service. This is enabled by default and does not depend on Kubernetes or etcd, so it continues to work even when Kubernetes is unavailable.
- Kubernetes registry: Nodes publish discovery data as annotations on Kubernetes Node resources. This is disabled by default.
Registries
By default, Talos uses the service registry. The Kubernetes registry is disabled by default. Peers are aggregated from all enabled registries.
To disable a registry, set disabled: true for that registry under cluster.discovery.registries in the cluster configuration.
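As an illustration, disabling the service registry might look like the fragment below. This is a minimal sketch that assumes the cluster.discovery section of the Talos machine configuration. Only the keys shown are relevant, and the rest of the configuration is omitted.

```yaml
# Sketch of a machine configuration fragment (other sections omitted).
cluster:
  discovery:
    enabled: true          # discovery as a whole stays on
    registries:
      service:
        disabled: true     # turn off only the service registry
```

With the service registry disabled and the Kubernetes registry left at its default, no registry remains active, so in practice this change is usually paired with enabling the other registry.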
Kubernetes registry
The Kubernetes registry stores discovery data as annotations on Kubernetes Node resources.
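Because this registry is disabled by default, using it means opting in explicitly. The fragment below is a minimal sketch that assumes the registry is configured under cluster.discovery.registries.kubernetes in the machine configuration. Check it against the configuration reference for the Talos version in use.

```yaml
# Sketch of a machine configuration fragment (other sections omitted).
cluster:
  discovery:
    enabled: true
    registries:
      kubernetes:
        disabled: false    # opt in to the Kubernetes registry
```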
Service registry
The service registry uses a public external discovery service to exchange encrypted information about cluster members. Sidero Labs maintains a public instance at https://discovery.talos.dev/. Organizations that require private infrastructure can self-host the discovery service under a commercial license.
Cluster members use a globally unique shared key to coordinate basic connection information: the set of possible endpoints (IP:port pairs). Talos refers to this as affiliate data. All affiliate data is encrypted by Talos Linux before being sent to the discovery service and can only be decrypted by cluster members. The discovery service never has access to the encryption key.
When KubeSpan is enabled, affiliate data also includes the node’s WireGuard public key.
- Affiliate data is encrypted with AES-GCM.
- Endpoint data is separately encrypted with AES in ECB mode, allowing endpoints from different sources to be deduplicated server-side.
What changes when discovery is disabled
Talos can operate with discovery disabled, but this affects several features and behaviours:
- KubeSpan and KubePrism require discovery and do not function correctly without it.
- Initial cluster bootstrap and recovery may take longer, as peer and control plane endpoints are not available from discovery.
- Endpoint resolution falls back to Kubernetes API availability, for example via kubectl get endpoints kubernetes, which requires a functioning API server and load balancer.
- Worker nodes become more dependent on control plane availability during failures, as they cannot rely on discovery registries to obtain peer endpoint information.
If https://discovery.talos.dev is unavailable, discovery can be disabled or replaced with a privately operated discovery service.
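Pointing nodes at a privately operated discovery service could be sketched as follows. The endpoint URL is a hypothetical placeholder, not a real service, and the fragment assumes the service registry accepts an endpoint key under cluster.discovery.registries.service.

```yaml
# Sketch of a machine configuration fragment (other sections omitted).
cluster:
  discovery:
    enabled: true
    registries:
      service:
        # Hypothetical URL of a self-hosted discovery service; replace with your own.
        endpoint: https://discovery.example.internal/
```

To turn discovery off entirely instead, the same section would set enabled: false, with the consequences listed above.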
Discovery service behaviour during outages
Talos nodes periodically refresh their discovery data to prevent it from expiring due to the TTL. The discovery service uses a hardcoded TTL of 30 minutes, which cannot be configured by users. As long as the discovery service is reachable, records are continuously renewed.
During a short outage, Talos uses its last known in-memory discovery state. Existing connections, including KubeSpan tunnels, continue to function using cached data. If a node reboots while the discovery service is unavailable, it loses all in-memory state and cannot publish its information or retrieve peer data until the service becomes available again.
If the outage exceeds the TTL, all discovery records expire. When the service comes back online, it may return an empty dataset. Nodes receiving this update drop their existing peer information, which can temporarily disrupt KubeSpan connectivity. Recovery is automatic: nodes republish their data, peer information is rebuilt, and connectivity is restored without manual intervention.
When KubeSpan is enabled, WireGuard keys are generated on boot and not persisted to disk. A rebooted node must publish its new public key via the discovery service before peers can establish tunnels to it.
Inspect discovery resources
Talos exposes three resources for inspecting the state of cluster discovery. Each represents a different stage of the membership process: a node starts as an identity, becomes an affiliate when it shares the cluster credentials, and becomes a member when it is confirmed to belong to the cluster.
Identities
Each node has a unique identity, a base62-encoded random 32-byte value, that serves as its Affiliate identifier. Base62 encoding allows the ID to be URL-safe without requiring URL-encoded base64.
To retrieve the local node’s identity, run talosctl get identities -o yaml.
The identity is stored in the STATE partition in node-identity.yaml and is preserved across reboots and upgrades, but regenerated if the node is reset.
Affiliates
An affiliate is a node that shares the same cluster ID and secret. Nodes with matching values are treated as potential cluster members and can exchange encrypted discovery data. Nodes from different clusters cannot see or decrypt each other’s affiliate data. Use this resource (talosctl get affiliates) to see which nodes the discovery registries are aware of.
The raw affiliates reported by each registry, before they are merged, are kept in the cluster-raw namespace (talosctl get affiliates --namespace=cluster-raw). Affiliates prefixed with k8s/ came from the Kubernetes registry and those prefixed with service/ came from the discovery service.