Registries
Discovery works through a registry, a backend that nodes publish their connection information to and read peer information from. Talos supports two registry types:
- Service registry: Nodes publish to and read from an external discovery service. This is configured with a DiscoveryServiceConfig document, enabled by default, and does not depend on Kubernetes or etcd, so it continues to work even when Kubernetes is unavailable.
- Kubernetes registry: Nodes publish discovery data as annotations on Kubernetes Node resources. This is deprecated and disabled by default.
Service registry
The service registry uses a public external discovery service to exchange encrypted information about cluster members. Sidero Labs maintains a public instance at https://discovery.talos.dev/. Organizations that require private infrastructure can self-host the discovery service under a commercial license.
Cluster members use a globally unique shared key to coordinate basic connection information: the set of possible endpoints (IP:port pairs). Talos refers to this as affiliate data. All affiliate data is encrypted by Talos Linux before being sent to the discovery service and can only be decrypted by cluster members. The discovery service never has access to the encryption key.
When KubeSpan is enabled, affiliate data also includes the node’s WireGuard
public key.
- Affiliate data is encrypted with AES-GCM encryption.
- Endpoint data is separately encrypted with AES in ECB mode, allowing endpoints from different sources to be deduplicated server-side.
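Why a deterministic scheme helps with server-side deduplication can be shown with a small sketch. This is not the Talos implementation: Python's standard library has no AES, so HMAC-SHA256 stands in for a deterministic transform (the property AES in ECB mode provides above), and a random nonce mixed into the input stands in for a randomized one (the property AES-GCM provides). The key, the endpoint, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def deterministic_token(key: bytes, endpoint: str) -> bytes:
    """Same key and endpoint always give the same token (ECB-like property)."""
    return hmac.new(key, endpoint.encode(), hashlib.sha256).digest()


def randomized_token(key: bytes, endpoint: str) -> bytes:
    """A fresh nonce makes every token different (GCM-like property)."""
    nonce = secrets.token_bytes(12)
    return nonce + hmac.new(key, nonce + endpoint.encode(), hashlib.sha256).digest()


key = secrets.token_bytes(32)   # hypothetical cluster key, never sent to the server
endpoint = "192.0.2.10:51820"   # made-up endpoint reported by two different sources

# The server only ever stores opaque tokens; it cannot read them.
server_deterministic = {deterministic_token(key, endpoint) for _ in range(2)}
server_randomized = {randomized_token(key, endpoint) for _ in range(2)}

print(len(server_deterministic))  # identical tokens collapse into one entry
print(len(server_randomized))     # differing tokens cannot be deduplicated
```

The server can collapse equal tokens without learning the endpoint, which is the trade-off the deterministic mode buys.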
The service registry is configured with a DiscoveryServiceConfig document. A freshly generated machine configuration includes one named default that points at the public discovery service:
To use a self-hosted discovery service, provide a DiscoveryServiceConfig document with the endpoint set to your own instance:
A machine configuration can contain multiple DiscoveryServiceConfig documents, each with a unique name and endpoint. Endpoints must use the http://, https:// or grpc:// scheme. Nodes publish to and read from every configured discovery service, so the cluster continues to discover peers as long as at least one of them is reachable:
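The rules in this paragraph (unique names, allowed schemes, and tolerance of unreachable services) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries are not the DiscoveryServiceConfig schema, the private endpoint is made up, and `fetch` is a hypothetical callable standing in for a real discovery client.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "grpc"}


def validate_services(services):
    """Reject duplicate names and unsupported endpoint schemes."""
    names = [s["name"] for s in services]
    if len(names) != len(set(names)):
        raise ValueError("discovery service names must be unique")
    for s in services:
        if urlparse(s["endpoint"]).scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported scheme in {s['endpoint']!r}")


def discover_peers(services, fetch):
    """Union of peers from every service that answers; failures are skipped."""
    peers = set()
    for s in services:
        try:
            peers |= set(fetch(s["endpoint"]))
        except ConnectionError:
            continue  # this service is down; another one may still answer
    return peers


services = [
    {"name": "default", "endpoint": "https://discovery.talos.dev/"},
    {"name": "private", "endpoint": "https://discovery.example.internal/"},  # hypothetical
]


def fake_fetch(endpoint):
    # Simulate an outage of the public service only.
    if "talos.dev" in endpoint:
        raise ConnectionError("public service unreachable")
    return ["10.0.0.2:50000", "10.0.0.3:50000"]


validate_services(services)
print(sorted(discover_peers(services, fake_fetch)))
```

With one of the two services failing, the peer set still comes back from the other.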
To disable the service registry, remove all DiscoveryServiceConfig documents from the machine configuration. With no service registry and no Kubernetes registry, member discovery is effectively disabled.
Kubernetes registry
The Kubernetes registry has no DiscoveryServiceConfig equivalent and should not be used in new clusters. It can only be enabled through the deprecated .cluster.discovery configuration block, where it is disabled by default:
The DiscoveryServiceConfig documents and the legacy .cluster.discovery configuration block are mutually exclusive. A machine configuration must not contain both. A configuration that uses the .cluster.discovery block must also configure the service registry there (under registries.service) rather than with a DiscoveryServiceConfig document:
When the Kubernetes registry is enabled, nodes publish discovery data as annotations on Kubernetes Node resources:
What changes when discovery is disabled
Discovery is disabled by removing all DiscoveryServiceConfig documents from the machine configuration (and not configuring the legacy .cluster.discovery block). Talos can operate with discovery disabled, but this affects several features and behaviours:
- KubeSpan and KubePrism require discovery and do not function correctly without it.
- Initial cluster bootstrap and recovery may take longer, as peer and control plane endpoints are not available from discovery.
- Endpoint resolution falls back to Kubernetes API availability, for example via kubectl get endpoints kubernetes, which requires a functioning API server and load balancer.
- Worker nodes become more dependent on control plane availability during failures, as they cannot rely on discovery registries to obtain peer endpoint information.
If https://discovery.talos.dev is unavailable, discovery can be disabled or replaced with a privately operated discovery service.
Discovery service behaviour during outages
Talos nodes periodically refresh their discovery data to prevent it from expiring due to the TTL. The discovery service uses a hardcoded TTL of 30 minutes, which cannot be configured by users. As long as the discovery service is reachable, records are continuously renewed.
During a short outage, Talos uses its last known in-memory discovery state. Existing connections, including KubeSpan tunnels, continue to function using cached data.
If a node reboots while the discovery service is unavailable, it loses all in-memory state and cannot publish its information or retrieve peer data until the service becomes available again.
If the outage exceeds the TTL, all discovery records expire. When the service comes back online, it may return an empty dataset. Nodes receiving this update drop their existing peer information, which can temporarily disrupt KubeSpan connectivity. Recovery is automatic: nodes republish their data, peer information is rebuilt, and connectivity is restored without manual intervention.
When KubeSpan is enabled, WireGuard keys are generated on boot and not persisted to disk. A rebooted node must publish its new public key via the discovery service before peers can establish tunnels to it.
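The TTL behaviour can be modelled with a toy registry. This is a minimal sketch, not the discovery service's code: the class and node names are hypothetical, and timestamps are passed in explicitly (in seconds) so the example is deterministic.

```python
TTL_SECONDS = 30 * 60  # the 30-minute TTL described above


class TTLRegistry:
    """Toy stand-in for a discovery service that expires stale records."""

    def __init__(self):
        self._expires_at = {}  # node id -> expiry timestamp

    def publish(self, node_id, now):
        # Publishing (or refreshing) pushes the expiry forward by one TTL.
        self._expires_at[node_id] = now + TTL_SECONDS

    def members(self, now):
        # Only records that have not yet expired are returned.
        return {n for n, t in self._expires_at.items() if t > now}


registry = TTLRegistry()
registry.publish("node-a", now=0)
registry.publish("node-b", now=0)

# 10 minutes without refreshes: records are still inside the TTL.
print(sorted(registry.members(now=10 * 60)))
# 45 minutes without refreshes: every record has expired.
print(sorted(registry.members(now=45 * 60)))
# A node republishes after the outage and becomes visible again.
registry.publish("node-a", now=45 * 60)
print(sorted(registry.members(now=46 * 60)))
```

The empty result in the middle corresponds to the empty dataset a node may receive after a long outage, and the final republish corresponds to automatic recovery.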
Inspect discovery resources
Talos exposes three resources for inspecting the state of cluster discovery. Each represents a different stage of the membership process: a node starts as an identity, becomes an affiliate when it shares the cluster credentials, and becomes a member when it is confirmed to belong to the cluster.
Identities
Each node has a unique identity, a base62-encoded random 32-byte value, that serves as its Affiliate identifier. Base62 encoding allows the ID to be URL-safe without requiring URL-encoded base64.
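To make the identity format concrete, here is a rough sketch of producing a base62 string from 32 random bytes. It is an assumption-laden illustration, not the encoder Talos uses: the alphabet order and the big-integer encoding are choices made for this example, and `base62_encode` is a hypothetical helper.

```python
import secrets
import string

# Assumed alphabet: digits, then uppercase, then lowercase (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(data: bytes) -> str:
    """Encode bytes as base62 by treating them as one big-endian integer."""
    number = int.from_bytes(data, "big")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


identity = base62_encode(secrets.token_bytes(32))
print(identity)  # differs on every run; letters and digits only
```

Because the output contains only letters and digits, it can be placed in a URL path or query string without escaping.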
To retrieve the local node’s identity:
The node identity is used as the unique Affiliate identifier. It is stored in the STATE partition in node-identity.yaml and is preserved across reboots and upgrades, but regenerated if the node is reset.
Affiliates
An affiliate is a node that shares the same cluster ID and secret. Nodes with matching values are treated as potential cluster members and can exchange encrypted discovery data. Nodes from different clusters cannot see or decrypt each other’s affiliate data. Use this resource to see what nodes the discovery registries are aware of:
Raw affiliate data from each registry can be inspected in the cluster-raw namespace. Affiliates prefixed with k8s/ came from the Kubernetes registry and those prefixed with service/ came from the discovery service:
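When post-processing a list of raw affiliate IDs, the prefix convention can be used to tell the registries apart. The snippet below is a small sketch with made-up IDs; `group_by_registry` is a hypothetical helper, not part of any Talos tooling.

```python
from collections import defaultdict

SOURCES = {"k8s": "Kubernetes registry", "service": "discovery service"}


def group_by_registry(raw_ids):
    """Split raw affiliate IDs of the form '<prefix>/<id>' by their source."""
    grouped = defaultdict(list)
    for raw in raw_ids:
        prefix, _, affiliate_id = raw.partition("/")
        grouped[SOURCES.get(prefix, "unknown")].append(affiliate_id)
    return dict(grouped)


# Made-up IDs for illustration only.
raw_ids = ["k8s/nodeA", "service/nodeA", "service/nodeB"]
print(group_by_registry(raw_ids))
```

An ID that appears under both sources, such as nodeA here, is the same affiliate reported by two registries.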