Talos Linux is a minimal operating system designed to run Kubernetes workloads efficiently and securely.
However, there are cases where you need access to the underlying system for troubleshooting or diagnostics.
Some troubleshooting tasks require tools that are not included in the default Talos image, and rightly so, since they are not needed for normal cluster operation.
Talos provides a debug shell that starts a privileged container using an image you supply, then drops you into that shell.
When you exit, the container is removed and the cluster continues operating normally.
There are two ways to provide the container image for the debug shell:
- let the Talos machine pull the image from the registry;
- push the image to the machine using the API.
Pulling from a registry is the simplest approach, but requires network access to that registry.
If you are troubleshooting a connectivity issue, use the API to push the image directly to the machine instead.
Pulling the image from the registry
Pass the image reference as an argument to talosctl debug:
$ talosctl debug docker.io/library/alpine:latest
/ # ls /host/
bin dev lib mnt proc run sys tmp var
boot etc lib64 opt root sbin system usr
/ #
Pushing the debug image to the machine
First, prepare an image that includes the tools you need.
Note: if the image is already built and stored in your local Docker image store, skip the build step and save it to a tarball directly.
Otherwise, write a Dockerfile that installs the required packages:
# Dockerfile
FROM alpine:3.23
RUN apk add --no-cache iputils
Then build the image, which stores it in the local Docker image store:
docker build --tag alpine-with-tools:v1 .
Next, save the image to a tarball:
docker save alpine-with-tools:v1 -o ./alpine-with-tools.tar
Note: the architecture of the image must match the architecture of the machine being debugged; otherwise the image will fail to start.
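The architecture check can be done before saving the tarball. The following is a minimal sketch, not part of talosctl: `platform_matches` is a hypothetical helper that compares two platform strings. It assumes the image platform is read with `docker image inspect` and the machine platform is taken from the `OS/Arch` value that `talosctl version` is assumed to report for the server.

```shell
#!/bin/sh
# Hypothetical helper: compare an image platform with a machine platform.
# $1: image platform, e.g. the output of
#     docker image inspect --format '{{.Os}}/{{.Architecture}}' alpine-with-tools:v1
# $2: machine platform, e.g. the server "OS/Arch" value from `talosctl version`
#     (assumed output field; verify against your talosctl version)
platform_matches() {
  if [ "$1" = "$2" ]; then
    echo "ok: image platform $1 matches the machine"
  else
    echo "mismatch: image is $1, machine is $2"
    echo "rebuild with: docker build --platform $2 --tag alpine-with-tools:v1 ."
    return 1
  fi
}
```

For example, `platform_matches "$(docker image inspect --format '{{.Os}}/{{.Architecture}}' alpine-with-tools:v1)" linux/arm64` would report a mismatch for an amd64 image and suggest rebuilding with `--platform`.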
Finally, pass the tarball path to talosctl debug:
$ talosctl debug ./alpine-with-tools.tar
image imported docker.io/library/alpine-with-tools:v1 from ./alpine-with-tools.tar
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
valid_lft forever preferred_lft forever
inet 169.254.116.108/32 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
...
Anatomy of the debug container
The debug container runs privileged with host PID and network namespaces, giving it access to the host’s processes and network interfaces.
The host filesystem is mounted at /host, so you can browse host files from within the container.
Host devices are available under the regular /dev path.
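Because the host filesystem sits under /host, any host path can be read by prefixing it with that mount point. The sketch below illustrates this; `host_cat` and the `HOST_ROOT` variable are hypothetical names introduced only for this example, and the file it reads is assumed to exist on the host.

```shell
#!/bin/sh
# Hypothetical helper: print a host file from inside the debug container.
# HOST_ROOT defaults to /host, the mount point of the host filesystem.
# $1: absolute path as seen on the host, e.g. /etc/os-release
host_cat() {
  cat "${HOST_ROOT:-/host}$1"
}
```

Inside the debug shell, `host_cat /etc/os-release` would then print the host's copy of the file rather than the container's.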
The pwru tool traces network packets through the Linux network stack, making it useful for diagnosing network issues.
First, build the pwru image from source, since it is not available in public container registries:
git clone https://github.com/cilium/pwru.git
cd pwru
docker build -f Dockerfile --tag pwru:latest .
docker save pwru:latest -o pwru.tar
Second, pwru needs to decode function names from kernel pointers, which Talos restricts by default.
Temporarily relax this restriction by applying the following machine configuration patch:
machine:
  sysctls:
    kernel.kptr_restrict: 1
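One way to apply the patch is sketched below. It saves the patch shown above to a file and wraps `talosctl patch machineconfig` in a function; the file name `kptr-restrict.yaml` and the wrapper `apply_kptr_patch` are assumptions made for this example, and the node address is a placeholder you supply.

```shell
#!/bin/sh
# Save the patch shown above to a file; the file name is arbitrary.
cat > kptr-restrict.yaml <<'EOF'
machine:
  sysctls:
    kernel.kptr_restrict: 1
EOF

# Hypothetical wrapper: apply the patch to the node whose address is $1.
apply_kptr_patch() {
  talosctl --nodes "$1" patch machineconfig --patch @kptr-restrict.yaml
}
```

Call it as `apply_kptr_patch <machine-ip>`. Since the relaxation is meant to be temporary, restore the previous machine configuration once the debugging session is over.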
Finally, use the pwru image to start the debug shell:
$ talosctl debug ./pwru.tar
image imported docker.io/library/pwru:latest from ./pwru.tar
/ #
As an example, suppose the following firewall rule is blocking access to port 5005 and you want to diagnose it with pwru:
apiVersion: v1alpha1
kind: NetworkRuleConfig
name: test
portSelector:
  ports:
    - 5005
  protocol: tcp
ingress:
  - subnet: 192.168.0.0/16
Attempting to reach port 5005 from outside the cluster results in a silent block:
$ nc -vz <machine-ip> 5005
# hangs for a while
Back in the pwru debug shell, launch the pwru tool to trace the packets to port 5005:
/ # pwru 'port 5005'
2026/03/31 13:22:21 INFO Attaching kprobes via=kprobe
3013 / 3013 [------------------------------------------------------------------------------------------------------------------------------------------------------------] 100.00% 214 p/s
2026/03/31 13:22:35 INFO Attached ignored=0
2026/03/31 13:22:35 INFO Listening for events..
SKB CPU PROCESS NETNS MARK/x IFACE PROTO MTU LEN TUPLE FUNC
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) inet_gro_receive
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) tcp4_gro_receive
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) __skb_gro_checksum_complete
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) tcp_gro_pull_header
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) tcp_gro_receive
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) ip_rcv_core
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nf_hook_slow
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) ip_sabotage_in
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) ipv4_conntrack_defrag
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) ipv4_conntrack_in
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nf_conntrack_in
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nf_conntrack_tcp_packet
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nf_checksum
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nf_ip_checksum
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) nft_do_chain_inet
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) sk_skb_reason_drop(SKB_DROP_REASON_NETFILTER_DROP)
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) __kfree_skb
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) skb_release_head_state
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) skb_release_data
0xffff8c37e629c800 1 <empty>:0 4026531833 0 enp0s2:8 0x0800 1500 60 172.20.0.1:46236->172.20.0.5:5005(tcp) skb_free_head
The sk_skb_reason_drop(SKB_DROP_REASON_NETFILTER_DROP) entry shows the packet being dropped by netfilter, the Linux kernel’s firewall framework.
This confirms the issue lies in the firewall rules on the machine, and further investigation should focus there.
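In a long trace the drop entry is easy to miss. The sketch below pulls the distinct drop reasons out of pwru output read from standard input; `drop_reasons` is a hypothetical helper, and it assumes the drop reasons appear in the `SKB_DROP_REASON_...` form shown in the trace above.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct drop reasons found in pwru output.
# Reads the trace from standard input and prints each reason once.
drop_reasons() {
  grep -o 'SKB_DROP_REASON_[A-Z0-9_]*' | sort -u
}
```

For example, after copying the trace into a file (here assumed to be named `pwru.log`), `drop_reasons < pwru.log` would print `SKB_DROP_REASON_NETFILTER_DROP` for the output above.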