# rke2
n
Hello, I'd be grateful for input on what might be wrong with creating a new RKE2 cluster on my first bare-metal node from within Rancher 2.9, or for hints on how to troubleshoot further. Issue: I create a new cluster from the Rancher UI and execute the registration command on the first node. RKE2 gets installed, but the cluster agent never connects back to Rancher. The Rancher provisioning log 'ends' with:
```
(...)
[INFO ] configuring bootstrap node(s) custom-309c46378d42: waiting for cluster agent to connect
```
Checking `/var/lib/rancher/rke2/agent/logs/kubelet.log` on the node, I see:
```
(...)
kubelet_node_status.go:73] "Attempting to register node" node="mynode01"
kubelet_node_status.go:96] "Unable to register node with API server" err="Post \"https://127.0.0.1:6443/api/v1/nodes\": dial tcp 127.0.0.1:6443: connect: connection refused" node="mynode01"
(...)
"Failed to ensure lease exists, will retry" err="Get \"https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/mynode01?timeout=10s\": dial tcp 127.0.0.1:6443: connect: connection refused" interval="6.4s"
(...)
"Error syncing pod, skipping" err="failed to \"StartContainer\" for \"cluster-register\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=cluster-register pod=cattle-cluster-agent-58cc5fbb8d-mrdvf_cattle-system(6bea22f6-c306-4885-bf9c-9ab485cc8e47)\"" pod="cattle-system/cattle-cluster-agent-58cc5fbb8d-mrdvf" podUID="6bea22f6-c306-4885-bf9c-9ab485cc8e47"
(...)
```
Environment info: Rancher 2.9. Cluster configuration:
```yaml
spec:
  kubernetesVersion: v1.29.15+rke2r1
  localClusterAuthEndpoint: {}
  rkeConfig:
    (...)
    machineGlobalConfig:
      cluster-cidr: 100.194.0.0/16
      cluster-dns: 100.195.0.10
      cluster-domain: cluster.local
      cni: cilium
      disable-kube-proxy: false
      etcd-expose-metrics: false
      service-cidr: 100.195.0.0/16
      tls-san: []
      (...)
```
c
It is normal for the kubelet to be unable to reach the apiserver when it first starts, as the kubelet itself runs the apiserver as a static pod. You'll need to provide more information, or look through the logs to find a more relevant error.
In general, I would look in `/var/log/pods` to confirm that etcd, kube-apiserver, kube-scheduler, and so on are running without errors. Also check the output of `kubectl get pod -A -o wide` to see what all is running.
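For example (a rough sketch; the directory names under `/var/log/pods` embed each pod's UID, so list them or use a glob):

```bash
# Static pod logs live under /var/log/pods, one directory per pod
ls /var/log/pods/
# Tail the apiserver's log (directories are named <namespace>_<pod>_<uid>)
tail -f /var/log/pods/kube-system_kube-apiserver-*/kube-apiserver/*.log

# RKE2 ships its own kubectl and kubeconfig on server nodes
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
/var/lib/rancher/rke2/bin/kubectl get pod -A -o wide
```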
p
Could you also check the status of rke2-server.service and rancher-system-agent.service? Are there any errors in there?
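E.g.:

```bash
systemctl status rke2-server.service rancher-system-agent.service
# Follow either unit's logs while the agent tries to register
journalctl -u rke2-server -f
journalctl -u rancher-system-agent -f
```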
n
Thanks for your answers. No errors in either service. As of now I think the issue must lie within DNS. `kubectl logs` of my cattle-cluster-agent shows an error (it hangs in CrashLoopBackOff as well):
```
#dnsPolicy: ClusterFirst
INFO: Using resolv.conf: search cattle-system.svc.cluster.local svc.cluster.local cluster.local mydomain.com nameserver 10.43.0.10 options ndots:5
ERROR: https://myrancher.mydomain.com/ping is not accessible (Could not resolve host: myrancher.mydomain.com)

#dnsPolicy: Default
INFO: Using resolv.conf: search mydomain.com nameserver 172.1.1.1 nameserver 172.1.1.2
ERROR: https://myrancher.mydomain.com/ping is not accessible (Could not resolve host: myrancher.mydomain.com)
```
Resolving myrancher.mydomain.com from another pod in kube-system works. I found a few open/known issues related to DNS, where #16454 comes closest to what I seem to have. Following that, I changed `/etc/hosts` and modified the CoreDNS configmap, in both cases adding myrancher.mydomain.com as a host (see the sketch below). I also tried different configurations when creating the cluster (Cilium and Calico as CNI, leaving all network settings at their defaults). No change in behaviour so far.
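Roughly what I added (the IP here is a placeholder for the actual Rancher server address; the configmap name matches the rke2-coredns chart deployment shown below):

```bash
# On the node: pin the Rancher hostname in /etc/hosts
echo "172.1.1.50 myrancher.mydomain.com" >> /etc/hosts   # placeholder IP

# In the cluster: add a hosts block to the CoreDNS Corefile
kubectl -n kube-system edit configmap rke2-coredns-rke2-coredns
#   hosts {
#     172.1.1.50 myrancher.mydomain.com   # placeholder IP
#     fallthrough
#   }
```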
State of the pods in kube-system:
```
NAME                                                   READY   STATUS             RESTARTS         AGE
cilium-g8vdw                                           1/1     Running            0                61m
cilium-operator-6dfff79f58-4fxqx                       0/1     Pending            0                61m
cilium-operator-6dfff79f58-n562h                       1/1     Running            0                61m
cloud-controller-manager-rzlxdefdevrke01               1/1     Running            0                61m
etcd-rzlxdefdevrke01                                   1/1     Running            0                60m
helm-install-rke2-cilium-mqdht                         0/1     Completed          0                61m
helm-install-rke2-coredns-vp7lh                        0/1     Completed          0                61m
helm-install-rke2-ingress-nginx-mcm6s                  0/1     CrashLoopBackOff   13 (4m48s ago)   61m
helm-install-rke2-metrics-server-llsgm                 1/1     Running            14 (5m10s ago)   61m
helm-install-rke2-runtimeclasses-hwjzd                 1/1     Running            14 (5m25s ago)   61m
helm-install-rke2-snapshot-controller-crd-smxzk        0/1     CrashLoopBackOff   13 (5m ago)      61m
helm-install-rke2-snapshot-controller-pzsk6            0/1     CrashLoopBackOff   13 (4m57s ago)   61m
kube-apiserver-rzlxdefdevrke01                         1/1     Running            0                61m
kube-controller-manager-rzlxdefdevrke01                1/1     Running            0                61m
kube-proxy-rzlxdefdevrke01                             1/1     Running            0                60m
kube-scheduler-rzlxdefdevrke01                         1/1     Running            0                61m
rke2-coredns-rke2-coredns-6ff85b696b-d82zg             0/1     Running            0                72s
rke2-coredns-rke2-coredns-autoscaler-6898c795f-884wh   1/1     Running            18 (5m36s ago)   61m
```
p
CoreDNS being dead may not help the cluster agent resolve hosts.
n
It says:
```
maxprocs: Updating GOMAXPROCS=1: using minimum allowed GOMAXPROCS
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = c18591e7950724fe7f26bd172b7e98b6d72581b4a8fc4e5fc4cfd08229eea58f4ad043c9fd3dbd1110a11499c4aa3164cdd63ca0dd5ee59651d61756c4f671b7
CoreDNS-1.12.0
linux/amd64, go1.22.8 X:boringcrypto, 51e11f166
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.Namespace: Get "https://100.195.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.Service: Get "https://100.195.0.1:443/api/v1/services?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.EndpointSlice: Get "https://100.195.0.1:443/apis/discovery.k8s.io/v1/endpointslices?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
```
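That 100.195.0.1 is the first address of my service-cidr (100.195.0.0/16), i.e. the ClusterIP of the kubernetes Service, so CoreDNS apparently can't reach the apiserver over the service network. A quick sanity check from the node could look like this (a sketch, assuming kube-proxy in its default iptables mode):

```bash
# The kubernetes Service should map to the apiserver's node IP:6443
kubectl get svc,endpoints kubernetes
# Does the ClusterIP answer at all from the node?
curl -vk https://100.195.0.1:443/version --max-time 5
# Are kube-proxy's rules for it present?
iptables-save | grep 100.195.0.1 | head
```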
p
Seems there's a CNI issue somewhere, good luck 😮
c
You disabled kube-proxy. How are you expecting to reach Service ClusterIP addresses without kube-proxy?
p
Did he? It says: `disable-kube-proxy: false`
c
Ah right, sorry, I was on mobile and the wrapping makes it hard to read. `false` is the default, so usually there is no reason to set it if it's false. Since this is a single-node cluster, things should be pretty simple. Is there perhaps a local firewall on this node that needs to be disabled?
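E.g., on a typical systemd distro (a sketch; firewalld is known to conflict with RKE2's CNI rules):

```bash
# Check for and stop a host firewall
systemctl is-active firewalld && sudo systemctl disable --now firewalld
# Look for stray packet-filter rules that could drop pod/service traffic
sudo iptables -S | head
sudo nft list ruleset | head
```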
n
Thanks to both of you for your input. I'll be able to look more deeply into it next week.