# rke2
c
it is normal for the kubelet not to be able to reach the apiserver when it first starts, as the kubelet itself runs the apiserver as a static pod. You’ll need to provide more information, or look through the logs to find a more relevant error.
In general, I would look in /var/log/pods to confirm that etcd, kube-apiserver, kube-scheduler, and so on are running without errors.
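As a concrete example of what that check can look like (paths follow the kubelet's usual /var/log/pods layout; the globs are placeholders to adjust for your node name):
```
# Pod log directories are named <namespace>_<pod-name>_<uid>
ls /var/log/pods/
# Tail the most recent apiserver and etcd logs
tail -n 50 /var/log/pods/kube-system_kube-apiserver-*/kube-apiserver/*.log
tail -n 50 /var/log/pods/kube-system_etcd-*/etcd/*.log
```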
Also, check the output of `kubectl get pod -A -o wide` to see what all is running.
p
Could you also check the status of the rke2-server.service and rancher-system-agent.service? Is there any error in there?
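Something like this covers both units in one go (a sketch):
```
systemctl status rke2-server.service rancher-system-agent.service --no-pager
journalctl -u rke2-server -u rancher-system-agent --since "-1h" --no-pager
```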
n
Thanks for your answers. No errors in either service. Right now I think the issue lies with DNS. `kubectl logs` of my cattle-cluster-agent shows an error (it also hangs in CrashLoopBackOff):
```
#dnsPolicy: ClusterFirst
INFO: Using resolv.conf: search cattle-system.svc.cluster.local svc.cluster.local cluster.local mydomain.com nameserver 10.43.0.10 options ndots:5
ERROR: https://myrancher.mydomain.com/ping is not accessible (Could not resolve host: myrancher.mydomain.com)

#dnsPolicy: Default
INFO: Using resolv.conf: search mydomain.com nameserver 172.1.1.1 nameserver 172.1.1.2
ERROR: https://myrancher.mydomain.com/ping is not accessible (Could not resolve host: myrancher.mydomain.com)
```
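For reference, switching the agent between the two dnsPolicy values above can be done with a patch roughly like this (a sketch; namespace and deployment name are the Rancher defaults):
```
kubectl -n cattle-system patch deployment cattle-cluster-agent \
  --type merge -p '{"spec":{"template":{"spec":{"dnsPolicy":"Default"}}}}'
kubectl -n cattle-system rollout status deployment cattle-cluster-agent
```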
Resolving myrancher.mydomain.com from another pod in kube-system works. I found a few open/known issues related to DNS; #16454 comes closest to what I seem to have. Following that, I changed /etc/hosts and modified the CoreDNS configmap, in both cases adding myrancher.mydomain.com as a host entry. I also tried different configurations when creating the cluster (Cilium and Calico as CNI, leaving all network settings at defaults). No change in behaviour so far.
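To double-check that those edits actually landed, something along these lines should show them (a sketch; the configmap name is assumed from the rke2-coredns chart naming in the pod list below, and 172.1.1.50 stands in for the Rancher server's real IP):
```
grep myrancher.mydomain.com /etc/hosts
kubectl -n kube-system get configmap rke2-coredns-rke2-coredns -o yaml | grep -A4 hosts
# Expected Corefile stanza (illustrative):
#   hosts {
#     172.1.1.50 myrancher.mydomain.com
#     fallthrough
#   }
```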
State of the pods in kube-system:
```
NAME                                                   READY   STATUS             RESTARTS         AGE
cilium-g8vdw                                           1/1     Running            0                61m
cilium-operator-6dfff79f58-4fxqx                       0/1     Pending            0                61m
cilium-operator-6dfff79f58-n562h                       1/1     Running            0                61m
cloud-controller-manager-rzlxdefdevrke01               1/1     Running            0                61m
etcd-rzlxdefdevrke01                                   1/1     Running            0                60m
helm-install-rke2-cilium-mqdht                         0/1     Completed          0                61m
helm-install-rke2-coredns-vp7lh                        0/1     Completed          0                61m
helm-install-rke2-ingress-nginx-mcm6s                  0/1     CrashLoopBackOff   13 (4m48s ago)   61m
helm-install-rke2-metrics-server-llsgm                 1/1     Running            14 (5m10s ago)   61m
helm-install-rke2-runtimeclasses-hwjzd                 1/1     Running            14 (5m25s ago)   61m
helm-install-rke2-snapshot-controller-crd-smxzk        0/1     CrashLoopBackOff   13 (5m ago)      61m
helm-install-rke2-snapshot-controller-pzsk6            0/1     CrashLoopBackOff   13 (4m57s ago)   61m
kube-apiserver-rzlxdefdevrke01                         1/1     Running            0                61m
kube-controller-manager-rzlxdefdevrke01                1/1     Running            0                61m
kube-proxy-rzlxdefdevrke01                             1/1     Running            0                60m
kube-scheduler-rzlxdefdevrke01                         1/1     Running            0                61m
rke2-coredns-rke2-coredns-6ff85b696b-d82zg             0/1     Running            0                72s
rke2-coredns-rke2-coredns-autoscaler-6898c795f-884wh   1/1     Running            18 (5m36s ago)   61m
```
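A quick way to dig into the not-ready pods from that listing (pod names copied from above):
```
kubectl -n kube-system describe pod rke2-coredns-rke2-coredns-6ff85b696b-d82zg
kubectl -n kube-system logs rke2-coredns-rke2-coredns-6ff85b696b-d82zg
kubectl -n kube-system logs helm-install-rke2-snapshot-controller-pzsk6 --previous
```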
p
coredns being dead may not help the cluster agent to resolve hosts
n
it says:
```
maxprocs: Updating GOMAXPROCS=1: using minimum allowed GOMAXPROCS
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = c18591e7950724fe7f26bd172b7e98b6d72581b4a8fc4e5fc4cfd08229eea58f4ad043c9fd3dbd1110a11499c4aa3164cdd63ca0dd5ee59651d61756c4f671b7
CoreDNS-1.12.0
linux/amd64, go1.22.8 X:boringcrypto, 51e11f166
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.Namespace: Get "https://100.195.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.Service: Get "https://100.195.0.1:443/api/v1/services?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.31.2/tools/cache/reflector.go:243: failed to list *v1.EndpointSlice: Get "https://100.195.0.1:443/apis/discovery.k8s.io/v1/endpointslices?limit=500&resourceVersion=0": dial tcp 100.195.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
```
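Those i/o timeouts mean the CoreDNS pod cannot reach the in-cluster API service VIP at all. A quick check from the node and from Cilium's side could look like this (a sketch; 100.195.0.1 is the VIP from the errors above, cilium-g8vdw the agent pod from the earlier listing):
```
curl -kv --max-time 5 https://100.195.0.1:443/healthz
kubectl -n kube-system exec cilium-g8vdw -- cilium status --brief
kubectl -n kube-system exec cilium-g8vdw -- cilium service list
```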
p
Seems there's a CNI issue somewhere, good luck 😮
c
You disabled kube-proxy. How are you expecting to reach service ClusterIP addresses without kube-proxy?
p
Did he? It says: `disable-kube-proxy: false`
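For reference, the setting lives in the rke2 server config, so a quick look there confirms it (default path shown):
```
grep -n 'kube-proxy' /etc/rancher/rke2/config.yaml
# disable-kube-proxy: false   <- the default; kube-proxy should be running
```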
c
Ah right, sorry, I was on mobile and the wrapping made it hard to read. false is the default, so there is usually no reason to set it explicitly. Since this is a single-node cluster, things should be pretty simple. Is there perhaps a local firewall on this node that needs to be disabled?
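A quick way to check for that (a sketch; which commands apply depends on the distro, and the RKE2 docs recommend firewalld be disabled):
```
systemctl is-active firewalld ufw 2>/dev/null
# Look for non-Kubernetes REJECT/DROP rules
nft list ruleset | head -n 40
iptables -S | grep -v KUBE | head -n 40
```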
n
Thanks to both of you for your input. I'll be able to look into it more deeply next week.