melodic-tiger-82590
01/15/2024, 10:40 AM
I'm trying out the graceful node shutdown feature on RKE2. I created a KubeletConfiguration file and set kubelet-arg to use it:
/etc/rancher/rke2/kubelet.yaml:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 2m
shutdownGracePeriodCriticalPods: 10s

And in the RKE2 config:
kubelet-arg:
- config=/etc/rancher/rke2/kubelet.yaml
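For reference, this is roughly how I confirm the kubelet actually picked up those values, via the kubelet's configz endpoint (the node name below is a placeholder and jq is assumed to be installed):

# Dump the kubelet's live config through the apiserver proxy (node name is a placeholder)
kubectl get --raw "/api/v1/nodes/my-node/proxy/configz" \
  | jq '.kubeletconfig | {shutdownGracePeriod, shutdownGracePeriodCriticalPods}'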
I can see the systemd inhibitor lock taken by the kubelet in systemd-inhibit --list. When I then execute systemctl reboot, the node's status changes to NotReady instantly, but I don't see the pods on that node terminating when running kubectl get pods.
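To rule out a filtering mistake on my side, I'm watching just that node's pods from another machine during the reboot, roughly like this (node name is a placeholder):

# Watch only the pods scheduled on the node being rebooted (node name is a placeholder)
kubectl get pods --all-namespaces --field-selector spec.nodeName=my-node -o wide -w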
I also verified whether the pods are actually terminated on that node by executing /var/lib/rancher/rke2/bin/crictl ps, and they are! In the kubelet logs I see some connection issues to the apiserver:
E0115 11:39:15.784024 2080 kubelet.go:1899] failed to "KillPodSandbox" for "af6511b8-46f2-43bf-a590-0b9a130a4e98" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"b5fca2db2fd9a302ac72de1ea781e1da38ef0a488e862c441fe156489133930e\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.43.0.1:443: connect: connection refused"
E0115 11:39:19.185411 2080 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"xxx\": Get \"https://127.0.0.1:6443/api/v1/nodes/xxx?timeout=10s\": dial tcp 127.0.0.1:6443: connect: connection refused"
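Once the node is back and the apiserver is reachable again, this is roughly how I check whether any pods were marked as failed by the shutdown (the exact reason/message strings may differ between Kubernetes versions):

# List Failed pods with the reason/message the kubelet recorded for them
kubectl get pods -A --field-selector status.phase=Failed \
  -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,REASON:.status.reason,MESSAGE:.status.message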
Has anybody used the graceful node shutdown feature with rke2 and had similar problems?