# rke2
Hello. I'm trying to enable the graceful node shutdown feature on an rke2 cluster. I created a config file for the kubelet and added a flag in `kubelet-arg` to use it:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 2m
shutdownGracePeriodCriticalPods: 10s
```

and in the RKE2 `config.yaml`:

```yaml
kubelet-arg:
  - config=/etc/rancher/rke2/kubelet.yaml
```
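As a side note (just a sketch, not rke2-specific), one way to double-check that the kubelet actually picked the file up is to read the running kubelet config back through the apiserver proxy, roughly like this (`<node-name>` is a placeholder, and `jq` is assumed to be available):

```bash
# Read the live kubelet config via the apiserver proxy and pull out
# the two shutdown-related settings.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" \
  | jq '.kubeletconfig | {shutdownGracePeriod, shutdownGracePeriodCriticalPods}'
```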
I see the systemd inhibit lock taken by `kubelet` in `systemd-inhibit --list`. When I then execute `systemctl reboot`, I see the node change status to `NotReady` instantly. But I don't see the pods on that node terminating when running `kubectl get pods`. I also verified whether the pods are actually terminated on that node by executing `/var/lib/rancher/rke2/bin/crictl ps`, and they are! In the kubelet logs I see some connection issues to the apiserver:
```
E0115 11:39:15.784024    2080 kubelet.go:1899] failed to "KillPodSandbox" for "af6511b8-46f2-43bf-a590-0b9a130a4e98" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"b5fca2db2fd9a302ac72de1ea781e1da38ef0a488e862c441fe156489133930e\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.43.0.1:443: connect: connection refused"
E0115 11:39:19.185411    2080 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"xxx\": Get \"https://127.0.0.1:6443/api/v1/nodes/xxx?timeout=10s\": dial tcp 127.0.0.1:6443: connect: connection refused"
```
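As another side check (again just a sketch, not specific to rke2), it may be worth comparing logind's maximum inhibit delay with the configured `shutdownGracePeriod`, since the delay lock can only hold the shutdown back for as long as logind allows:

```bash
# List inhibitor locks; the kubelet should hold a "delay" lock for shutdown.
systemd-inhibit --list

# Show logind's maximum allowed delay. If this is shorter than
# shutdownGracePeriod, the reboot may not wait for the full grace period.
busctl get-property org.freedesktop.login1 /org/freedesktop/login1 \
  org.freedesktop.login1.Manager InhibitDelayMaxUSec
```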
Has anybody used the graceful node shutdown feature with rke2 and had similar problems?