sticky-summer-13450
11/29/2022, 3:08 PM
I have pods stuck in the Terminating state and I want to know whether it's k3s, k8s, or me.
Example: I have a cluster with one server node and several worker nodes, with workloads spread across the workers. Let's say a worker node dies - maybe it's never going to return.
$ kubectl get pods --context kube001 --all-namespaces -o=wide |grep Terminating
kube-system traefik-9c6dc6686-jdt9f 1/1 Terminating 0 24d 10.42.1.4 kube002 <none> <none>
active-mq active-mq-6665f5d8b9-ztwnq 1/1 Terminating 0 15d 10.42.1.82 kube002 <none> <none>
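For reference, a quick way to check what the cluster thinks of the dead worker - e.g. whether kube002 is still registered and marked NotReady - is something like:
$ kubectl get nodes --context kube001 -o wide
$ kubectl describe node --context kube001 kube002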
Some of the pods get stuck in the Terminating state and don't get replaced on other worker nodes. This means the cluster is no longer respecting the declarative state.
Is this a problem specific to me, a problem specific to k3s, a problem with k8s, or something else?
$ kubectl describe pod --context kube001 --namespace active-mq active-mq-6665f5d8b9-ztwnq
Name: active-mq-6665f5d8b9-ztwnq
Namespace: active-mq
Priority: 0
Service Account: default
Node: kube002/10.64.8.117
Start Time: Sun, 13 Nov 2022 15:07:50 +0000
Labels: app=active-mq
pod-template-hash=6665f5d8b9
Annotations: kubernetic.com/restartedAt: 2021-04-30T16:34:24+01:00
Status: Terminating (lasts 2d5h)
Termination Grace Period: 30s
...
In this example the pod has been in that state for more than 2 days.

creamy-pencil-82913
11/29/2022, 6:40 PM
It can't show as "terminated" because Kubernetes doesn't know anything about that node.

sticky-summer-13450
11/29/2022, 6:44 PM
So - why have some pods terminated correctly while others show that they are terminating? It's inconsistent.

creamy-pencil-82913
11/29/2022, 7:35 PM
The node controller does not force delete pods until it is confirmed that they have stopped running in the cluster. You can see the pods that might be running on an unreachable node as being in the Terminating or Unknown state. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on the node to be deleted from the API server and frees up their names.
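In this case that would look something like the following (using the node and pod names from earlier in the thread - untested against your cluster, obviously):
$ kubectl delete node --context kube001 kube002
Or, to clear an individual stuck pod without removing the node object:
$ kubectl delete pod --context kube001 --namespace active-mq active-mq-6665f5d8b9-ztwnq --force --grace-period=0
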
sticky-summer-13450
11/30/2022, 9:00 AM