
sticky-summer-13450

11/29/2022, 3:08 PM
I'm occasionally finding pods stuck in the Terminating state and I want to know whether it's k3s, k8s, or me. Example: I have a cluster with 1 server node and several worker nodes, and I have workloads spread across the workers. Let's say a worker node dies - maybe it's never going to return.
$ kubectl get pods --context kube001 --all-namespaces -o=wide |grep Terminating 
kube-system        traefik-9c6dc6686-jdt9f                            1/1     Terminating   0              24d    10.42.1.4     kube002   <none>           <none>
active-mq          active-mq-6665f5d8b9-ztwnq                         1/1     Terminating   0              15d    10.42.1.82    kube002   <none>           <none>
Some of the pods get stuck in the Terminating state and don't get replaced on other worker nodes. This means the cluster is no longer respecting the declared state. Is this a problem specific to me, a problem specific to k3s, a problem with k8s, or something else?
$ kubectl describe pod --context kube001 --namespace active-mq active-mq-6665f5d8b9-ztwnq
Name:                      active-mq-6665f5d8b9-ztwnq
Namespace:                 active-mq
Priority:                  0
Service Account:           default
Node:                      kube002/10.64.8.117
Start Time:                Sun, 13 Nov 2022 15:07:50 +0000
Labels:                    app=active-mq
                           pod-template-hash=6665f5d8b9
Annotations:               kubernetic.com/restartedAt: 2021-04-30T16:34:24+01:00
Status:                    Terminating (lasts 2d5h)
Termination Grace Period:  30s
...
In this example the pod has been in that state for more than 2 days.
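(For context, a quick way to see how the control plane views the dead worker - assuming the node name kube002 from the output above - is to check the node objects directly; this is an illustrative sketch, not from the original thread:

$ kubectl get nodes --context kube001
$ kubectl describe node kube002 --context kube001

A node whose kubelet has stopped posting heartbeats shows as NotReady and its conditions flip to Unknown, and pods scheduled there can then linger in Terminating exactly as shown above.)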

creamy-pencil-82913

11/29/2022, 6:40 PM
if it’s never coming back you should delete it from the cluster
it can’t show as “terminated” because Kubernetes doesn’t know anything about that node. It’s not reporting in, so showing that it’s terminated would be incorrect. It could still be running but just unable to reach the server.
Kubernetes can’t reason about anything happening on a node that’s not checking in.
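(One way to clear an individual stuck pod without touching the node - an illustrative alternative using the pod name from the output above, not something suggested in the thread itself:

$ kubectl delete pod active-mq-6665f5d8b9-ztwnq --namespace active-mq --context kube001 --grace-period=0 --force

As noted above, this only removes the API object; if the node is in fact still alive, the container could keep running there until the kubelet reconnects.)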

sticky-summer-13450

11/29/2022, 6:44 PM
I hope the node will come back, but currently Harvester/Longhorn has eaten the root disk. So maybe I should remove the node and it'll join again.
> it can’t show as “terminated” because Kubernetes doesn’t know anything about that node.
So - why have some pods terminated correctly while others are stuck showing Terminating? It seems inconsistent.

creamy-pencil-82913

11/29/2022, 7:35 PM
This is covered in the Kubernetes documentation. https://kubernetes.io/docs/concepts/architecture/nodes/#node-status
> The node controller does not force delete pods until it is confirmed that they have stopped running in the cluster. You can see the pods that might be running on an unreachable node as being in the Terminating or Unknown state. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on the node to be deleted from the API server and frees up their names.
Kubernetes is what is known as “eventually consistent”. If some part of the cluster is unavailable and can’t be reasoned about, that information can’t be updated until it either comes back, or is manually deleted.
👍 1
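(Deleting the node object by hand, as the docs quote describes, would look something like this - assuming the node name kube002 from the earlier output:

$ kubectl delete node kube002 --context kube001

Once the Node object is gone, the stale Pod objects bound to it are deleted from the API server and the owning Deployments/ReplicaSets can schedule replacements on the remaining workers.)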

sticky-summer-13450

11/30/2022, 9:00 AM
I deleted the node and the VMs got themselves out of the Terminating state. Thank you. I really need to get the "cattle not pets" mantra locked even harder into my brain. The node is a VM that's playing up in Harvester, so I didn't think that deleting the node from Kubernetes was the right thing to do. Thanks for correcting me.
👍 1
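(If the underlying VM is repaired later, a k3s worker will normally re-register itself when its agent starts up again - a sketch assuming the standard install where the agent runs as the k3s-agent systemd unit:

$ sudo systemctl restart k3s-agent   # run on the worker once its disk is healthy

The node then reappears in kubectl get nodes as a fresh Node object.)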