average-autumn-93923

07/05/2022, 5:25 PM
Here’s a funny one… I have two master nodes that disagree about which one is ready.
root@dp6448:~# k3s kubectl get nodes
NAME     STATUS     ROLES                       AGE   VERSION
dp6448   Ready      control-plane,etcd,master   69d   v1.23.4+k3s1
dp6449   NotReady   control-plane,etcd,master   69d   v1.23.4+k3s1

root@dp6449:~# k3s kubectl get nodes
NAME     STATUS     ROLES                       AGE   VERSION
dp6448   NotReady   control-plane,etcd,master   69d   v1.23.4+k3s1
dp6449   Ready      control-plane,etcd,master   69d   v1.23.4+k3s1
If I ask them to kubectl describe node each other, I get:
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Tue, 05 Jul 2022 10:32:22 +0000   Tue, 05 Jul 2022 10:33:42 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Tue, 05 Jul 2022 10:32:22 +0000   Tue, 05 Jul 2022 10:33:42 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Tue, 05 Jul 2022 10:32:22 +0000   Tue, 05 Jul 2022 10:33:42 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Tue, 05 Jul 2022 10:32:22 +0000   Tue, 05 Jul 2022 10:33:42 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
Everything other than that seems fine… etcd is healthy (one of the nodes is the leader and the other is not); each node is just reporting the other one as not ready. Confused as to how this is possible with healthy etcd.
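(For reference, the etcd health check was roughly the following; it's a sketch that assumes the default k3s data dir and an etcdctl binary installed separately, since k3s doesn't bundle one.)

# Quick check through the apiserver:
k3s kubectl get --raw /healthz/etcd

# Direct check against the embedded etcd, using the default k3s client certs:
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key \
  endpoint status --cluster -w table   # lists each member and which one is leader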

creamy-pencil-82913

07/05/2022, 5:27 PM
You really shouldn’t have a two-node etcd cluster; you should always have an odd number of etcd nodes. I suspect that your cluster has become corrupted somehow.
🎯 1

average-autumn-93923

07/05/2022, 8:39 PM
Yeahhh, there are supposed to be three there, but one broke. We’ve also seen this behavior in 3+ node clusters; I’ll check back here next time it happens.
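When it does, the plan is roughly to capture the node heartbeat Leases and the k3s logs on both servers; a sketch of what I mean (the kubelet heartbeats by renewing a Lease in kube-node-lease, so a stale renewTime should show which side actually stopped posting):

# Compare heartbeat times as seen from each apiserver:
k3s kubectl get lease -n kube-node-lease \
  -o custom-columns=NODE:.metadata.name,HOLDER:.spec.holderIdentity,RENEWED:.spec.renewTime

# The kubelet runs inside the k3s server process, so its messages land in the k3s unit:
journalctl -u k3s --since "30 min ago" | grep -iE 'lease|node status'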