# rke
w
@fast-piano-59234 any thoughts? You helped point me in the right direction on this issue, and this is where I am currently stuck.
f
Sorry was on PTO, this is an RKE1 cluster in Rancher right?
If you believe the cluster is in a weird state, I would start there and run the etcd troubleshooting steps described here: https://ranchermanager.docs.rancher.com/troubleshooting/kubernetes-components/troubleshooting-etcd-nodes and possibly the other k8s-related steps on https://ranchermanager.docs.rancher.com/troubleshooting/kubernetes-components/. Once the cluster is healthy, we can diagnose whether it still fails to add a new node.
w
@fast-piano-59234 our etcd looks normal. I tried some of those commands for the health check and alarms, and we did not get any errors from etcd on our 2 existing control nodes (etcd). The cluster says it is waiting for control-1 to finish provisioning, but that control node was deleted a few weeks ago. It also still sees control-1 as our kube-controller-manager leader, if that makes any difference...
stuck in an updating state
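One way to double-check the stale leader observation is to read the leader election record and compare its holder against the nodes that still exist. The sketch below is a minimal illustration only. It assumes the cluster stores the election in a Lease object (fetched with something like `kubectl -n kube-system get lease kube-controller-manager -o json`; older Kubernetes versions kept this in an annotation on an Endpoints object instead) and that `holderIdentity` has the common `<hostname>_<uuid>` shape. The function name `stale_leader` and the sample values other than control-1 are hypothetical.

```python
import json
from typing import Optional, Set


def stale_leader(lease_json: str, live_nodes: Set[str]) -> Optional[str]:
    """Return the leader's hostname if it is not one of live_nodes, else None."""
    lease = json.loads(lease_json)
    holder = lease.get("spec", {}).get("holderIdentity", "")
    # Assumption: holderIdentity looks like "<hostname>_<uuid>";
    # strip the trailing "_<uuid>" part if it is present.
    hostname = holder.rsplit("_", 1)[0]
    if hostname and hostname not in live_nodes:
        return hostname
    return None


# Hypothetical sample standing in for the real Lease JSON.
sample = json.dumps({"spec": {"holderIdentity": "control-1_3f2a"}})
print(stale_leader(sample, {"control-2", "control-3"}))
```

If the function returns a hostname, the recorded leader is a node that is no longer part of the cluster, which is worth including when reporting the problem.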
f
This seems to be an issue with Rancher not having the correct state of the cluster, causing it to wait on something that will never finish. We might need to intervene manually to fix this, but I can't advise without more details: Rancher version, Rancher install info (HA etc.), the k8s version of the cluster Rancher runs on, debug/trace logs from the Rancher pods, and, depending on what those show, a custom resource dump from the cluster to check its recorded state against what is really there. Please also include the nodes that actually exist plus the kubectl get nodes output, so I can match the current state against what Rancher thinks.
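The comparison described above (real nodes vs. what Rancher thinks) can be sketched in a few lines. This is only an illustration under stated assumptions: it takes the output of `kubectl get nodes -o json` as a string and a list of node names copied by hand from Rancher. The function name `diff_nodes` and all sample node names except control-1 are hypothetical.

```python
import json
from typing import Dict, List


def diff_nodes(kubectl_nodes_json: str, rancher_node_names: List[str]) -> Dict[str, List[str]]:
    """Report node names that appear on only one side of the comparison."""
    items = json.loads(kubectl_nodes_json).get("items", [])
    actual = {item["metadata"]["name"] for item in items}
    expected = set(rancher_node_names)
    return {
        "only_in_rancher": sorted(expected - actual),
        "only_in_cluster": sorted(actual - expected),
    }


# Hypothetical stand-in for `kubectl get nodes -o json` output.
kubectl_output = json.dumps({
    "items": [
        {"metadata": {"name": "control-2"}},
        {"metadata": {"name": "control-3"}},
        {"metadata": {"name": "worker-1"}},
    ]
})

# Hypothetical list of nodes as Rancher shows them.
rancher_view = ["control-1", "control-2", "control-3", "worker-1"]

print(diff_nodes(kubectl_output, rancher_view))
```

A name under `only_in_rancher` is a node Rancher still tracks even though the cluster no longer has it, which matches the situation with the deleted control-1 node.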
w
I am going to send this info directly to your DMs, if that is okay with you.