Hi 👋
I ran into a real-world example of why we run multiple instances of the control plane and etcd:
• I run Rancher 2.5.15 with 2 controlplane and 3 etcd nodes (and a dozen big worker nodes)
• One node with controlplane and etcd became unstable, and I planned its replacement
• While I was setting up the replacement node, the node holding the third etcd copy just died (disk crash)
So right now, the cluster is in the following state:
• For some reason, Rancher is trying to rotate certificates
• The remaining node with controlplane and etcd is trying to get elected leader, but it can't reach the unstable node (whose kube containers are rebooting in a loop) to validate the election
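As far as I understand, etcd needs a majority of its members to elect a leader. Here is a minimal sketch of that arithmetic for my situation. The function names are only illustrative (not an etcd API), and counting the unstable node as unreachable is my assumption:

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def has_quorum(members: int, reachable: int) -> bool:
    """True if enough members are reachable to elect a leader."""
    return reachable >= quorum(members)


members = 3    # etcd members configured in the cluster
reachable = 1  # assumption: one node has a dead disk, one is rebooting in a loop

print(quorum(members))                 # 2
print(has_quorum(members, reachable))  # False
```

So with 3 configured members and only 1 reachable, the election cannot succeed until a second member answers or the member list changes.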
Is there a way to stop the in-progress cert rotation?
Should I remove the unstable node from the cluster (so that the remaining controlplane and etcd node can get elected, since there won't be any other candidates)? Thank you