05/15/2023, 1:39 PM
Hi 👋 I ran into a real-world example of why we run multiple instances of the control plane and etcd:
• I run Rancher 2.5.15 with 2 controlplane nodes and 3 etcd nodes (and a dozen big worker nodes)
• One node running both controlplane and etcd became unstable, so I planned its replacement
• While I was setting up the replacement node, the node holding the third etcd copy died (disk crash)

So right now, the cluster is in the following state:
• For some reason, Rancher is trying to rotate certificates
• The remaining node with controlplane and etcd is trying to get elected leader, but it can't reach the unstable node (kube containers rebooting in a loop) to validate the election

Is there a way to stop the in-progress certificate rotation? Should I remove the unstable node from the cluster (so the remaining controlplane and etcd node can get elected, since there won't be any other candidates)? Thank you
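
For context, the quorum arithmetic behind this situation can be sketched roughly as below. This is only an illustration, assuming etcd's usual Raft majority rule (quorum = floor(n / 2) + 1). The function names `quorum` and `can_elect_leader` are hypothetical helpers, and nothing here talks to a real cluster.

```python
def quorum(members: int) -> int:
    """Majority needed among registered etcd members (Raft majority rule)."""
    return members // 2 + 1


def can_elect_leader(members: int, reachable: int) -> bool:
    """True if enough registered members are reachable to form a majority."""
    return reachable >= quorum(members)


# 3 registered members, only 1 healthy (one disk crashed, one unstable):
print(can_elect_leader(3, 1))  # False: 2 of 3 are required

# 3 registered members, 2 healthy (only one node lost):
print(can_elect_leader(3, 2))  # True

# Member list reduced to the single healthy node:
print(can_elect_leader(1, 1))  # True: a 1-member cluster needs only itself
```

Keep in mind that changing the member list normally requires quorum as well, so a cluster that has already lost its majority may have to be recovered from a snapshot rather than by a plain member removal.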