# rke2
m
Gonna play around with the node roles and get more information: one etcd and one control plane, the rest workers. Maybe I can "repair" workers.
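Rough sketch of what I mean by splitting the server roles, assuming the disable-* options in /etc/rancher/rke2/config.yaml behave the way the RKE2/K3s docs describe (hostnames and the token are placeholders):

```sh
# Dedicated etcd-only server node: keep etcd, turn off the control-plane components.
sudo mkdir -p /etc/rancher/rke2
cat <<'EOF' | sudo tee /etc/rancher/rke2/config.yaml
token: my-shared-token                # placeholder
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
EOF

# Dedicated control-plane-only server node: no local etcd, point at the etcd node.
cat <<'EOF' | sudo tee /etc/rancher/rke2/config.yaml
server: https://etcd-1.example.internal:9345   # placeholder etcd node
token: my-shared-token                          # placeholder
disable-etcd: true
EOF

sudo systemctl enable --now rke2-server.service
```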
Will try this Poor Shop's Resilience Mix: one control node isolated, the rest etcd + worker. I should be able to cycle out worker nodes, and I'll try blowing away the control node and reinstalling... https://discuss.kubernetes.io/t/why-should-i-separate-control-plane-instances-from-etcd-instances/17706/2
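Sketch of how I'd cycle out a worker, assuming the usual kubectl drain/delete flow plus the rke2-killall.sh / rke2-uninstall.sh scripts the installer drops (node name is a placeholder):

```sh
# From a machine with kubeconfig access: evict pods and remove the Node object.
kubectl drain worker-3 --ignore-daemonsets --delete-emptydir-data
kubectl delete node worker-3

# On worker-3 itself: stop everything and wipe RKE2 before re-imaging / re-joining.
sudo /usr/local/bin/rke2-killall.sh
sudo /usr/local/bin/rke2-uninstall.sh
```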
c
etcd can’t split brain, it won’t even run without quorum. The only way you’d get something like that is if you did a cluster-reset or a snapshot restore and basically intentionally split the cluster in two, with two separate etcd clusters.
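For anyone following along: quorum is floor(n/2)+1, so a 3-member etcd keeps serving with one member down and halts with two down. A rough sketch of the cluster-reset / snapshot-restore path being described, assuming the flags match the RKE2 backup/restore docs (the snapshot name is a placeholder):

```sh
# Run on one surviving server node, with the service stopped.
sudo systemctl stop rke2-server

# Reset this node to a fresh single-member etcd cluster...
sudo rke2 server --cluster-reset

# ...or restore from an on-disk snapshot (placeholder file name):
sudo rke2 server --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/rke2/server/db/snapshots/etcd-snapshot-example

sudo systemctl start rke2-server
# Any other server nodes must be wiped and re-joined afterwards, otherwise you
# end up with exactly the "two separate etcd clusters" situation described above.
```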
m
hi @creamy-pencil-82913 thanks for your reply. I only did what I documented here. Maybe there was some residual stuff on the "repaired" node that in effect resulted in a split brain; I ran the node registration script again on the repaired node to re-join after I had (inadequately) cleaned it out.
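Sketch of what I should have checked before re-running the registration script, assuming the default RKE2 data paths (the node name below is a placeholder):

```sh
# On the node being "repaired": leftover etcd state under server/db is the kind
# of residue that can make the node come back as its own one-member etcd cluster.
ls -la /var/lib/rancher/rke2/server/db 2>/dev/null
ls -la /etc/rancher/rke2 2>/dev/null
ls -la /var/lib/rancher/rke2/agent 2>/dev/null

# If any of those still exist, wipe them (or re-run rke2-uninstall.sh) first,
# and from a healthy node drop the stale Node object before re-joining:
kubectl delete node repaired-node
```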
Changed my mind, going to isolate worker nodes. No etcd on those nodes for sure.
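Sketch of joining those pure worker nodes as agents so etcd can never land on them, assuming the standard get.rke2.io install flow (server URL and token are placeholders; the token comes from /var/lib/rancher/rke2/server/node-token on a server node):

```sh
# Install the agent (worker) variant of RKE2.
curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE="agent" sh -

sudo mkdir -p /etc/rancher/rke2
cat <<'EOF' | sudo tee /etc/rancher/rke2/config.yaml
server: https://server-1.example.internal:9345   # placeholder control-plane node
token: my-shared-token                            # placeholder
EOF

sudo systemctl enable --now rke2-agent.service
```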