
rapid-church-43569

05/26/2023, 7:00 AM
Hi everybody, we have some problems with an RKE2 cluster that Rancher provisioned on custom nodes. The setup itself works. We then ran some HA tests, and when the Rancher server was rebooted, the user cluster broke: etcd on the user cluster lost quorum and 2 of 3 master nodes crashed. Can someone explain why this happens? In my understanding, a user cluster should survive a reboot of Rancher. Any thoughts?

creamy-pencil-82913

05/26/2023, 2:36 PM
You'd need to check the logs on the cluster to see why it lost quorum. The downstream cluster isn't dependent on Rancher.
Are you running on shared hardware? Etcd is very sensitive to disk latency.
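
For readers following the thread, here is a minimal sketch of the majority arithmetic behind "lost quorum". It assumes the usual Raft-style rule that etcd needs more than half of its voting members healthy. The function names `quorum` and `has_quorum` are illustrative only and are not part of any etcd, RKE2, or Rancher API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum(members)


if __name__ == "__main__":
    members = 3  # three etcd members, as in the cluster described above
    print("quorum:", quorum(members))                       # quorum: 2
    print("tolerated failures:", members - quorum(members))  # tolerated failures: 1
    # Two of three members down leaves one healthy member, below quorum.
    print("quorum with 1 healthy:", has_quorum(members, healthy=1))
```

With three members, only one failure can be tolerated, so two crashed masters leave the surviving member unable to serve writes regardless of the state of the Rancher server.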