# rke2
Hi everyone. I have a cluster running locally on vSphere that blew up one day. We had to delete the cluster from Rancher, provision a new one, and then restore the old cluster's etcd backup onto it. I managed to get the cluster healthy again and was able to add nodes, but sometimes when rke2-server restarted on a node it would complain about
/var/lib/rancher/rke2/server/cred/passwd
being newer than the datastore. Each time, I deleted that file and restarted rke2-server to bring the affected control plane node back up. Now I have tried to scale up the control plane: the new node does nothing (it just sits there with rancher-system-agent waiting for secrets), and the bootstrap node went into an error state, complaining about bootstrap data being encrypted with a different token. Every place I can think of to check has what I expect to be the correct token (it matches what is on the single healthy control plane node). Does anyone have an idea of what I am missing?
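
In case it helps anyone reproduce the token comparison, here is a minimal Python sketch of that check. It is only an illustration: the function names `token_fingerprint` and `find_mismatches` and the labels in the example are hypothetical, and the token values are placeholders rather than anything read from a real node. It compares short SHA-256 fingerprints so the results can be pasted into a thread without exposing the token itself.

```python
import hashlib


def token_fingerprint(token: str) -> str:
    """Return a short SHA-256 fingerprint of a token, ignoring surrounding whitespace."""
    return hashlib.sha256(token.strip().encode("utf-8")).hexdigest()[:12]


def find_mismatches(reference: str, candidates: dict[str, str]) -> list[str]:
    """Return the sorted labels of candidates whose token differs from the reference."""
    expected = token_fingerprint(reference)
    return sorted(
        label
        for label, token in candidates.items()
        if token_fingerprint(token) != expected
    )


if __name__ == "__main__":
    # Placeholder values only (assumption): paste in the token copied from
    # the healthy control plane node and from each other place being checked.
    healthy = "example-token\n"
    others = {
        "new-node-config": "example-token",
        "rancher-cluster-secret": "some-other-token",
    }
    print("fingerprint of healthy token:", token_fingerprint(healthy))
    print("locations that differ:", find_mismatches(healthy, others))
```

Trailing newlines and spaces are stripped before hashing, so a token copied from a file and one copied from a UI field compare equal as long as the token text itself matches.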