# general
p
I've faced this issue when removing the last working node in a cluster. I've yet to find a better solution than deleting the cluster and restarting from scratch
@few-coat-36487
🙏 1
Does your cluster still have other nodes available while you're trying to re-register this particular one?
f
So, I don't have any other nodes, because I'm in the process of creating my cluster and I'm encountering quite a few problems, since I'm quite new to Rancher. So, for the moment, I only have one node. Could this be a reason for the problem? What's the minimum number of nodes?
p
(please write in english for everyone to understand)
Basically, one node is enough for testing. I started seriously fighting with Rancher two weeks ago, so I totally understand how you feel.
But yeah, if you have to reset your node for whatever reason, redo your cluster. I believe it has something to do with (as stated in the message) the etcd control plane already being initialized in the cluster, while your node has no other controller to learn from.
f
Alright, I'll do what you said, destroy and recreate the cluster hoping that it works.
p
When you add your first node to a brand new cluster, etcd will start from scratch. Also, rancher-system-agent-uninstall followed by rke2-uninstall is enough before re-registering your node
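A minimal sketch of the cleanup order described above, written as a dry run by default. The `.sh` script names and the assumption that both scripts are on `PATH` come from a default RKE2 and rancher-system-agent install and may differ on your node; `build_cleanup_plan` and `run_cleanup` are hypothetical helper names.

```python
import shutil
import subprocess

# Assumption: these are the script names on a default install; adjust if yours differ.
UNINSTALL_ORDER = ["rancher-system-agent-uninstall.sh", "rke2-uninstall.sh"]


def build_cleanup_plan(find=shutil.which):
    """Return (script_name, resolved_path_or_None) pairs in the order given above."""
    return [(name, find(name)) for name in UNINSTALL_ORDER]


def run_cleanup(plan, dry_run=True):
    """Process the scripts in order; with dry_run=True nothing is executed."""
    handled = []
    for name, path in plan:
        if path is None:
            print(f"skip {name}: not found on PATH")
            continue
        print(f"{'would run' if dry_run else 'running'} {path}")
        if not dry_run:
            # check=True stops at the first script that fails.
            subprocess.run([path], check=True)
        handled.append(name)
    return handled
```

Calling `run_cleanup(build_cleanup_plan())` only prints what would be run; pass `dry_run=False` (as root, on the node you intend to wipe) to actually execute the scripts.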
Also, please be aware that the minimum number of nodes for a "production-ready" cluster is 3, by design.
👍 1
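The reason behind the 3-node minimum can be sketched with the usual majority rule for etcd: a cluster of n members needs more than half of them to agree. The sketch below assumes every node is an etcd member; `quorum` and `fault_tolerance` are illustrative names, not part of any Rancher or etcd API.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} member(s): quorum={quorum(n)}, tolerated failures={fault_tolerance(n)}")
```

With one or two members, no failure can be tolerated, which is why a two-node cluster is no safer than a single node and three is the smallest size that survives losing one.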
f
Okay, I tried with two nodes (and a fresh cluster), and the first machine that registers locks up with an error. But when I delete it, the second node goes into error, as if there's an error queue. I'm going to check the logs.
p
Huh, before registering your second node, the first node should be available
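One way to check that the first node is available before registering the second is to look at its `Ready` condition. The sketch below parses the output of `kubectl get nodes -o json`; it assumes you can already run `kubectl` against the cluster, and `ready_nodes` is a hypothetical helper name.

```python
import json


def ready_nodes(nodes_json: str) -> list[str]:
    """Return the names of nodes whose 'Ready' condition has status 'True'."""
    data = json.loads(nodes_json)
    ready = []
    for item in data.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions):
            ready.append(item["metadata"]["name"])
    return ready
```

If the first node's name is missing from the returned list, it is probably worth waiting (or checking its logs) before registering another node.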
f
Okay, good thing to know.
p
Also, just to be sure, I recommend building a cluster on freshly installed nodes. I lost about 8 hours fighting (and losing) to make Calico work on old nodes, while everything went flawlessly on fresh installs.
By fresh install, I don't mean that uninstalling and reinstalling would break things, just that those nodes should be installed to become RKE2 nodes and nothing else.
f
Reset everything 😕 Fresh AlmaLinux... okay, I will do that on Friday. I'm taking my day off. Thanks for the help!
p
I won't be here on Friday, and I won't be back until Monday. Take care and good luck!
👍 1
f
One last question: what happens if there's a problem on the node and there is data on it? Kubernetes isn't really viable for storing data with MinIO or Longhorn if you have to reset the node every time there's a problem.
p
If your cluster is properly set up and has formed a quorum (requiring those 3 nodes I talked about earlier), Longhorn is a far superior solution. When a node is marked as failed after the quorum fails to communicate with it, Longhorn will (supposedly) terminate the pods, and they will be brought back up on another node.
Longhorn just manages all the data "magically". It has some quirks, but it is the sole reason I am migrating my whole production cluster to Rancher 2.
f
Okay, so last Friday I reinstalled the nodes. Nevertheless, it didn't work, and I'm stuck with it even on a fresh install. https://rancher-users.slack.com/archives/C01PHNP149L/p1715333103453829. I will retry a manual uninstall of rke2; last time I tried, it didn't seem to change anything.