# rke2
b
Hi, I have a custom rke2 cluster on Ubuntu servers. I have 5 worker nodes and 3 etcd/control plane nodes. I'm trying to join a fourth etcd/control plane server, but it stops in the provisioning state, waiting for node ref... Which logs should I look at to see what is happening? Thanks in advance.
h
check
journalctl -u rke2-server
on the new node you are trying to join. Also, you need an odd number of nodes for HA, so consider adding 2 more nodes to the existing 3. With 5 CP/etcd nodes you can reboot 2 at a time; with 4 (or 3) CP/etcd nodes you can only reboot 1 at a time.
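The reboot numbers above follow from etcd's quorum rule: a cluster of n members needs floor(n/2) + 1 of them up, so it tolerates losing the rest. A minimal sketch of that arithmetic (the function name `fault_tolerance` is only illustrative):

```shell
#!/bin/sh
# Quorum for n etcd members is floor(n/2) + 1; the cluster keeps working
# as long as no more than (n - quorum) members are down at once.
fault_tolerance() {
    n=$1
    quorum=$(( n / 2 + 1 ))
    echo $(( n - quorum ))
}

for n in 3 4 5; do
    echo "$n members: quorum $(( n / 2 + 1 )), tolerates $(fault_tolerance "$n") down"
done
```

A fourth member raises the quorum from 2 to 3 without raising the number of failures tolerated, which is why an even member count adds no extra safety.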
b
thank you very much for replying (I'm Spanish, so I'm sorry if my written English is bad)
journalctl for rke2-server shows "No entries". I think nothing related to rke2 is running, but rancher-system-agent is running OK. When I run journalctl -u rancher-system-agent it shows:
abr 24 14:30:42 node_name rancher-system-agent[18201]: E0424 14:30:42.917977 18201 memcache.go:206] couldn't get resource list for management.cattle.io/v3:
abr 24 14:30:42 node_name rancher-system-agent[18201]: time="2025-04-24T14:30:42+01:00" level=info msg="Starting /v1, Kind=Secret controller"
There is no firewall on the nodes.
I have 3 CP/etcd nodes and 5 workers. We used to have 4 CP/etcd nodes, but we had a problem and I had to re-add the fourth one, and now it isn't joining.
m
Can you provide the full log from
journalctl -u rancher-system-agent
? You can share it via a pastebin so it is easier to review. Thanks! https://paste.opensuse.org/
b
thank you very much
I can see now that it shows an error.
I haven't pasted everything because this message is repeated all the time.
rancher-system-agent is running.
The error I see now is: panic: error while connecting to Kubernetes cluster: the server has asked for the client to provide credentials
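When a journal is dominated by one repeated message, it can help to collapse duplicates before pasting. A rough sketch, assuming the two timestamp shapes seen in this thread (the `time="..."` field and the klog-style `E0424 14:30:42.917977 18201` prefix); the helper name `summarize_log` is made up for illustration:

```shell
#!/bin/sh
# Collapse repeated log messages into "count message" lines, most frequent first.
# Reads message text on stdin and strips per-line timestamps so that
# otherwise identical messages compare equal.
summarize_log() {
    sed -E \
        -e 's/time="[^"]*" *//' \
        -e 's/^[IWEF][0-9]{4} [0-9:.]+ +[0-9]+ //' \
    | sort | uniq -c | sort -rn
}

# Usage on the node ("-o cat" prints only the message text, without the journal prefix):
#   journalctl -u rancher-system-agent --no-pager -o cat | summarize_log | head -n 20
```

The counts make it easier to spot the one-off panic line among the repeated ones.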
m
By chance did you remove/re-use the node from the cluster that you are trying to add as a controlplane? Haven't seen this error before. Are there any logs from
journalctl -u rke2-server.service
?
b
Yes, it is a reused node (previously cleaned), although I have also set up a new server from scratch and the same thing happens there. The rke2 service never appears in either case.
m
When cleaned, did you remove/rm the /etc/rancher and /var/lib/rancher directories? The uninstall script does not do it and remnants from the previous install will persist.
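A quick way to check for such remnants before re-registering the node is sketched below. It only reports and deletes nothing; the two paths are the ones named above, and `find_leftovers` is a hypothetical helper name:

```shell
#!/bin/sh
# Print each given path that still exists; return 1 if any was found, 0 otherwise.
find_leftovers() {
    found=0
    for dir in "$@"; do
        if [ -e "$dir" ]; then
            echo "still present: $dir"
            found=1
        fi
    done
    return "$found"
}

# Directories mentioned in this thread; inspect them before removing anything.
if find_leftovers /etc/rancher /var/lib/rancher; then
    echo "no leftovers found"
fi
```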
b
When I go to the Rancher GUI - Cluster Management - my cluster - Related Resources tab, it shows a component in an error state, of type mgmt cluster.