# rke2
j
Hello, I am trying to create a new rke2 cluster from Rancher, but I can't get the worker nodes to work.
Currently using:
- Harvester 1.4.2
- Rancher 2.11.1
- rke2 v1.32.3+rke2r1 with Cilium
If I create a cluster with 3 nodes per role, the control plane and etcd nodes switch to Running, but the workers remain in the state "Waiting for Node Ref". The cluster shows the message "waiting for control plane to be available".
I have already checked the logs from Rancher, CAPI, rke2-server on the etcd and control plane nodes, rancher-system-agent, ...
I'm not sure what to do next, so I'm looking for input/help. Let me know if you need any other information! Thanks
m
Check the rke2-server logs on the hosts.
journalctl -u rke2-server
j
On the etcd nodes, nothing except:
Apr 28 18:01:52 test-cluster-etcd-fhmbf-j8lrp rke2[1884]: time="2025-04-28T18:01:52Z" level=info msg="Reconciling ETCDSnapshotFile resources"
Apr 28 18:01:52 test-cluster-etcd-fhmbf-j8lrp rke2[1884]: time="2025-04-28T18:01:52Z" level=info msg="Reconciliation of ETCDSnapshotFile resources complete"
On the control plane nodes, there is nothing except the initial setup logs, with no errors. On the worker nodes, there is no rke2-server or rke2-agent service, and the rancher-system-agent service is stuck at:
Apr 27 20:06:50 test-cluster-worker-qq8lr-hp9r9 systemd[1]: Started rancher-system-agent.service - Rancher System Agent.
Apr 27 20:06:50 test-cluster-worker-qq8lr-hp9r9 rancher-system-agent[1665]: time="2025-04-27T20:06:50Z" level=info msg="Rancher System Agent version v0.3.12 (e4876a6) is starting"
Apr 27 20:06:50 test-cluster-worker-qq8lr-hp9r9 rancher-system-agent[1665]: time="2025-04-27T20:06:50Z" level=info msg="Using directory /var/lib/rancher/agent/work for work"
Apr 27 20:06:50 test-cluster-worker-qq8lr-hp9r9 rancher-system-agent[1665]: time="2025-04-27T20:06:50Z" level=info msg="Starting remote watch of plans"
Apr 27 20:06:50 test-cluster-worker-qq8lr-hp9r9 rancher-system-agent[1665]: time="2025-04-27T20:06:50Z" level=info msg="Starting /v1, Kind=Secret controller"
m
Can the pods/nodes resolve your Rancher endpoint via DNS?
j
Yes, they can resolve the Rancher endpoint, and ports 80/443 are reachable. Otherwise the etcd and control plane nodes probably wouldn't show as Running.
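For anyone retracing this check, here is a minimal sketch of how name resolution and port reachability could be verified from a node. It is not taken from the thread: the helper names (`resolves`, `port_open`, `check_endpoint`) and the hostname `rancher.example.com` are placeholders, and it assumes `bash`, `getent`, and `timeout` are available on the node. It only tests from the node itself, not from inside pods.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of rke2 or Rancher tooling.

# Succeeds if the name resolves through the system resolver (NSS).
resolves() {
  getent hosts "$1" >/dev/null
}

# Succeeds if a TCP connection to host ($1) on port ($2) opens within 5 seconds.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are required.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Prints one line per check; returns non-zero if any check fails.
check_endpoint() {
  host=$1; shift
  if resolves "$host"; then
    echo "dns ok: $host"
  else
    echo "dns FAILED: $host"
    return 1
  fi
  rc=0
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "port $port open"
    else
      echo "port $port unreachable"
      rc=1
    fi
  done
  return $rc
}

# Usage (placeholder hostname, replace with the real Rancher endpoint):
#   check_endpoint rancher.example.com 80 443
```

Running the same check on a worker node and on a control plane node makes it easier to see whether the workers differ from the nodes that did reach Running.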
Created an issue related to this: https://github.com/rancher/rancher/issues/50065