# rke2
Hi, I am using Rancher cluster provisioning to create and update my downstream cluster. After a change to the machine cloud config (VMware), specifically the cloud-init part, the newly created VMs don't install rke2. The Rancher UI shows "Waiting for probes: calico, kubelet" for the new node, but the node never finishes setting up rke2. The Rancher system agent is running on the node as a systemd service and shows no obvious error. Can you point me to where I can look on the node to find out what's failing during setup? I'm using Rancher 2.9.3, and my VMs are Ubuntu 24.04 templates on VMware.
I found the following error in the rke2-agent journal:
level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
I had the same error on another node and after I rebooted the node, the rke2 installation succeeded. But I want to understand what changed by rebooting. It shouldn't be a connection problem, since it was working before the update.
For anyone interested: my change to the cloud-init config was setting sysctl parameters via a conf.d file and reloading sysctl with
sysctl --system
in the runcmd stage of cloud-init. Somehow this broke the rke2 setup/systemd service. When I instead load only the changed file with
sysctl -p <filepath>
in runcmd, it works again and the rke2 setup completes successfully.
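For reference, the working approach could look roughly like the sketch below. It is only an illustration under assumptions: the helper name `apply_sysctl_file`, the `SYSCTL` override variable, and any file path or parameter passed to it are hypothetical and not taken from my actual config. It writes one setting to a file and then loads just that file.

```shell
#!/bin/sh
# Sketch only: the helper name, the SYSCTL override, and whatever path and
# setting you pass in are hypothetical placeholders.
# SYSCTL can be overridden (e.g. SYSCTL=echo) for a dry run without root.
SYSCTL="${SYSCTL:-sysctl}"

# apply_sysctl_file FILE SETTING
# Writes SETTING to FILE, then loads only that file.
apply_sysctl_file() {
    conf_file="$1"
    setting="$2"

    # Overwrite the target file with the single setting line.
    printf '%s\n' "$setting" > "$conf_file" || return 1

    # "-p FILE" loads just this file; "--system" would instead re-apply
    # every sysctl configuration file on the machine.
    "$SYSCTL" -p "$conf_file"
}
```

As root, this could be called from a script that runcmd executes, for example `apply_sysctl_file /etc/sysctl.d/90-custom.conf 'vm.max_map_count = 262144'` (both the file name and the parameter are made-up examples).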