clever-address-67932
07/20/2025, 2:49 PM
rancher-system-agent.service started and is running without any errors. However, RKE2 never gets installed. On the Rancher UI, the node shows as Waiting for Node Ref. The debug log on the Rancher server shows something related to the node:
[DEBUG] Searching for providerID for selector rke.cattle.io/machine=acef2fd1-52a8-4f86-a56e-168d6ab5896e in cluster fleet-default/proxmox, machine custom-390116fc9c53: {"Code":{"Code":"Forbidden","Status":403},"Message":"clusters.management.cattle.io \"c-m-bbm7s6lm\" is forbidden: User \"u-nxbq6gtuep\" cannot get resource \"clusters\" in API group \"management.cattle.io\" at the cluster scope","Cause":null,"FieldName":""} (get nodes)
I tried deleting the node, cleaning it, and rerunning the registration script. I repeated that sequence multiple times, and then it worked once (RKE2 installed and the node changed to a working state). It appears to be random.
Could anyone point me to how I can debug this further? I tried turning on DEBUG mode on rancher-system-agent as well, but nothing in its output seems to be related.
Thanks
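A rough sketch of where to look next, assuming the names from the 403 above (cluster c-m-bbm7s6lm, namespace fleet-default, which may differ in other setups) and kubectl pointed at the Rancher local cluster:
# on the node, follow the agent logs
journalctl -u rancher-system-agent -f
# on the Rancher local cluster, confirm the management cluster object and the CAPI machine exist
kubectl get clusters.management.cattle.io c-m-bbm7s6lm
kubectl get machines.cluster.x-k8s.io -n fleet-default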
sparse-vr-27407
07/21/2025, 7:28 AM
kubectl get clusters.
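One possible ambiguity here: a plain kubectl get clusters can resolve to either of two Rancher API groups, so it may help to check both explicitly:
kubectl get clusters.provisioning.cattle.io -n fleet-default
kubectl get clusters.management.cattle.io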
My first bet is that maybe, between the cleanups, something was left over in Rancher (old credentials, certificates belonging to the cleaned cluster, etc.) and it tries to use that on the new instance.
I would try to wipe the node (drop the VM disks, save the VM, re-open settings and add a new disk, start the install). Dropping and re-adding the disk ensures it gets a new UUID and a clean MBR/EFI, so there is zero chance of matching anything from before.
I usually create a snapshot of such VMs before the first boot to avoid this procedure and have a guaranteed clean rollback state.
After the node wipe and reinstall, I would go through Rancher with a microscope, scanning for any leftovers, and after the cleanup I'd retry the RKE2 cluster creation.
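A hedged sketch of what "scanning for leftovers" could look like on the Rancher local cluster (proxmox, fleet-default, and c-m-bbm7s6lm are the names from this thread and may differ elsewhere):
# plan/bootstrap secrets created for the provisioned cluster
kubectl get secrets -n fleet-default | grep proxmox
# Rancher's per-cluster node records
kubectl get nodes.management.cattle.io -n c-m-bbm7s6lm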
clever-address-67932
07/21/2025, 1:10 PM
u-nxbq6gtuep in the Rancher local K8s (I guess)?
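For context, the u- ID from the 403 appears to be a Rancher user object on the local (management) cluster, so one hedged way to inspect it and its bindings for the downstream cluster might be:
kubectl get users.management.cattle.io u-nxbq6gtuep
kubectl get clusterroletemplatebindings.management.cattle.io -n c-m-bbm7s6lm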
clever-address-67932
07/22/2025, 4:35 AM