
steep-manchester-31195

12/22/2022, 9:59 AM
Hi everyone! Quick question: I accidentally started the `rke2-server` service on an `rke2-agent` node. It converted the node into a control-plane node, which it shouldn't be. I removed the node (`kubectl delete node`), rebooted it, and started `rke2-agent` again, which re-added it. Almost everything is fine now, but control-plane pods are still being started on the node. I tried deleting the pods, but they always come back:
cloud-controller-manager-agent-001        0/1     Error       2              9m17s
etcd-agent-001                            0/1     Completed   14             5m10s
kube-apiserver-agent-001                  0/1     Error       19             5m10s
kube-controller-manager-agent-001         1/1     Running     2 (34m ago)    3m7s
kube-scheduler-agent-001                  1/1     Running     2 (34m ago)    3m6s
I thought etcd might be the problem, so I tried to annotate with `kubectl annotate node agent-001 etcd.k3s.cattle.io/remove=true`, but this did not work. How do I proceed?

sparse-fireman-14239

12/22/2022, 2:30 PM
$ sudo /usr/local/bin/rke2-uninstall.sh
$ kubectl delete node <node>
then run the install again as agent
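Spelled out as a sketch, for anyone following along: the install URL and the `INSTALL_RKE2_TYPE` variable are the standard RKE2 install-script convention, not something stated in this thread, and `<node>` stays a placeholder for your node name. Note that the uninstall script wipes `/etc/rancher/rke2/`, so the agent `config.yaml` (server URL and join token) has to be recreated before starting the service.

```shell
# On the node: full uninstall (removes /var/lib/rancher/rke2 and /etc/rancher/rke2)
sudo /usr/local/bin/rke2-uninstall.sh

# From a working control-plane node: remove the stale node object
kubectl delete node <node>

# Back on the node: reinstall as an agent (INSTALL_RKE2_TYPE selects the agent unit)
curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE="agent" sh -

# Recreate /etc/rancher/rke2/config.yaml with server: and token:, then:
sudo systemctl enable --now rke2-agent
```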

steep-manchester-31195

12/22/2022, 2:31 PM
Hi @sparse-fireman-14239, thank you for this! I was hoping there was a way without uninstall.sh 😅, since this agent runs active workloads.
But if this is the only way...

sparse-fireman-14239

12/22/2022, 2:31 PM
There might be other methods, dunno.

steep-manchester-31195

12/22/2022, 2:32 PM
I'm just wondering what keeps scheduling the pods again after I delete them
The pods are controlled by the node object
But there's nothing in there that gives a hint
Could it be the rke2-config-hash annotation?
`rke2.io/node-config-hash: redacted====`

cold-egg-49710

12/22/2022, 2:46 PM
Have you tried running rke2-killall.sh and then starting rke2-agent (`systemctl start rke2-agent`)?

steep-manchester-31195

12/22/2022, 2:46 PM
No I haven't, but I did reboot. Let me try.
Unfortunately, that did not work.

hallowed-breakfast-56871

12/22/2022, 6:49 PM
I have been in the same boat due to Terraform typos. You will need to uninstall, then reinstall for the node to come back into the cluster as a worker. It doesn't take long.

creamy-pencil-82913

12/22/2022, 7:30 PM
delete the control-plane component manifests from /var/lib/rancher/rke2/agent/pod-manifests
👍 1
and then delete the node from the cluster and rejoin it to get the labels and annotations cleared
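Putting those two steps together as a sketch: the manifest filenames below are RKE2's standard static pod names, an assumption not spelled out in the thread, and `agent-001` is the node name taken from the pod list earlier. Deleting the manifests is what actually stops the pods, because the kubelet recreates mirror pods from these files no matter how often the API objects are deleted.

```shell
# On the affected node: stop services so the kubelet stops recreating the pods
sudo systemctl stop rke2-agent
sudo /usr/local/bin/rke2-killall.sh

# Remove only the control-plane manifests; leave kube-proxy.yaml in place,
# since agents run kube-proxy as a static pod too
cd /var/lib/rancher/rke2/agent/pod-manifests
sudo rm -f etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml \
           kube-scheduler.yaml cloud-controller-manager.yaml

# From a working control-plane node: delete the node object so its
# labels and annotations are cleared on rejoin
kubectl delete node agent-001

# Back on the affected node: rejoin as a plain agent
sudo systemctl start rke2-agent
```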

steep-manchester-31195

01/24/2023, 3:20 PM
Thanks Brandon, should this happen again I'll try this approach!
@creamy-pencil-82913 FYI, I needed this again, and tried your approach. Worked perfectly. Thanks!