I m trying to add an etcd only dedicated etcd node to a clus Rancher Users #rke2

alert-candle-24446

05/30/2024, 3:17 PM

I’m trying to add an etcd-only (dedicated etcd) node to a cluster. It’s a small cluster of Ubuntu 20.04.6 servers. So far I have one node with roles “control-plane,etcd,master” and another with roles “control-plane,master”. I can run “kubectl get nodes” and see them, but the etcd-only node is NotReady and I see a bunch of errors when I run journalctl -xef. RKE2 is v1.28.10+rke2r1. One of the errors is “panic: bootstrap data already found and encrypted with different token”.

creamy-pencil-82913

05/30/2024, 5:05 PM

It sounds like there’s an issue with the token on the second server… I would probably uninstall RKE2 from that node, run kubectl delete node to remove it from the cluster, then reinstall and rejoin it with the correct token.
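For the "rejoin it with the correct token" step, here is a minimal sketch of the config an etcd-only server might use. It assumes the three `disable-*` options are how the node is restricted to the etcd role. The URL `https://cz1.example:9345` and the token value are placeholders, not values from this thread, and `render_etcd_only_config` is a hypothetical helper name.

```shell
# Render the content of /etc/rancher/rke2/config.yaml for a dedicated etcd node.
# $1 = registration URL of an existing server (supervisor port 9345)
# $2 = cluster token (on an existing server, normally found in
#      /var/lib/rancher/rke2/server/node-token)
render_etcd_only_config() {
  local server_url="$1" token="$2"
  cat <<EOF
server: ${server_url}
token: ${token}
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
EOF
}

# Placeholder inputs only; substitute the real server address and token.
render_etcd_only_config "https://cz1.example:9345" "REPLACE_WITH_CLUSTER_TOKEN"
```

The output would be written to the config file on the freshly reinstalled node before starting rke2-server. The token must match the one the existing servers were started with, otherwise the same "different token" panic is likely to come back.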

creamy-pencil-82913

05/30/2024, 5:05 PM

I’m not sure how it would even get into the cluster with the wrong token though.

creamy-pencil-82913

05/30/2024, 5:06 PM

just to be clear, you have an all-roles server, and a control-plane-only server, and the etcd-only node does not show up in the cluster at all? Or it does show up but is NotReady?

alert-candle-24446

05/31/2024, 11:42 PM

the etcd-only node was half-joined to the cluster. It was in state NotReady

alert-candle-24446

05/31/2024, 11:42 PM

chatgpt told me to do this:
kubectl config set-cluster tzz-yo --server=https://zt1.tzz.yo:6443 --certificate-authority=/etc/docker/certs.d/docker.tzz.yo:5000/ca.crt
kubectl config set-credentials msh --client-certificate=/etc/docker/certs.d/docker.tzz.yo:5000/ca.crt --client-key=/var/lib/rancher/rke2/server/tls/etcd/client.key
kubectl config set-context tzz-yo --cluster=tzz --user=msh
kubectl config use-context tzz-yo

alert-candle-24446

05/31/2024, 11:43 PM

but my user can’t read /etc/docker/certs.d/docker.tzz.yo:5000/ca.crt no matter what permissions I set

alert-candle-24446

05/31/2024, 11:50 PM

chatgpt confuses rke2 with generic upstream K8s, sadly.

alert-candle-24446

05/31/2024, 11:51 PM

makes references to /etc/kubernetes/

alert-candle-24446

05/31/2024, 11:56 PM

I broke the shit out of my ability to talk to my cluster with those AI-generated commands.

alert-candle-24446

05/31/2024, 11:56 PM

🙂

alert-candle-24446

05/31/2024, 11:59 PM

so I went cp /etc/rancher/rke2/rke2.conf /home/msh/.kube/config

alert-candle-24446

05/31/2024, 11:59 PM

and I can still talk to the cluster
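The copy step above can be sketched as a small helper that also backs up any existing kubeconfig, which would have made the earlier breakage easier to undo. `install_kubeconfig` is a hypothetical name, and the backup behavior is an addition, not something described in the thread.

```shell
# Copy an RKE2 admin kubeconfig into a user's kubeconfig location.
# $1 = source file (on an RKE2 server: /etc/rancher/rke2/rke2.conf, readable by root)
# $2 = destination (defaults to ~/.kube/config)
install_kubeconfig() {
  local src="$1" dest="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$dest")"
  # Keep a copy of any existing config instead of overwriting it silently.
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak"
  fi
  # install(1) copies the file and sets owner-only read/write permissions.
  install -m 600 "$src" "$dest"
}
```

Run as the target user, this needs read access to the source file. If it is run through sudo instead, the resulting file will be owned by root and would still need a chown to the user.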

alert-candle-24446

06/01/2024, 12:00 AM

NAME   STATUS   ROLES                       AGE     VERSION
cz1    Ready    control-plane,etcd,master   2d22h   v1.28.10+rke2r1
cz2    Ready    control-plane,master        47h     v1.28.10+rke2r1

alert-candle-24446

06/01/2024, 12:03 AM

the etcd-only host is gone

alert-candle-24446

06/01/2024, 12:07 AM

etcd node is always saying:
Jun 01 00:04:26 cp0 systemd[1]: rke2-server.service: Found left-over process 837 (containerd-shim) in control group while starting unit. Ignoring.
Jun 01 00:04:26 cp0 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jun 01 00:04:26 cp0 systemd[1]: rke2-server.service: Found left-over process 844 (containerd-shim) in control group while starting unit. Ignoring.
Jun 01 00:04:26 cp0 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=warning msg="not running in CIS mode"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Applying Pod Security Admission Configuration"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Starting rke2 v1.28.10+rke2r1 (b0d0d687d98f4fa015e7b30aaf2807b50edcc5d7)"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Managed etcd cluster bootstrap already complete and initialized"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Reconciling bootstrap data between datastore and disk"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Successfully reconciled with datastore"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg=start
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="schedule, now=2024-06-01T00:04:26Z, entry=1, next=2024-06-01T12:00:00Z"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Starting etcd for existing cluster member"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Defragmenting etcd database"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="etcd data store connection OK"
Jun 01 00:04:26 cp0 rke2[26308]: time="2024-06-01T00:04:26Z" level=info msg="Saving cluster bootstrap data to datastore"
Jun 01 00:04:26 cp0 rke2[26308]: panic: bootstrap data already found and encrypted with different token
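Most of that journal output is info-level noise around the one line that matters. A minimal sketch for pulling out just the panics and error/fatal entries, assuming the log format shown above; `filter_rke2_failures` is a hypothetical helper name.

```shell
# Read journal text on stdin and keep only the lines that usually explain a
# failed RKE2 start: Go panics and error/fatal-level log entries.
# "|| true" keeps the exit status at 0 when nothing matches.
filter_rke2_failures() {
  grep -E 'panic:|level=(error|fatal)' || true
}
```

Usage would look like `journalctl -u rke2-server --no-pager | filter_rke2_failures`, which on the log above should leave only the "different token" panic.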

alert-candle-24446

06/01/2024, 12:10 AM

I think I need to wipe out the etcd database entirely

alert-candle-24446

06/01/2024, 12:10 AM

just on the etcd-only node

creamy-pencil-82913

06/01/2024, 12:19 AM

yeah just stop rke2-server, rm -rf /var/lib/rancher/rke2/server/db, then start it again

creamy-pencil-82913

06/01/2024, 12:19 AM

on that node that won’t join
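The three steps above, sketched with a dry-run guard so the destructive `rm -rf` is only printed unless explicitly enabled. The `run` wrapper, the `DRY_RUN` variable, and `reset_local_etcd` are illustrative names, not part of RKE2.

```shell
# Path given in the thread: the local etcd data of the server that will not join.
DB_DIR="/var/lib/rancher/rke2/server/db"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Stop the service, remove the local etcd data, start the service again.
# ${DB_DIR:?} aborts if the variable is empty, so rm never receives a blank path.
reset_local_etcd() {
  run systemctl stop rke2-server
  run rm -rf "${DB_DIR:?}"
  run systemctl start rke2-server
}
```

Calling `reset_local_etcd` prints the three commands for review. Running it with `DRY_RUN=0` as root executes them, and should only be done on the node that fails to join, never on the healthy server holding the cluster's etcd data.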

alert-candle-24446

06/01/2024, 5:07 PM

do I need to run a cluster-init with v1.28.10+rke2r1?

alert-candle-24446

06/01/2024, 5:08 PM

I realized there was a Rancher HTTP GUI Docker image last week, and I forgot I had installed it until today. I could use it to import a cluster, but I don’t really have a cluster yet.

alert-candle-24446

06/01/2024, 5:10 PM

btw I completely wiped the etcd-only VM and reinstalled without any restrictions on what process/daemon could run.

alert-candle-24446

06/01/2024, 5:10 PM

so it’s a plain server.

alert-candle-24446

06/01/2024, 5:10 PM

I have only one controlplane VM at the moment

alert-candle-24446

06/01/2024, 5:14 PM

How do I add nodes to the default local-node cluster or should I build a new cluster using cli only and then see if the gui can import it as a generic k8s cluster?
