This message was deleted Rancher Users #rke2


# rke2

adamant-kite-43734

02/25/2025, 11:17 AM

This message was deleted.

hundreds-evening-84071

02/25/2025, 12:35 PM

How many etcd/control plane nodes do you have? What version of RKE2? Have you looked at this doc? https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf

hundreds-evening-84071

02/25/2025, 12:36 PM

There is a section on etcd: https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf#etcd

refined-gigabyte-18875

02/25/2025, 12:40 PM

Hihi, I have 3 control plane nodes (embedded etcd) running v1.23.5, and yes, I have seen this doc. I have ensured the binaries and images are there, but some of the commands don't work for me. I can't use `kubectl` to check some things because the master nodes are all dead.

refined-gigabyte-18875

02/25/2025, 12:41 PM

I can't verify whether any pods are running on the master nodes through ctr/crictl, but based on the "context deadline exceeded" error, I can only assume etcd is failing somehow. It's just weird and random that this is suddenly happening only to the masters.

hundreds-evening-84071

02/25/2025, 12:44 PM

Then you will have to use `crictl` and `etcdctl`: https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf#on-the-etcd-host-itself. Did you see `alarm:NOSPACE` with etcd in `journalctl -u rke2-server`?
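A minimal sketch of how the journal could be checked for the two symptoms mentioned in this thread. The function name `scan_etcd_symptoms` is hypothetical (it is not part of RKE2 or the linked gist); it only counts matching lines on stdin and assumes the messages appear verbatim in the log.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and counts lines that mention
# the etcd NOSPACE alarm or the "context deadline exceeded" error.
scan_etcd_symptoms() {
    awk '
        /NOSPACE/                   { nospace++ }
        /context deadline exceeded/ { deadline++ }
        END {
            # Unset awk variables print as 0 with %d.
            printf "nospace=%d deadline=%d\n", nospace, deadline
        }
    '
}
```

On a server node it could be fed with `journalctl -u rke2-server --no-pager | scan_etcd_symptoms`. A non-zero `nospace` count would point at the etcd quota rather than at the connection itself.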

refined-gigabyte-18875

02/25/2025, 12:46 PM

I don't think so. I have also checked with `du -h`, and I have more than sufficient space available on disk, hmm.
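One thing `du -h` alone can't rule out: etcd raises `alarm:NOSPACE` when its backend database passes its size quota (2 GiB by default unless `--quota-backend-bytes` is set), regardless of how much disk is free. A rough sketch of that comparison follows. The helper name `db_quota_status` is hypothetical, and the database path in the usage note is an assumption based on the default RKE2 data directory.

```shell
#!/bin/sh
# Hypothetical helper: compares a file's size in bytes against a quota.
# Prints "over" when size >= quota, "under" otherwise, "missing" if no file.
db_quota_status() {
    db_file=$1
    quota_bytes=$2
    if [ ! -f "$db_file" ]; then
        echo "missing"
        return 1
    fi
    # Strip whitespace because some wc implementations pad their output.
    size_bytes=$(wc -c < "$db_file" | tr -d '[:space:]')
    if [ "$size_bytes" -ge "$quota_bytes" ]; then
        echo "over"
    else
        echo "under"
    fi
}
```

For example, `db_quota_status /var/lib/rancher/rke2/server/db/etcd/member/snap/db 2147483648` would compare the (assumed) etcd database file against the 2 GiB default. The file size is only an approximation of what etcd itself reports.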

refined-gigabyte-18875

02/25/2025, 12:54 PM

But when I ran `rke2 server --debug`, I saw it connecting to 127.0.0.1 on some port that is not 2379 or any other etcd-looking port number, followed by something like:

"Failed to test data store connection: context deadline exceeded"

I tried to do a "cluster-reset" on one of the control plane nodes as well, but when I tried to use the latest snapshot for restoration, it failed midway (I didn't get a good look at why). Doing the reset without restoration worked, so I'm not sure if it's possibly a corrupted etcd.
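For reference, a sketch of how the restore attempt described above could be prepared. The helper `print_restore_command` is hypothetical. It only prints a command for the newest file in a snapshot directory and does not run it. `--cluster-reset` and `--cluster-reset-restore-path` are the `rke2 server` flags for this. The snapshot directory in the usage note is an assumption based on the default data directory.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) a cluster-reset restore
# command for the most recently modified file in a snapshot directory.
print_restore_command() {
    snapshot_dir=$1
    # ls -t sorts by modification time, newest first.
    newest=$(ls -1t "$snapshot_dir" 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no snapshots found in $snapshot_dir" >&2
        return 1
    fi
    echo "rke2 server --cluster-reset --cluster-reset-restore-path=$snapshot_dir/$newest"
}
```

Calling `print_restore_command /var/lib/rancher/rke2/server/db/snapshots` prints the command for review, so an older snapshot can be swapped in by hand if the newest one fails to restore. The rke2-server service would need to be stopped before actually running it.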
