# rke2
b
do you have statically allocated IP addresses for each node?
usually, when the IP changes, etcd still has the old IP and keeps trying to connect to it.. and when etcd is down, kubectl is down as well
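(A quick sanity check for that, sketched on the assumption that node-ip is set in /etc/rancher/rke2/config.yaml; if it isn't, RKE2 auto-detects the address and the grep will simply return nothing:)
ip -4 -brief addr show                                                # addresses the node actually has
grep -E 'node-ip|advertise-address' /etc/rancher/rke2/config.yaml     # address RKE2 was told to use, if set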
s
All IPs are static and have not changed.
b
it would be useful to have more logs, e.g.:
journalctl -u rke2-server.service --no-pager -n 1000
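(To capture the output to files for attaching, a minimal sketch, assuming server nodes run rke2-server and agent nodes run rke2-agent:)
journalctl -u rke2-server.service --no-pager -n 1000 > rke2-server.log
journalctl -u rke2-agent.service --no-pager -n 1000 > rke2-agent.log    # on agent-only nodes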
s
attached the journalctl logs
b
i would suggest restarting the node and then collecting the journalctl logs again
PS. I’m not part of the Rancher team or an expert in RKE2; i’m just trying to help with the experience I have from running our own RKE2 clusters
s
logs after restarting node
I have three nodes: 172.16.1.11, 172.16.1.12, and 172.16.1.13. I did try restarting all three yesterday, but it did not resolve the issue
b
could you please dump the journalctl again? i’m not able to find any error ("level":"error") logs
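(One way to pull out just those entries, assuming the embedded components log JSON lines to the journal:)
journalctl -u rke2-server.service --no-pager -n 1000 | grep '"level":"error"'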
Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error
is expected while rke2 is starting..
I would debug why etcd is not up and running
etcd-endpoints://0xc002dbe700/127.0.0.1:2379 transport: authentication handshake failed: context deadline exceeded
using these commands: https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf#etcd
also, i would look at the other log files described in https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf#logging
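(The usual RKE2 locations for those extra logs, sketched here as tail commands; exact paths may vary slightly by version:)
tail -n 100 /var/lib/rancher/rke2/agent/containerd/containerd.log    # containerd runtime log
tail -n 100 /var/lib/rancher/rke2/agent/logs/kubelet.log             # kubelet log
ls /var/log/pods/                                                    # per-pod logs, including the etcd static pod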
s
Since I am unable to run kubectl commands, this may not be easy.
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
Thanks for the suggestions though.
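(Since kubectl only talks to the apiserver on 127.0.0.1:6443, a quick check that anything is listening on the control-plane ports at all, assuming iproute2's ss is available:)
ss -tlnp | grep -E ':(6443|9345|2379)'    # kube-apiserver, rke2 supervisor, etcd client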
b
you might also try these commands (https://gist.github.com/superseb/3b78f47989e0dbc1295486c186e944bf#on-the-etcd-host-itself)
export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
etcdcontainer=$(/var/lib/rancher/rke2/bin/crictl ps --label io.kubernetes.container.name=etcd --quiet)
/var/lib/rancher/rke2/bin/crictl exec $etcdcontainer sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl endpoint status --cluster --write-out=table"
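(If the etcd container is actually up, the same wrapper can be reused for a quick health and membership check, a sketch with the same cert paths as above:)
/var/lib/rancher/rke2/bin/crictl exec $etcdcontainer sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl endpoint health"
/var/lib/rancher/rke2/bin/crictl exec $etcdcontainer sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl member list --write-out=table"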
s
Figured out what caused the failure and resolved it for now. It seems installing Docker on the host machine breaks RKE2 networking somehow. Filed a bug: https://github.com/rancher/rke2/issues/4472
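(Whether that is the exact mechanism behind the linked issue isn't confirmed here, but a common conflict is Docker's own iptables setup; a quick way to spot it on an affected host:)
systemctl is-active docker
iptables -S FORWARD | head -n 1     # Docker sets the FORWARD chain policy to DROP by default
iptables-save | grep -c DOCKER      # count of Docker-managed chains/rules present
ip link show docker0                # Docker's default bridge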