dazzling-action-9795

04/20/2023, 6:20 AM
Hi Everyone! We were running a k3s cluster on Flatcar on top of a VM, and during an upgrade we seem to have messed up our first master. The other two masters appear to be up and running fine. When starting the k3s service, we keep getting the logs below. We reset the first master and are now trying to add it back using the second master as the primary. Appreciate it if someone can help. Is there a way we can join this node to one of the other masters? Thanks!
Apr 19 16:28:55 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:28:55Z" level=info msg="Waiting for etcd server to become available"
Apr 19 16:28:55 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:28:55Z" level=info msg="Waiting for API server to become available"
Apr 19 16:28:57 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:28:57Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:00 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:00Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Apr 19 16:29:02 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:02Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:05 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:05Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Apr 19 16:29:07 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:07Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:10 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:10Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Apr 19 16:29:10 ng-10-51-52-101.lab k3s[989]: {"level":"warn","ts":"2023-04-19T16:29:10.926Z","caller":"etcdserver/server.go:2065","msg":"failed to publish local member to cluster through raft","local-member-id":"7dff6a3be78a5ee3","local-member-attributes":"{Name:ng-10-51-52-101.lab-accc9f42 ClientURLs:[<https://10.51.52.101:2379>]}","request-path":"/0/members/7dff6a3be78a5ee3/attributes","publish-timeout":"15s","error":"etcdserver: request timed out"}
Apr 19 16:29:11 ng-10-51-52-101.lab k3s[989]: {"level":"warn","ts":"2023-04-19T16:29:11.746Z","logger":"etcd-client","caller":"v3@v3.5.3-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"<etcd-endpoints://0xc0005428c0/127.0.0.1:2379>","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Apr 19 16:29:11 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:11Z" level=error msg="Failed to check local etcd status for learner management: context deadline exceeded"
Apr 19 16:29:12 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:12Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:15 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:15Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Apr 19 16:29:17 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:17Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:20 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:20Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Apr 19 16:29:22 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:22Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 19 16:29:25 ng-10-51-52-101.lab k3s[989]: time="2023-04-19T16:29:25Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"

creamy-pencil-82913

04/20/2023, 7:41 AM
"Is there a way we can join this node to one of the other masters?"
Delete the database files under /var/lib/rancher/k3s/server/db and set one of the other servers as the --server arg, just like you did when you joined them to this one.
You’ll want to kubectl delete the node first so that the other two evict it from the etcd cluster and it can join as a new node.
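For reference, a minimal sketch of that first step, run from one of the two healthy servers (the node name is taken from the logs above; adjust it to whatever kubectl get nodes reports):

    # On a healthy server: confirm the node name, then remove the failed
    # member so the remaining etcd members evict it from the cluster.
    kubectl get nodes
    kubectl delete node ng-10-51-52-101.lab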
You don’t need to reset anything if you’re going to rejoin it. Resetting it makes it a separate one-node cluster. You basically want to wipe it clean and rejoin it from zero.
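A hedged sketch of the wipe-and-rejoin on the failed node itself, assuming k3s runs as a systemd service named k3s and that the join token can be read from /var/lib/rancher/k3s/server/node-token on a healthy server (server address and token below are placeholders, not values from this thread):

    # On the failed first master: stop k3s and wipe the local etcd database.
    systemctl stop k3s
    rm -rf /var/lib/rancher/k3s/server/db

    # Point the node at a healthy server instead of at itself, e.g. in
    # /etc/rancher/k3s/config.yaml (both values are placeholders):
    #   server: https://<healthy-server-ip>:6443
    #   token: <contents of /var/lib/rancher/k3s/server/node-token on that server>

    # Start k3s; it should join the existing cluster as a new etcd member.
    systemctl start k3s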

dazzling-action-9795

04/20/2023, 2:56 PM
Thanks @creamy-pencil-82913, you’re a lifesaver!
That worked!
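(Once the service is back up, the rejoin can be verified from any server; the expected outcome is a sketch, not output captured from this thread:

    # All three servers should eventually report Ready.
    kubectl get nodes
)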