This message was deleted Rancher Users #rke2

# rke2

adamant-kite-43734

05/19/2022, 11:17 PM

This message was deleted.

faint-airport-83518

05/19/2022, 11:21 PM

This is trying to add a new node to an existing cluster

May 19 23:20:53 vm-server01 rke2[14190]: time="2022-05-19T23:20:53Z" level=info msg="Failed to test data store connection: context deadline exceeded"
May 19 23:20:55 vm-server01 rke2[14190]: time="2022-05-19T23:20:55Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
May 19 23:21:00 vm-server01 rke2[14190]: time="2022-05-19T23:21:00Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
May 19 23:21:05 vm-server01 rke2[14190]: {"level":"warn","ts":"2022-05-19T23:21:05.809Z","logger":"etcd-client","caller":"v3@v3.5.3-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc001c90000/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused\""}

faint-airport-83518

05/19/2022, 11:22 PM

@gray-lawyer-73831 heeeelllp

gray-lawyer-73831

05/19/2022, 11:34 PM

So, you still need a “leader node” or “bootstrap node” that is brought up without the `server` flag, but you just don't have to wait until that's fully up to join others anymore. You still can, and you'll see fewer errors if you do wait, but if you don't wait, the errors are just transient and go away after the first node is up.
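To make the bootstrap/join distinction concrete, here is a minimal sketch of how an init script might render the RKE2 config for each kind of node. The function name `render_rke2_config` and its parameters are hypothetical; the only facts it relies on are the `server` and `token` config keys and the supervisor port 9345 seen in the log above.

```python
import json


def render_rke2_config(token, bootstrap_host=None):
    """Return config.yaml text for an RKE2 server node (illustrative only).

    bootstrap_host=None means this node is the bootstrap node, so the
    `server` key is left out. Any other value makes this a joining node.
    """
    # json.dumps produces a double-quoted string, which is also valid YAML.
    lines = [f"token: {json.dumps(token)}"]
    if bootstrap_host is not None:
        # Joining nodes point at the bootstrap node's supervisor port (9345).
        lines.append(f"server: {json.dumps(f'https://{bootstrap_host}:9345')}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_rke2_config("s3cret"))              # bootstrap node
    print(render_rke2_config("s3cret", "10.0.0.4"))  # joining node
```

The token and address above are placeholders; the point is only that exactly one node omits `server`, and every other node sets it.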

faint-airport-83518

05/19/2022, 11:35 PM

ah, okay. so we just don't need that obnoxious wait time

faint-airport-83518

05/19/2022, 11:35 PM

alright.. let me see if I can figure out some kinda logic...

💪 1

faint-airport-83518

05/19/2022, 11:36 PM

what happens if the "leader" is deleted?

gray-lawyer-73831

05/19/2022, 11:38 PM

The cluster should actually continue to operate as normal! It's mostly needed to initialize etcd, and once that's done, then it's happy to swap out nodes and no longer have that special node anymore

faint-airport-83518

05/19/2022, 11:39 PM

I guess it's weird then - because my cluster is still there I just changed the init script and added new nodes..

faint-airport-83518

05/19/2022, 11:39 PM

anyway, I'll tweak it just to remove the sleep time and see if it works

faint-airport-83518

05/20/2022, 2:29 PM

I think it worked eventually but for some reason the other servers didn't join the cluster for over an hour

faint-airport-83518

05/20/2022, 2:30 PM

I took this logic out - https://github.com/rancherfederal/rke2-azure-tf/blob/main/modules/custom_data/files/rke2-init.sh#L184-L189 and made the sleep time 30 seconds
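As an alternative to a fixed sleep, a joining node could poll until the bootstrap node reports ready. The sketch below is not the logic from the linked script; `wait_until_ready` and its parameters are hypothetical, and the `check` callable is assumed to wrap something like a request to the `/v1-rke2/readyz` endpoint shown in the log earlier in the thread.

```python
import time


def wait_until_ready(check, timeout=600.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds elapse.

    Returns True as soon as check() succeeds, False if the deadline passes.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third attempt.
    answers = iter([False, False, True])
    print(wait_until_ready(lambda: next(answers), timeout=10, interval=0.1))
```

Polling with a deadline keeps the join fast when the first node comes up quickly, while still bounding how long a node waits before giving up.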