# k3s
c
are you starting prod1 with --cluster-init instead of with --server=prod2, which is what tells it to join the existing cluster?
The only reason you’d get a new cluster with etcd on prod1 is if you started it with --cluster-init
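For context, a minimal sketch of how the two flags differ, using placeholder hostnames and token:

```
# First server only: bootstrap a brand-new embedded etcd cluster
k3s server --cluster-init --token <token>

# Every additional server: join the existing cluster instead of creating a new one
k3s server --server https://prod2:6443 --token <token>
```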
g
I have tried both, same result each time
I also tried starting prod1 with --cluster-init and restarting 2+3 with --server=prod1
c
I would:
• uninstall K3s
• Make sure the data (specifically the certs and etcd db) are removed from disk
• Reinstall and ensure that you’re passing --token=x and --server=prod2 or --server=prod3 to get it to join the existing cluster
👍 1
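A rough sketch of the teardown half of that advice on prod1, assuming the standard get.k3s.io install script was used (so the uninstaller exists at the usual path):

```
# Stops k3s and removes its on-disk state, including the old certs and etcd db
# under /var/lib/rancher/k3s and the config under /etc/rancher/k3s
/usr/local/bin/k3s-uninstall.sh

# Sanity-check: these should no longer exist before reinstalling
ls -ld /var/lib/rancher/k3s /etc/rancher/k3s
```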
g
I'm wondering if the configuration could have been cached somewhere and is causing issues, but I'm not sure
Thank you, I will try that next. The cluster is limping along on 2 nodes for now, so I'm waiting until after hours to avoid downtime.
c
check the cli args in the systemd unit, and /etc/rancher/k3s/config.yaml
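A quick way to inspect both places on each node (config.yaml is optional and may not exist):

```
# Show the full systemd unit, including the ExecStart args and any drop-ins
systemctl cat k3s

# Show the optional config file; flags can also be set here instead of on the CLI
cat /etc/rancher/k3s/config.yaml
```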
g
I have a k3s.yaml there, but no config.yaml. To be clear, should I expect it to show prod1 as the server when checking from prod2 (since I used prod1 in the startup command), or localhost? Not sure what to look for. Here's the startup command from systemctl cat k3s on prod2:
```
ExecStart=/usr/local/bin/k3s \
    server \
        '--server' \
        'https://ldi-mech-prod1:6443' \
        '--token' \
        <redacted> \
        '--disable' \
        'traefik' \
```
and the server in the k3s.yaml there is https://127.0.0.1:6443.
Thanks again, your recommendations are very helpful here; our operations engineer who set this all up retired and left no docs.
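For what it's worth, k3s.yaml is the kubeconfig k3s writes for the local API server, so the 127.0.0.1 entry there is expected and separate from the join configuration. A quick check, assuming the default path:

```
# k3s.yaml is a kubeconfig pointing at the local apiserver, not a server config
k3s kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml cluster-info
```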
c
that’s what I would expect to see if the cluster was originally built with prod1 as the first node, and prod2 and prod3 joining. You can remove the --server or --cluster-init options from any of the servers once the cluster is up and running and it won’t make a difference.
You now want to join prod1 to prod2 or prod3 though, so set up your args appropriately.
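A sketch of what that could look like on prod1 once the old state is removed; the ldi-mech-prod2 hostname and the --disable traefik flag are assumptions carried over from the prod2 unit above, so substitute your real hostname and token:

```
# Reinstall k3s on prod1 as a server that joins the existing cluster via prod2
curl -sfL https://get.k3s.io | sh -s - server \
  --server https://ldi-mech-prod2:6443 \
  --token <redacted> \
  --disable traefik
```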
g
Wanted to come back and say thanks. I was a bit in panic mode yesterday and didn't realize I was trying things out of order; I reread this conversation today and was able to fix it just by running the uninstaller and the correct installation command on the impacted node. If you'll be at KubeCon this year, let me buy you a beer or something