# k3s
m
So I have a highly available cluster with multiple server nodes using embedded etcd. The first server node creates the cluster, and then all joining nodes use its address and cluster token to join. What happens if I just wipe the OS of this first server node and then want it to rejoin the cluster? What I did was change its config to point to one of the other server nodes, and it seemed to work fine. Is this okay? For example:
1. Node1 server = 192.168.0.1 and creates the cluster with the `cluster-init` flag
2. All nodes connect to that address to join the cluster
3. I wipe Node1's OS (thus wiping k3s)
4. I now have Node1 rejoin the cluster with its same IP of 192.168.0.1 by changing its config.yaml to point to 192.168.0.2, which is the IP of server Node2
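The rejoin step described above can be sketched as the k3s config file on the rebuilt Node1 (a minimal sketch; the token value is a placeholder for your real cluster token):

```yaml
# /etc/rancher/k3s/config.yaml on the rebuilt Node1
# Instead of cluster-init, point at any surviving server to rejoin.
server: https://192.168.0.2:6443   # Node2, an existing server node
token: <cluster-token>             # placeholder: the cluster's join token
```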
c
yep. you can join against any server.
but best practice is to use an external LB or DNS alias so that single server isn’t a single point of failure
m
What do you mean by "single point of failure"? I have 3 server nodes (odd quorum), and everyone points to the cluster creator 192.168.0.1 to join… but when I deleted it from the cluster, my cluster was still fine, and the other 2 servers continued normal operation. If I understand what you're saying, what's the point of an HA cluster with an odd quorum of server nodes if they all just rely on that single node's IP address? Makes no sense @creamy-pencil-82913
Oh, I think I understand what you meant. It would basically be a “small inconvenience” because if I wanted to join new nodes in, I’d have to update their config.yaml to any server that’s actively on, if the original node creator is offline…. I get it I think
I just got a little confused when you said “single point of failure” I thought you meant that 192.168.0.1 always has to be on even in a HA k3s setup… but that’s not the case as I found
@creamy-pencil-82913
s
You could have created a DNS record, say `api.cluster`, which pointed to `192.168.0.1` (an `A` record), and you could have registered all your other nodes against that name. When you removed node 1, which had that IP address, you could have updated the `A` record for `api.cluster` to point to `192.168.0.2`. There would be no change on any of your nodes, and no change in your joining configuration to make the new node 1 join the cluster. This is what Brandon meant when he spoke about a DNS alias:
> best practice is to use an external LB or DNS alias so that single server isn’t a single point of failure
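Concretely, with such an `A` record in place, every node's join config could reference the name rather than an IP (a sketch, assuming the hypothetical `api.cluster` name from above and a placeholder token):

```yaml
# /etc/rancher/k3s/config.yaml on every joining node
server: https://api.cluster:6443   # DNS alias; re-point the A record if that server dies
token: <cluster-token>             # placeholder: the cluster's join token
```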
The other alternative is to set aside an extra IP address, say `192.168.0.10`, which you add as a load-balancer (LB) to your cluster when you create the first node. Then you add all of the other nodes pointing to that LB address. The LB address can move from one node to another as you take nodes down, so it is always available to connect to.
An example of using a load-balancer with K3s can be seen here: https://kube-vip.io/docs/usage/k3s/
m
Okay, then why isn’t this step included on the k3s website as a requirement and a part of the install instructions for the highly available k3s cluster setup?
@sticky-summer-13450
s
Don't ask me - I'm just a user 😉
c
Because there’s more than one way to do it, and it requires deploying or managing things outside of what K3s comes with. It’s up to you to figure out what sort of fixed registration address works best for your environment and how you want to provide it. https://docs.k3s.io/datastore/ha-embedded
• Optional: A fixed registration address for agent nodes to register with the cluster
There are also two examples of how to provide the fixed registration address in the docs here - using haproxy or nginx https://docs.k3s.io/datastore/cluster-loadbalancer
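For a rough idea of what the haproxy approach on that page looks like, it boils down to a TCP frontend on the registration port balancing across the server nodes. This is only a sketch using the IPs from this thread (a third server at 192.168.0.3 is assumed, since its IP was never stated); see the linked docs for the full, authoritative config:

```
# /etc/haproxy/haproxy.cfg (fragment) -- fixed registration address sketch
frontend k3s-frontend
    bind *:6443
    mode tcp
    default_backend k3s-backend

backend k3s-backend
    mode tcp
    balance roundrobin
    server server-1 192.168.0.1:6443 check
    server server-2 192.168.0.2:6443 check
    server server-3 192.168.0.3:6443 check   # assumed third server IP
```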
so yes, I would say that it is included in the website
If you like kube-vip and wanted to add a 3rd example using that, a PR would be appreciated: https://github.com/k3s-io/docs/edit/main/docs/datastore/cluster-loadbalancer.md
m
@sticky-summer-13450 @creamy-pencil-82913 But again, I wiped my node 192.168.0.1, the IP every other node used to join the cluster, and my cluster continued operating normally because I had 2 other servers still operational. So the HA cluster continues normal operation with 3 server nodes, even if the original cluster creator 192.168.0.1 dies, because the HA cluster can handle any 1 of the 3 servers being terminated.
The only thing to know in this event is that when I want to join new nodes to the cluster, I’d have to point them at one of the other 2 servers’ addresses… and I’d still be fine.
c
yep. you don’t need the server address set on servers after they’ve joined the cluster.
m
Okay, thank you for clarifying. This stuff gets confusing, especially over text messages lol