# harvester
b
I thought that it was just a tag/label you apply
There should be either 1 or 3 I think
what does
kubectl get nodes -o wide
show?
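Or, to see just the control-plane nodes (standard Kubernetes role label, nothing Harvester-specific):
```
# List only the nodes carrying the control-plane role, with their addresses
kubectl get nodes -l node-role.kubernetes.io/control-plane -o wide
```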
Read the warning in the thread for sure
m
Yeah, I know that I can label them like this, but is it supported?
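Something like this, just as an illustration (node name made up):
```
# Hand-applying the control-plane role label to a node -- purely illustrative;
# whether Harvester/RKE2 actually treats a node labeled this way as a real
# control-plane member is exactly what I'm unsure about
kubectl label node harvester-node-4 node-role.kubernetes.io/control-plane=true
```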
b
The bigger issue is do you have 3 control plane nodes?
Because if you do, this isn't an issue
m
And do I always have to make sure that only 3 are running as control-planes, or can I just increase the number when my cluster grows?
b
You need 3 to have HA
for times like when you need to update or take down two of them
As long as one is up you should be good
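The math behind that is just etcd/Raft quorum (general behavior, not Harvester-specific):
```
# quorum = floor(n/2) + 1, failures tolerated = n - quorum
# 1 node  -> quorum 1 -> tolerates 0 failures
# 3 nodes -> quorum 2 -> tolerates 1 failure
# 4 nodes -> quorum 3 -> still tolerates only 1 failure (no gain over 3)
# 5 nodes -> quorum 3 -> tolerates 2 failures
```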
m
I have 3. Like, harvester-node 1-3 are control-planes and we needed to put 2 and 3 in maintenance... we just extended the downtime windows and did one host after another, and didn't have to put both in maintenance simultaneously
b
If it were me, I'd just do them.
m
I cannot put the second node into maintenance mode when one already is. Harvester doesn't allow it
b
Then you're gonna have to use a second window to drain and change the labels to make a new CP node
Which kinda defeats the purpose.
The easier/safer thing to do would just be to do them one at a time
Because you shouldn't have 4 nodes on the CP.
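For reference, roughly what "one at a time" looks like at the kubectl level (Harvester's maintenance mode does roughly the equivalent for you, including moving the VMs off; node name is a placeholder):
```
# Cordon and drain one control-plane node, do the host maintenance, bring it back
kubectl drain harvester-node-2 --ignore-daemonsets --delete-emptydir-data
# ... perform the maintenance on that host ...
kubectl uncordon harvester-node-2
```
Then repeat for the next node once the first one is healthy again.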
m
I once broke my previous harvester cluster by deleting more than 1 control-plane... but I was even able to restore it by cluster-resetting the rke2 cluster... but that was before the current production setup and I didn't want to do it again xD
That was the reason we did it like this
b
As I see it you can:
• Extend your window and just do one after the other
• Add another window and do another.
• Add two more nodes to the CP and pray that nothing screws up and etcd doesn't get corrupted, then demote the two you need to workers to avoid a split brain. (something you should probably do in a maint window, so this doesn't actually save you time unless you want to just YOLO it)
You can have more than 3 nodes for the CP but I haven't seen that in practice very much.
m
That was what I wasn't sure enough about to try without first asking somewhere
b
It's a good use case for a staging or dev/test env if yall don't have one already
m
I think I am going to create an "official" question on GitHub to sort this out
b
I'm not sure what they're going to say. If they have rules for not taking out more than 1 of the CP nodes at a time, adding a number higher than 3 will just make future upgrades more difficult.
m
No, I WANT to have more than 3 CP nodes
Shouldn't the control-plane scale with the size of the cluster anyways?
b
It's just handling the Kube API requests.
and keeping track of etcd
More is not always better
m
and the documentation is silent on the issue with the exception of this Note
And even this implies there could be more than 3 CP-Nodes
b
Yes
But just because you can do something, doesn't mean that you should.
m
No, but I think a fault tolerance of just one node out of 5 is a little sad
b
So let me put it this way: If it's pretty standard for a 3-node CP to manage 100 nodes, why bump your CP up to 9 at 21?
You can still lose 2 of the CP and be functioning, but things are going to be angry.
Fault tolerance isn't the same as availability.
It's just what harvester is going to let you willingly do.
But I don't know all the details of what you're running, where it's running, and what your business drivers and SLAs are.
So there might be legit reasons to need X or Y.
m
Hm, you may be right, but I am wondering if the scaling recommendations from rke2 are not applicable to Harvester clusters, at least in principle
b
I would say it's not
The load for API stuff in a harvester cluster (just for running VMs) would look very different from a standard k8s cluster at capacity.
Keeping track of a 16-core VM with 128GB is a lot easier than the 3,000 pods of microservices that could consume the same resources.
m
Hmmm, you've got a point... but at some point I would think the 3 CP nodes would become a bottleneck
b
at some point, but I suspect that we're talking about hundreds or thousands of nodes, and you should have a support contract with SUSE at that point for sanity and business continuity
You'd also start to see warnings about error budget because the KubeAPI can't keep up with requests
But there's lots of things that could affect that, not just the number of CP nodes
And if it's something like your network throughput isn't big enough, adding more CP will just make the issue worse.
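If you want to eyeball that yourself, one rough way is the standard kube-apiserver Prometheus metrics, queried through the raw API (needs RBAC that allows reading /metrics):
```
# Spot-check API server request latency counters
kubectl get --raw /metrics | grep '^apiserver_request_duration_seconds' | head -n 20
```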
m
Good points. We are discussing internally whether we should get support anyway. But I still think I will ask, because that information should be part of the docs, don't you think?
b
Yes and no? ¯\_(ツ)_/¯
The docs aren't meant to give you cut-and-dried how-tos for every situation. They're supposed to explain how to get it started/running and explain enough of how things work so you can navigate to your own answers.
This to me is more along the lines of consulting/support, which they charge for.
100% transparency - I don't work for SUSE, but I do have a support contract with them through my day job and I've been pleased so far.
m
But it should at least tell you whether this is supported in principle or discouraged for some reason
b
I think the docs are clear that it's supported in principle, and why you wouldn't do it is completely environmental and dependent on your own circumstances.
m
That's great to hear... I will recommend that we ask for a quote for support. But where do you think it is clear that it is supported? The only reference in the whole docs was the note I screenshotted, and the wording is so indirect that I am not sure if it is supported. That's why I am here asking stupid questions xD
b
It's built on Kubernetes. You can look to upstream docs for that too: https://kubernetes.io/docs/setup/best-practices/cluster-large/
Even that is very loose because it's largely dependent on how good your control-plane nodes are at keeping up with the requests.
anyways, I'll stop my yammering
Good luck! 🙂
m
Thank you I appreciate it very much.