# k3s
e
I'm having discussions with my team about the way forward. I'm really open to the idea of using k3s everywhere instead of other distributions for our main clusters, but from what I've understood, my team feels that because k3s sounds/feels like "less", it isn't as capable.
we're also talking with outside partners about this, and not a single one suggested using k3s for our main clusters.
I'm looking for information and arguments (if there are any) to tip the scale.
b
I think k3s is great for edge, or for smaller clusters where you want the control-plane (CP) and worker nodes merged, or maybe where there's a lot of in/out happening. The more you scale up/out in cluster size, the more I'd start looking at RKE over k3s: when you want more resilience by separating the CP from the worker nodes, or when you have larger sets of workloads that you're trying to compress into a single node.
That being said, there's a lot to be said for keeping things simple, but whether to pick k3s over a full-blown k8s deployment is gonna depend on what your workloads/platform/hardware look like.
e
is k3s not a good fit for a multi-node CP? https://docs.k3s.io/architecture#high-availability-k3s
b
I use it to run HA 3-node clusters in prod all the time.
That's not the same as a 100-node cluster though.
e
are you using etcd for the datastore?
b
I am. We did a MySQL one too, but had other I/O issues that made it difficult for prod. etcd has been fine with our bare-metal k3s servers.
👍 1
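For reference, the embedded-etcd HA topology being discussed looks roughly like this; a minimal sketch based on the k3s docs, where the node IP, token, and database details are placeholders rather than anything from this thread:

```sh
# First server: initialize an HA cluster with embedded etcd
curl -sfL https://get.k3s.io | K3S_TOKEN=<shared-secret> sh -s - server --cluster-init

# Additional servers (an odd total, e.g. 3, keeps etcd quorum): join the first one
curl -sfL https://get.k3s.io | K3S_TOKEN=<shared-secret> sh -s - server \
  --server https://<first-server-ip>:6443

# Alternative: external datastore (e.g. MySQL) instead of embedded etcd
curl -sfL https://get.k3s.io | sh -s - server \
  --datastore-endpoint="mysql://<user>:<pass>@tcp(<db-host>:3306)/<db-name>"
```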
e
could you give an example of why/where an HA k3s cluster would be lacking at the 100-node scale?
b
To me, k3s really shines where you need a multi-purpose node/cluster, like a box that does X and kube. "Lightweight" is, or was, in the description for a long time, but that extra weight in a full distribution is often there to make things more resilient. So the larger the cluster, the more strain, and the more you need that extra resiliency. You can't afford to let that large cluster fail or have issues.
vs. if you had VMs, deployed all your apps into separate clusters for isolation, and wanted something lightweight to keep your overhead down.
Different engineering perspectives.
e
I completely understand this logic. It's why my team feels the way it does about k3s, I think. But I'd like to know on a technical level: what does k3s lack that other distributions offer for this resiliency in 'larger' clusters?
c
The only real downside to k3s is that you can't scale the individual Kubernetes components separately. They're all in one process, so you can't see how much CPU or memory just the apiserver is using vs. the scheduler or controller manager, and you can't individually limit their resources. That bothers some people. Other than that there's no real reason why k3s wouldn't scale out the same as any other Kubernetes distribution given the same hardware.
🚀 3
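To make the "all in one process" point concrete, here's a rough illustration of the difference in observability; the kubeadm-style example assumes a metrics-server is installed and uses kubeadm's standard component label:

```sh
# On a k3s server node, apiserver/scheduler/controller-manager share one process,
# so you only ever see their aggregate usage:
ps -C k3s -o pid,rss,pcpu,cmd

# On a kubeadm-style cluster they are separate static pods, so each component
# can be inspected (and resource-limited) individually, e.g.:
kubectl -n kube-system top pod -l component=kube-apiserver
```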
e
thanks for this great answer. I assumed as much.
c
I have also seen folks want to use a different CNI than flannel on larger clusters but that's easy enough to do.
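For anyone reading along, swapping the CNI on k3s is essentially two flags plus installing the replacement; a sketch, with Cilium chosen purely as an example:

```sh
# Start k3s without flannel and without its bundled network policy controller
curl -sfL https://get.k3s.io | sh -s - server \
  --flannel-backend=none \
  --disable-network-policy

# Then install the CNI of your choice, e.g. Cilium via its Helm chart
helm repo add cilium https://helm.cilium.io
helm install cilium cilium/cilium --namespace kube-system
```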
e
when you say larger clusters, what do you have in mind?
we're still figuring this part out. we haven't yet decided if we'd like to have many nodes with smaller resources, or fewer, larger nodes with more resources. I assume we'll go with the second option
and when you mention folks wanted to use a different CNI than flannel, were the reasons primarily for performance?
c
the feedback I have heard is that ALL the iptables stuff starts getting slow once you get very large rulesets. So if you have a lot of nodes/pods, people start looking at replacing kube-proxy/flannel/kube-router with something that uses native routing and/or eBPF
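If you want to gauge where a given cluster sits, a rough way to see how large the ruleset has grown on a node (and how big the cluster is for context) could be:

```sh
# Rough size of the iptables ruleset on a node; it grows with services/endpoints
sudo iptables-save | wc -l

# Node and pod counts for context
kubectl get nodes --no-headers | wc -l
kubectl get pods -A --no-headers | wc -l
```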
e
great info. could you put any very approximate numbers on what counts as a 'large' ruleset, or on node/pod counts, in these cases?
c
no, sorry - I haven’t been directly involved in any of that myself.
e
no worries, you were extremely helpful. just one more question: if we decide to use k3s and we hit these problems, I understand the CNI is easy to replace in k3s, but what about the kube-proxy/kube-router you mentioned?
c
most advanced CNIs will also have a network policy controller and kube-proxy replacement
👍 1
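As a concrete (illustrative) example of that combination on k3s: in addition to disabling flannel as sketched above, kube-proxy can be disabled and the CNI's eBPF-based replacement enabled. The exact Helm value names can vary between Cilium versions, so treat these as indicative:

```sh
# Add this flag to the k3s server args alongside --flannel-backend=none
# and --disable-network-policy:
#   --disable-kube-proxy

# Then let the CNI take over service routing, e.g. Cilium with kube-proxy replacement;
# it needs the API server address since kube-proxy no longer provides it
helm upgrade --install cilium cilium/cilium --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<apiserver-ip> \
  --set k8sServicePort=6443
```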
s
I think it's really important to just see K3s as a distribution. For our old on-prem clusters (around 4 years old) we are migrating from a K8s distribution deployed with Kubespray to K3s as our K8s distribution. The upgrade of each node goes from around 30 minutes to 30 seconds, which is fantastic, and it's all driven from Ansible. We have on-prem clusters of up to about 20 nodes. For public cloud clusters we use the native distributions provided by the cloud providers.
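For anyone curious what a fast node upgrade looks like in practice, re-running the k3s install script with a pinned version upgrades the node in place; a sketch, where the version string and addresses are placeholders and the Ansible wrapping isn't shown:

```sh
# Upgrade a server node in place by re-running the installer with a pinned version
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=<target-version> sh -s - server

# Agent/worker nodes are upgraded the same way, pointing at an existing server
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=<target-version> \
  K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<shared-secret> sh -s - agent
```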
e
thanks for all the information! I think I've managed to convince my team that the k3s distro could be a viable option for us.
r
funny, I randomly found this old thread. I've been running a 46-node cluster (incl. 3 control-plane nodes) for 2+ years now and I'm still happy 😄
😇 1
🦜 1
🙌 1