# k3s
g
Hi 👋 nice to be here. I have been chasing a severe performance problem on my homelab cluster over the past couple of days. I'm not sure what kicked this off, but after I rebooted each server (one at a time) for upgrades, k3s appears to suffer a memory leak and eventually crashes. I'm going to keep digging soon, but I'm curious if anyone else has noticed this lately.
This is with v1.33.4+k3s1. Chart of process uptime of k3s on each server:
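A chart like that can be built from the standard process metrics, assuming Prometheus is scraping the k3s servers (the `job` label here is a placeholder for whatever your scrape config uses):

```
# Seconds since the k3s process last started, per instance.
# A repeating sawtooth in this series means the process keeps restarting.
time() - process_start_time_seconds{job="k3s"}
```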
c
are there any logs from this that might help?
g
I'll get back to digging into this today
I've downgraded to v1.33.3 for the moment, hoping that reduces the chaos
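For anyone wanting to do the same: the official install script can pin a version via the documented `INSTALL_K3S_VERSION` variable, so a downgrade is roughly this (re-running the installer replaces the binary and restarts the service):

```
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.3+k3s1 sh -
```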
Nothing really stuck out from the logs in v1.33.4, except that etcd applies became very slow once memory was starved
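With embedded etcd, those slow applies show up in the k3s journal, so something like this can surface them (assuming k3s runs as a systemd unit named `k3s`; "apply request took too long" is etcd's standard warning for this):

```
journalctl -u k3s --since "1 hour ago" | grep "apply request took too long"
```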
Okay poking some more at etcd metrics, it occurs to me that a linearly increasing amount of traffic to the leader doesn't seem sustainable either. Now I need to figure out where that traffic is coming from
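etcd exports per-client and per-peer network counters, so a `rate()` over those can separate API-server client traffic from raft replication between peers (PromQL sketch; these are standard etcd metric names, but the available labels depend on your setup):

```
# Bytes/sec received from clients (e.g. the API server), per etcd member
rate(etcd_network_client_grpc_received_bytes_total[5m])

# Bytes/sec received from other members, i.e. raft/replication traffic
rate(etcd_network_peer_received_bytes_total[5m])
```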
The cluster is a handful of nodes and quiet in terms of workload, I can't think of anything that would be doing this deliberately
ahaah. I think it's the Tailscale operator
I had applied the operator.yaml straight out of the repo, which tracks the unstable channel. Switching to their Helm chart (stable) fixed it
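For anyone hitting the same thing, installing the operator from Tailscale's Helm repo looks roughly like this (a sketch based on their docs; the release name and namespace are my choices, and the OAuth values are placeholders you'd fill in from your tailnet):

```
helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update
helm upgrade --install tailscale-operator tailscale/tailscale-operator \
  --namespace tailscale --create-namespace \
  --set-string oauth.clientId=<oauth-client-id> \
  --set-string oauth.clientSecret=<oauth-client-secret>
```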
c
glad to see this was resolved
g
Me too 😮‍💨