# k3s
g
Hi 👋 nice to be here. I have been chasing a severe performance problem on my homelab cluster over the past couple of days. I'm not sure what kicked this off, but after I rebooted each server (one at a time) for upgrades, k3s appears to suffer a memory leak and eventually crashes. I'm going to keep digging soon, but I'm curious if anyone else has noticed this lately.
This is with v1.33.4+k3s1. Chart of process uptime of k3s on each server:
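A chart like that can be built from the standard process metrics, assuming Prometheus is scraping the k3s servers (the `job` label here is a placeholder for whatever your scrape config uses):

```
# Seconds since the k3s process last started, per instance.
# A repeating sawtooth in this series means the process keeps restarting.
time() - process_start_time_seconds{job="k3s"}
```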
c
are there any logs from this that might help?
g
I'll get back to digging into this today
I've downgraded to v1.33.3 for the moment, hoping that reduces the chaos
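For anyone wanting to do the same: the official install script can pin a version via the documented `INSTALL_K3S_VERSION` variable, so a downgrade is roughly this (re-running the installer replaces the binary and restarts the service):

```
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.3+k3s1 sh -
```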
Nothing really stuck out from the logs in v1.33.4, except that etcd applies became very slow once memory was starved
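With embedded etcd, those slow applies show up in the k3s journal, so something like this can surface them (assuming k3s runs as a systemd unit named `k3s`; "apply request took too long" is etcd's standard warning for this):

```
journalctl -u k3s --since "1 hour ago" | grep "apply request took too long"
```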
Okay poking some more at etcd metrics, it occurs to me that a linearly increasing amount of traffic to the leader doesn't seem sustainable either. Now I need to figure out where that traffic is coming from
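etcd exports per-client and per-peer network counters, so a `rate()` over those can separate API-server client traffic from raft replication between peers (PromQL sketch; these are standard etcd metric names, but the available labels depend on your setup):

```
# Bytes/sec received from clients (e.g. the API server), per etcd member
rate(etcd_network_client_grpc_received_bytes_total[5m])

# Bytes/sec received from other members, i.e. raft/replication traffic
rate(etcd_network_peer_received_bytes_total[5m])
```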
The cluster is a handful of nodes and quiet in terms of workload, I can't think of anything that would be doing this deliberately
ahaah. I think it's the Tailscale operator
I had applied the operator.yaml straight out of the repo, which tracks the unstable channel. Switching to their Helm chart (stable) fixed it
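For anyone hitting the same thing, installing the operator from Tailscale's Helm repo looks roughly like this (a sketch based on their docs; the release name and namespace are my choices, and the OAuth values are placeholders you'd fill in from your tailnet):

```
helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update
helm upgrade --install tailscale-operator tailscale/tailscale-operator \
  --namespace tailscale --create-namespace \
  --set-string oauth.clientId=<oauth-client-id> \
  --set-string oauth.clientSecret=<oauth-client-secret>
```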
c
glad to see this was resolved
g
Me too 😮‍💨