# k3s
c
I'm looking for some help with a performance issue. I have an application that is running very slowly when I run it as a pod in k3s, but if I run the same workload using docker directly on the host machine, it runs way faster... like the workload takes <1m using docker vs 2.5hrs when run inside a pod. Any advice on where to even start looking?
the application in question is `prowler`, which is launching a bunch of async tasks via some python library
m
Do you have resource limits set? get a dump of the deployment and/or pod manifest
c
I have no resource limits on the pod
m
how about resource requests?
c
None, the `Resources` on the container is just an empty object (pulling manifest momentarily)
m
is it a single host running k3s?
c
yes
m
try to put resource requests so that it will get some guaranteed cpu and memory
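As a sketch, a requests/limits stanza on the container would look something like this (the cpu/memory values are illustrative placeholders, not from the actual manifest):

```yaml
# hypothetical resources stanza for the prowler container;
# tune the values to what the workload actually needs
resources:
  requests:
    cpu: "2"
    memory: 1Gi
  limits:
    cpu: "3"
    memory: 1Gi
```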
๐Ÿ‘ 1
c
will try that now and report
Even with the configured resources, it's running quite slowly. Here is the manifest
giving it 3 CPU and 1Gi of memory, it's still not running as fast as when just run with docker
`top pod` is reporting that it's just not using any CPU anyway
m
maybe take a look at docker stats and see how much it is using and match that
c
it goes to 100% for a bit at the start, and then jumps around 25-75% for a while
ends up using ~480M of memory
It's doing a lot of network calls, could that be slowing it down?
ya, it ends up pulling down like 7MB worth of JSON data from the aws API, so I'm assuming it's just rapid-firing those network calls
would something in the k3s networking be slowing that down?
m
CoreDNS maybe, try to see if you can resolve URLs quickly inside a container running in k3s. Do you have a lot of DNS lookups?
c
it's all calls to the AWS API, so I would think it would only be resolving a handful of times and then have it cached, no? (I'm not too savvy on core-dns itself)
I don't see any slowdown in general DNS resolution when using dig and curl inside an `alpine` container running in k3s
running `apk update` did take a while to complete though, longer than I would expect
yeah, `apk update` in docker returns super fast
m
are you multi-homed with your host?
wired? wifi?
c
I don't think so?
the host is an EC2 instance running debian
m
If you're able to experiment, try running k3s with the docker shim, i.e. re-install k3s with
`curl -sfL https://get.k3s.io | sh -s - --docker`
you may get some traction, or change flannel to calico for the CNI
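For the flannel-to-calico swap, a rough sketch (the flannel-disabling flags are the documented k3s ones; the Calico manifest URL and version are assumptions to check against the Calico docs):

```shell
# reinstall k3s with its bundled flannel CNI disabled
curl -sfL https://get.k3s.io | sh -s - --flannel-backend=none --disable-network-policy
# then install Calico; pin whatever version is current
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
```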
c
I can add that `--docker` flag by just updating the systemd service file, right?
m
yeah you can try it
c
hm, neither of those options seems to be fixing the issue at hand... but this has officially become a Monday problem
๐Ÿ‘ 1
m
try using the latest AL2023 C5 instances and see if it makes any difference
c
@millions-dusk-51992 any recommendation between kernel 6.1 and 6.12?
m
6.12 I think is LTS, go for that, unless you're super dependent on 6.1 for any specific reason
✅ 1
c
Any opinions on AMD vs Intel for this?
m
Personally AMD but shouldn't be any different
✅ 1
If you want to test whether it's a k3s problem, maybe deploy an EKS cluster that can scale to zero and try it there, rather than building an environment from scratch.
c
it's mostly automated anyway, just needed to tweak the setup script to use dnf instead of apt
๐Ÿ‘ 1
m
were you able to fix the performance problem?
c
OK, finally confirmed it is an issue with the k3s setup. I got a cluster running with `kops` using Cilium for the CNI and the workload in question runs in the expected timeframe
running on the latest c5a.large instance did not fix it, nor did using `--docker` for the runtime
I also tried switching out the CNI in k3s to use calico, which also did not work
m
nice! don't know if it was related to cgroup v1 vs v2 but I seem to recall encountering that
c
(as an aside.. I'd never actually tried to deploy EKS until trying to debug this... what a pain in the ass it is...)
> nice! don't know if it was related to cgroup v1 vs v2 but I seem to recall encountering that
Sorry, not sure what you mean. Is this something that I can solve with k3s, or do I need to stick with an alternative for this workload?
m
depends on the OS you're using; I remember encountering an issue with cgroup v1 vs v2 in the past. Which OS distro were you using?
c
I tried on both debian and AL2023
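One quick thing to compare on the two hosts: the filesystem type mounted at /sys/fs/cgroup tells you which cgroup version the kernel is using (a generic Linux check, nothing k3s-specific):

```shell
# prints "cgroup2fs" on a unified cgroup v2 host,
# "tmpfs" when the legacy v1 hierarchy is mounted
stat -fc %T /sys/fs/cgroup
```

I believe recent Debian releases and AL2023 both default to cgroup v2, so if both hosts report cgroup2fs that would point away from this theory.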
m
ok same problem?
c
yes
m
what version of k3s did you use?
c
it was the latest when I installed
```shell
curl -sfL https://get.k3s.io > /tmp/k3s
chmod +x /tmp/k3s
/tmp/k3s \
  --write-kubeconfig-mode 640 \
  --write-kubeconfig-group adm \
  --cluster-cidr 10.44.0.0/16 \
  --node-name ${pInstanceHostname} \
  --with-node-id
```
the machine with the initial problem was deployed last month, then I deployed again when I ran the test on a c5a.large with AL2023
m
do you still have that machine or is it gone?
c
I have it, just shut it down for a bit..
one sec
m
If it works with kops, just stick with it since Cilium CNI is better
c
I know, I just wanted to have everything easily on one node
I don't actually need a cluster, I just prefer k8s as an orchestration tool
m
yep, you can also use kind or minikube as a single node, which uses just docker. Did you end up running on the same instance type, c5a.large with AL2023?
c
yes
m
Oh, you're running the k3s binary under /tmp? don't know if that has an effect, you didn't run the install?
c
```shell
curl -sfL https://get.k3s.io > /tmp/k3s
```
the installer script is just being written to `/tmp/k3s`
m
ok got it