# rke2
b
RKE2 disk IOPS / latency - we have a 3 server + 6 agent node cluster with ~250 pods total running 1.31.2 on AWS. We are frequently experiencing cluster instability, with the server nodes locking up. When this happens, the server node volumes saturate their IOPS. The EBS gp3 volumes are now at 8000 IOPS and 300 MiB/s throughput but still saturate when the nodes lock up. We have a similarly sized cluster running the same OS and RKE2 version on-prem which never gets close to this level of disk IO. Has anyone seen this before? What IOPS do people run with in AWS?
When the cluster is stable, IOPS are much lower, but redeploys, pod migrations, etc. seem to trigger 30x spikes.
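A rough way to see those spikes from the EBS side is CloudWatch; the volume ID and time window below are placeholders, not our actual values:

```bash
# Sketch: pull per-volume write ops from CloudWatch for the window around a spike.
# vol-0123... and the timestamps are placeholders -- substitute the server's root/etcd volume and window.
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeWriteOps \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --start-time 2024-01-01T08:30:00Z \
  --end-time 2024-01-01T09:30:00Z \
  --period 60 --statistics Sum
```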
Might be a symptom rather than a cause...
etcd stats
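The same stats can be pulled on a server node with etcdctl, assuming a binary is available there (otherwise exec into the etcd static pod); the cert paths below are the RKE2 defaults and may differ on other installs:

```bash
# Sketch: query etcd member status (DB size, leader, raft term) on an RKE2 server node.
# Cert paths assume a default RKE2 install under /var/lib/rancher/rke2.
export ETCDCTL_API=3
CERTS=/var/lib/rancher/rke2/server/tls/etcd
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=$CERTS/server-ca.crt \
  --cert=$CERTS/server-client.crt \
  --key=$CERTS/server-client.key \
  endpoint status --write-out=table
# Alternative: kubectl -n kube-system exec etcd-<nodename> -- etcdctl ... (same flags)
```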
c
I’d probably try to figure out what’s thrashing the disks. Is it etcd? Is it image pull/prune? Is it your workload?
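If you want that attribution captured continuously, something like the following works (assumes iotop and sysstat are installed; just a sketch):

```bash
# Sketch: log per-process and per-device disk IO so the culprit is visible after a lockup.
iotop -obtqqq -d 5 >> /var/log/iotop.log &       # -o only active processes, -b batch, -t timestamps
pidstat -d 5 >> /var/log/pidstat-disk.log &      # per-process kB read/written per second
iostat -xz 5 >> /var/log/iostat.log &            # per-device utilization, queue depth, await
```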
b
Our latest theory is that it's our workload somehow causing a huge spike in etcd writes by thrashing the API server. Unfortunately, when the server nodes go down we lose all access and detailed metrics stop reporting. The control plane AWS NLB is being swarmed by something. We will go through audit and any other logging we can find to see what is causing these spikes. This post was probably more suited to the k8s Slack, but if anyone has any general tips on what to look for and where, they would be appreciated!
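For the audit trawl, a rough way to rank API clients is below; it assumes audit logging is enabled and that the log path (the usual RKE2 location) matches your setup, which it may not:

```bash
# Sketch: count completed audit events by user agent / verb / resource
# to spot whatever is hammering the apiserver. Path assumed from RKE2 defaults.
AUDIT_LOG=/var/lib/rancher/rke2/server/logs/audit.log
jq -r 'select(.stage=="ResponseComplete")
       | [(.userAgent // "-"), .verb, (.objectRef.resource // "-")] | @tsv' "$AUDIT_LOG" \
  | sort | uniq -c | sort -rn | head -20
```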
The control plane nodes have the `CriticalAddonsOnly=true:NoExecute` taint, so all disk IO should come indirectly from our workloads.
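A quick way to confirm that on all three servers (the config path assumes RKE2 defaults):

```bash
# Sketch: confirm the taint is applied on every server node and where it is configured.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
# The taint usually comes from the node-taint entry in the RKE2 server config:
grep -A2 'node-taint' /etc/rancher/rke2/config.yaml
```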
I'll keep this as a running log: we shut down all the agents, kept the servers on, waited 5 minutes, and turned all the agents back on again. 2/3 servers were fine; 1 went down for 30 minutes but recovered without a restart. iotop and iostat were running in the background on all servers, but on the one that failed, all logs stopped (including /var/log/messages) while it was unresponsive. Given 2/3 servers were fine, fundamentally the setup must be OK and there is an error condition somewhere (not necessarily RKE2) that causes a cascading failure. In the screenshot the left server went down; the right one is fine and didn't come close to the disk IO of server1.
iotop from around the problematic time gives multiple readings for processes, which suggests they are being restarted:
```
08:55:39 Actual DISK READ:     127.63 M/s | Actual DISK WRITE:     654.91 K/s
08:55:39    2085 be/4 etcd        2.22 M/s  115.27 K/s ?unavailable?  etcd --config-file=/var/lib/
08:55:39  244376 be/4 root        2.11 M/s  135.36 B/s ?unavailable?  containerd -c /var/lib/ranch
08:55:39    3354 be/4 root        2.34 M/s    0.00 B/s ?unavailable?  cilium-agent --config-dir=/t
08:55:39    3412 be/4 root        3.45 M/s    0.00 B/s ?unavailable?  cilium-agent --config-dir=/t
08:55:39    3749 be/4 root        2.30 M/s    0.00 B/s ?unavailable?  cilium-agent --config-dir=/t
08:55:39    3766 be/4 root        3.27 M/s    0.00 B/s ?unavailable?  cilium-agent --config-dir=/t
08:55:39    3865 be/4 root        2.71 M/s    0.00 B/s ?unavailable?  coredns -conf /etc/coredns/C
08:55:39  244118 be/4 root        2.62 M/s    0.00 B/s ?unavailable?  kube-scheduler --permit-port
08:55:39  244160 be/4 root        2.22 M/s    0.00 B/s ?unavailable?  rke2 server
08:55:39  244162 be/4 root        3.05 M/s    0.00 B/s ?unavailable?  rke2 server
08:55:39  244349 be/4 root        4.48 M/s    0.00 B/s ?unavailable?  kubelet --volume-plugin-dir=
08:55:39  244356 be/4 root        3.39 M/s    0.00 B/s ?unavailable?  kubelet --volume-plugin-dir=
08:55:39  244361 be/4 root        4.44 M/s    0.00 B/s ?unavailable?  kubelet --volume-plugin-dir=
08:55:39  244762 be/4 root        3.55 M/s    0.00 B/s ?unavailable?  kube-controller-manager --fl
08:55:39  245001 be/4 root        4.89 M/s    0.00 B/s ?unavailable?  kube-controller-manager --fl
08:55:39  245160 be/4 root        3.80 M/s    0.00 B/s ?unavailable?  kube-controller-manager --fl
08:55:39  245151 be/4 root        3.14 M/s    0.00 B/s ?unavailable?  kube-apiserver --admission-c
08:55:39  245158 be/4 root        3.96 M/s    0.00 B/s ?unavailable?  kube-apiserver --admission-c
08:55:39  245159 be/4 root        3.92 M/s    0.00 B/s ?unavailable?  kube-apiserver --admission-c
```
Found some OOM kills in the system messages: kube-apiserver got to 1.8GB, rke2 was at 707MB, etcd at 300MB. Happy-path usage is ~500MB for kube-apiserver and ~350MB for rke2. The servers are 2 CPU / 4GB RAM, based on https://docs.rke2.io/install/requirements#vm-sizing-guide, so it looks likely the node was being overwhelmed and stuck in a reboot loop. Unsure why the other 2 servers are fine.
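For reference, roughly how we spotted them (standard kernel log greps, nothing RKE2-specific):

```bash
# Sketch: find OOM kills and see which process the kernel picked.
journalctl -k --since "2 hours ago" | grep -iE 'out of memory|oom-kill'
# or, without journald:
dmesg -T | grep -iE 'out of memory|oom-kill'
```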
These all seem to be symptoms of a large etcd. When we manually ran a defrag/compact it went from 500MB to 40MB, but it has already grown back to 140MB.
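For the record, the manual compact/defrag was roughly this (cert paths assume the RKE2 defaults, same as above; defrag blocks IO on the member while it runs, so one member at a time):

```bash
# Sketch: manual compaction then defrag on one member. Paths assume a default RKE2 install.
CERTS=/var/lib/rancher/rke2/server/tls/etcd
E() { etcdctl --endpoints=https://127.0.0.1:2379 \
        --cacert=$CERTS/server-ca.crt --cert=$CERTS/server-client.crt \
        --key=$CERTS/server-client.key "$@"; }
REV=$(E endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
E compaction "$REV"                    # drop revision history older than the current revision
E defrag                               # defragment this member; blocks IO while it runs
E endpoint status --write-out=table    # confirm DB size dropped
```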
c
These are all very low resource levels. 2 core / 4GB is like the lowest possible amount that will even schedule all the server pods, and a couple hundred MB in etcd is nothing. It sounds like you just need to increase resources from the bare minimum.
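A quick check of where you actually sit (assumes the bundled metrics-server is running):

```bash
# Sketch: current usage on the servers and the heaviest control-plane pods.
kubectl top nodes
kubectl top pods -n kube-system --sort-by=memory | head -15
```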
b
Sounds like we took the VM sizing guide too much at face value. Will increase the sizes. Would you recommend setting up auto etcd defrag at all, or is it generally not needed with the auto compaction?
c
We don’t recommend doing it while it’s running, since it pauses all IO operations. It is auto-defragged whenever you start the rke2-server service. While it’s running I would just allow it to be whatever size it is; a couple hundred MB of pages that will eventually get reused shouldn’t be a lot to ask.
b
Thanks
m
Good read...