powerful-easter-15334
09/18/2025, 11:30 AM
powerful-easter-15334
09/18/2025, 11:31 AM
powerful-easter-15334
09/18/2025, 11:31 AM
powerful-easter-15334
09/18/2025, 11:32 AM
powerful-easter-15334
09/18/2025, 11:34 AM
541121Z","logger":"etcd-client","caller":"v3@v3.5.13-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"<etcd-endpoints://0xc0009e41e0/1>>","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded
Or the same thing with a different message.
desc = context deadline exceeded"
powerful-easter-15334
09/18/2025, 11:35 AM
Sep 16 23:38:16 harvester-2 rke2[20131]: time="2025-09-16T23:35:00Z" level=error msg="Failed to check local etcd status for learner management: context deadline exceeded"
powerful-easter-15334
09/18/2025, 11:37 AM
powerful-easter-15334
09/18/2025, 11:37 AM
powerful-easter-15334
09/18/2025, 11:39 AM
917081Z","logger":"etcd-client","caller":"v3@v3.5.13-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"<etcd-endpoints://0xc003a9a000/1>>
g="Failed to check local etcd status for learner management: context deadline exceeded"
="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
="Stopped tunnel to 10.0.1.63:9345"
="Proxy done" err="context canceled" url="wss://10.0.1.63:9345/v1-rke2/connect"
="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
g="leaderelection lost for rke2-etcd"
ode=exited, status=1/FAILURE
powerful-easter-15334
09/18/2025, 11:40 AM
powerful-easter-15334
09/18/2025, 11:40 AM
msg="Proxy error: write failed: write tcp 127.0.0.1:9345->127.0.0.1:35390: write: connection reset by peer"
They do appear fairly often in the logs though
powerful-easter-15334
09/18/2025, 11:41 AM
powerful-easter-15334
09/18/2025, 12:23 PM
bland-article-62755
09/18/2025, 2:13 PM
"disk slowness (they are 10k SAS HDDs in RAID1)"
Highly likely it's the disk.
powerful-easter-15334
09/18/2025, 2:14 PM
bland-article-62755
09/18/2025, 2:15 PM
bland-article-62755
09/18/2025, 2:16 PM
powerful-easter-15334
09/18/2025, 2:16 PM
powerful-easter-15334
09/18/2025, 2:16 PM
bland-article-62755
09/18/2025, 2:16 PM
powerful-easter-15334
09/18/2025, 2:17 PM
powerful-easter-15334
09/18/2025, 2:17 PM
bland-article-62755
09/18/2025, 2:17 PM
powerful-easter-15334
09/18/2025, 2:17 PM
powerful-easter-15334
09/18/2025, 2:18 PM
bland-article-62755
09/18/2025, 2:18 PM
bland-article-62755
09/18/2025, 2:18 PM
bland-article-62755
09/18/2025, 2:19 PM
#!/bin/bash
# Defragment all etcd members of an RKE2 cluster, printing status before and after.
# Pick the first etcd pod; etcdctl is run inside it.
etcdnode=$(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name | head -1)
# Directory holding the etcd client certificates used by every call below.
tls=/var/lib/rancher/rke2/server/tls/etcd
# Run etcdctl in the etcd pod against the local member. No -it: nothing here needs a TTY.
etcdctl_exec() {
  kubectl -n kube-system exec "${etcdnode}" -- etcdctl --endpoints 127.0.0.1:2379 \
    --cacert "${tls}/server-ca.crt" --cert "${tls}/server-client.crt" --key "${tls}/server-client.key" "$@"
}
echo "Getting etcd Status"
etcdctl_exec endpoint status --cluster -w table
echo "Defragging the etcd in the current cluster via ${etcdnode}"
etcdctl_exec defrag --cluster
echo "Getting etcd Health"
etcdctl_exec endpoint health --cluster -w table
echo "Getting etcd Status"
etcdctl_exec endpoint status --cluster -w table
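The health step of the script prints a TOOK value per endpoint, and this thread uses roughly 15ms as the figure to stay under. A minimal sketch for flagging slow endpoints follows. The function name `slow_endpoints` is hypothetical, and the parsing assumes the usual four-column table (ENDPOINT, HEALTH, TOOK, ERROR) with Go-style durations such as `12.3ms` or `1.2s`; adjust it if your etcdctl version prints something different.

```bash
#!/bin/bash
# slow_endpoints (hypothetical helper): read `etcdctl endpoint health -w table`
# output on stdin and print "<endpoint> <milliseconds>" for every endpoint
# whose TOOK value exceeds the threshold.
#   $1: threshold in milliseconds (default 15, the figure quoted in the thread)
slow_endpoints() {
  awk -F'|' -v limit="${1:-15}" '
    # Convert a simple Go duration ("850µs", "12.3ms", "1.2s") to milliseconds.
    # Compound durations such as "1m2s" are not handled by this sketch.
    function to_ms(d) {
      if (d ~ /ms$/)     return d + 0
      if (d ~ /ns$/)     return (d + 0) / 1000000
      if (d ~ /(µ|u)s$/) return (d + 0) / 1000
      if (d ~ /s$/)      return (d + 0) * 1000
      return -1   # unrecognised unit: never flagged
    }
    # Data rows have at least five "|"-separated fields; skip borders and the header.
    NF >= 5 && $2 !~ /ENDPOINT/ {
      gsub(/ /, "", $2)   # endpoint column
      gsub(/ /, "", $4)   # TOOK column
      ms = to_ms($4)
      if (ms > limit) printf "%s %.1f\n", $2, ms
    }
  '
}
```

To use it, pipe the output of the script's `endpoint health --cluster -w table` command into `slow_endpoints 15`. Empty output means every endpoint answered within the threshold.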
bland-article-62755
09/18/2025, 2:21 PM
[...] (TOOK) to be less than 15ms. When we were running on spinners it was like 200+
powerful-easter-15334
09/18/2025, 2:42 PM
powerful-easter-15334
09/18/2025, 2:42 PM
bland-article-62755
09/18/2025, 2:42 PM
kubectl -n kube-system exec -it ${etcdnode} -- etcdctl --endpoints 127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key endpoint health --cluster -w table
is just the health part
powerful-easter-15334
09/18/2025, 2:43 PM
bland-article-62755
09/18/2025, 2:43 PM
bland-article-62755
09/18/2025, 2:43 PM
powerful-easter-15334
09/18/2025, 2:43 PM
powerful-easter-15334
09/18/2025, 2:43 PM
bland-article-62755
09/18/2025, 2:44 PM
bland-article-62755
09/18/2025, 2:44 PM
powerful-easter-15334
09/18/2025, 2:44 PM
bland-article-62755
09/18/2025, 2:45 PM
powerful-easter-15334
09/18/2025, 2:45 PM
bland-article-62755
09/18/2025, 2:45 PM
bland-article-62755
09/18/2025, 2:45 PM
powerful-easter-15334
09/18/2025, 2:46 PM
steep-teacher-58732
09/18/2025, 11:26 PM