
bitter-beard-2626

01/11/2023, 11:56 AM
Hi. This is a cross-post from the Cilium Slack. I just noticed that running
kubectl rollout restart daemonset -n kube-system cilium
caused a cluster outage. Output from
journalctl -u rke2-server
Jan 10 11:31:21 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:21Z" level=error msg="Remotedialer proxy error" error="websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:31:21 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:21Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=info msg="Connecting to proxy" url="wss://10.81.23.12:9345/v1-rke2/connect"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.81.23.12:9345: connect: connection refused"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=error msg="Remotedialer proxy error" error="dial tcp 10.81.23.12:9345: connect: connection refused"
Jan 10 11:31:28 k8s-build001-master001.dc1 rke2[980]: {"level":"warn","ts":"2023-01-10T11:31:28.069Z","logger":"etcd-client","caller":"v3@v3.5.4-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000aace00/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Jan 10 11:31:28 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:28Z" level=error msg="Failed to get recorded learner progress from etcd: context deadline exceeded"
Jan 10 11:31:31 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:31Z" level=info msg="Connecting to proxy" url="wss://10.81.23.12:9345/v1-rke2/connect"
Jan 10 11:32:10 k8s-build001-master001.dc1 rke2[980]: E0110 11:32:10.988495     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": stream error: stream ID 2425069; INTERNAL_ERROR; received from peer
Kubernetes recovered after about 5 minutes. In the Hubble monitor I noticed return traffic being denied from the etcd peer port (2380):
Jan 10 11:51:29.529: 10.81.23.12:2380 (kube-apiserver) <> 10.81.23.11:32768 (host) policy-verdict:none DENIED (TCP Flags: SYN, ACK)
Jan 10 11:51:29.529: 10.81.23.12:2380 (kube-apiserver) <> 10.81.23.11:32768 (host) Policy denied DROPPED (TCP Flags: SYN, ACK)
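For reference, one way to keep an eye on these drops live is the Hubble CLI; this is just a sketch using standard observe filters (verdict and port), not how I captured the lines above:
hubble observe --verdict DROPPED --port 2380 --follow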
Logs from the Cilium agent:
Policy verdict log: flow 0xc18ce21f local EP ID 2998, remote ID kube-apiserver, proto 6, ingress, action deny, match none, 10.81.23.12:2380 -> 10.81.23.11:32770 tcp SYN, ACK
xx drop (Policy denied) flow 0xc18ce21f to endpoint 0, file bpf_host.c line 2147, , identity kube-apiserver->unknown: 10.81.23.12:2380 -> 10.81.23.11:32770 tcp SYN, ACK
Policy verdict log: flow 0xe145a85 local EP ID 2998, remote ID kube-apiserver, proto 6, ingress, action deny, match none, 10.81.23.12:2380 -> 10.81.23.11:32768 tcp SYN, ACK
xx drop (Policy denied) flow 0xe145a85 to endpoint 0, file bpf_host.c line 2147, , identity kube-apiserver->unknown: 10.81.23.12:2380 -> 10.81.23.11:32768 tcp SYN, ACK
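I believe roughly the same verdicts can also be tailed from inside the agent pod with cilium monitor; a sketch, not exactly what produced the lines above:
kubectl -n kube-system exec ds/cilium -- cilium monitor --type policy-verdict --type drop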
Has anyone encountered similar issues?
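For what it's worth, the restart was just the plain command above. A hedged way to check the DaemonSet's update strategy and watch such a rollout while it happens (nothing here is specific to this incident):
kubectl -n kube-system get daemonset cilium -o jsonpath='{.spec.updateStrategy}'
kubectl -n kube-system rollout status daemonset cilium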
Here is the full log from the start of the outage until things started working again. Part 1:
Jan 10 11:31:21 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:21Z" level=error msg="Remotedialer proxy error" error="websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:31:21 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:21Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=info msg="Connecting to proxy" url="wss://10.81.23.12:9345/v1-rke2/connect"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.81.23.12:9345: connect: connection refused"
Jan 10 11:31:26 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:26Z" level=error msg="Remotedialer proxy error" error="dial tcp 10.81.23.12:9345: connect: connection refused"
Jan 10 11:31:28 k8s-build001-master001.dc1 rke2[980]: {"level":"warn","ts":"2023-01-10T11:31:28.069Z","logger":"etcd-client","caller":"v3@v3.5.4-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000aace00/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Jan 10 11:31:28 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:28Z" level=error msg="Failed to get recorded learner progress from etcd: context deadline exceeded"
Jan 10 11:31:31 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:31:31Z" level=info msg="Connecting to proxy" url="wss://10.81.23.12:9345/v1-rke2/connect"
Jan 10 11:32:10 k8s-build001-master001.dc1 rke2[980]: E0110 11:32:10.988495     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": stream error: stream ID 2425069; INTERNAL_ERROR; received from peer
Jan 10 11:33:02 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:02.607407     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: unknown (get endpoints) - error from a previous attempt: http2: server sent GOAWAY and closed the connection; LastStreamID=8109, ErrCode=NO_ERROR, debug=""
Jan 10 11:33:04 k8s-build001-master001.dc1 rke2[980]: W0110 11:33:04.152220     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:04 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:04.152940     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:06 k8s-build001-master001.dc1 rke2[980]: W0110 11:33:06.930686     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:06 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:06.930727     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:11 k8s-build001-master001.dc1 rke2[980]: W0110 11:33:11.087541     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:11 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:11.087826     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:21 k8s-build001-master001.dc1 rke2[980]: W0110 11:33:21.483851     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:21 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:21.484366     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:42 k8s-build001-master001.dc1 rke2[980]: W0110 11:33:42.375343     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:42 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:42.375656     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:33:47 k8s-build001-master001.dc1 rke2[980]: E0110 11:33:47.622820     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Addon: the server is currently unable to handle the request (get addons.meta.k8s.io) - error from a previous attempt: http2: server sent GOAWAY and closed the connection; LastStreamID=2425085, ErrCode=NO_ERROR, debug=""
Jan 10 11:34:02 k8s-build001-master001.dc1 rke2[980]: E0110 11:34:02.604790     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": stream error: stream ID 9; INTERNAL_ERROR; received from peer - error from a previous attempt: http2: server sent GOAWAY and closed the connection; LastStreamID=2425085, ErrCode=NO_ERROR, debug=""
Jan 10 11:34:19 k8s-build001-master001.dc1 rke2[980]: W0110 11:34:19.156878     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:34:19 k8s-build001-master001.dc1 rke2[980]: E0110 11:34:19.157793     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:34:38 k8s-build001-master001.dc1 rke2[980]: W0110 11:34:38.567785     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Addon: the request has been made before all known HTTP paths have been installed, please try again
Jan 10 11:34:38 k8s-build001-master001.dc1 rke2[980]: I0110 11:34:38.629386     980 trace.go:205] Trace[1320632717]: "Reflector ListAndWatch" name:k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169 (10-Jan-2023 11:33:48.495) (total time: 50072ms):
Jan 10 11:34:38 k8s-build001-master001.dc1 rke2[980]: Trace[1320632717]: ---"Objects listed" error:the request has been made before all known HTTP paths have been installed, please try again 50071ms (11:34:38.567)
Part 2:
Jan 10 11:34:38 k8s-build001-master001.dc1 rke2[980]: Trace[1320632717]: [50.072154123s] [50.072154123s] END
Jan 10 11:34:38 k8s-build001-master001.dc1 rke2[980]: E0110 11:34:38.629496     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Addon: failed to list *v1.Addon: the request has been made before all known HTTP paths have been installed, please try again
Jan 10 11:35:04 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:04.777115     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused - error from a previous attempt: unexpected EOF
Jan 10 11:35:08 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:08.440801     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:11 k8s-build001-master001.dc1 rke2[980]: W0110 11:35:11.274515     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: Get "https://127.0.0.1:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&resourceVersion=17725417": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:11 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:11.275017     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get "https://127.0.0.1:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&resourceVersion=17725417": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:12 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:12.266393     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:15 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:15.405940     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:17 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:17.619340     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:21 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:21.791037     980 leaderelection.go:330] error retrieving resource lock kube-system/rke2: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/rke2": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:35 k8s-build001-master001.dc1 rke2[980]: W0110 11:35:35.331116     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Addon: Get "https://127.0.0.1:6443/apis/k3s.cattle.io/v1/addons?resourceVersion=17723578": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:35 k8s-build001-master001.dc1 rke2[980]: I0110 11:35:35.331634     980 trace.go:205] Trace[1982004136]: "Reflector ListAndWatch" name:k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169 (10-Jan-2023 11:34:40.316) (total time: 55015ms):
Jan 10 11:35:35 k8s-build001-master001.dc1 rke2[980]: Trace[1982004136]: ---"Objects listed" error:Get "https://127.0.0.1:6443/apis/k3s.cattle.io/v1/addons?resourceVersion=17723578": dial tcp 127.0.0.1:6443: connect: connection refused 55014ms (11:35:35.331)
Jan 10 11:35:35 k8s-build001-master001.dc1 rke2[980]: Trace[1982004136]: [55.015347514s] [55.015347514s] END
Jan 10 11:35:35 k8s-build001-master001.dc1 rke2[980]: E0110 11:35:35.331653     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Addon: failed to list *v1.Addon: Get "https://127.0.0.1:6443/apis/k3s.cattle.io/v1/addons?resourceVersion=17723578": dial tcp 127.0.0.1:6443: connect: connection refused
Jan 10 11:35:55 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:35:55Z" level=info msg="Handling backend connection request [k8s-build001-master003.dc1]"
Jan 10 11:36:06 k8s-build001-master001.dc1 rke2[980]: W0110 11:36:06.250079     980 reflector.go:424] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:36:06 k8s-build001-master001.dc1 rke2[980]: E0110 11:36:06.251120     980 reflector.go:140] k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "kubernetes" is forbidden: User "system:rke2-controller" cannot list resource "endpoints" in API group "" in the namespace "default"
Jan 10 11:36:11 k8s-build001-master001.dc1 rke2[980]: I0110 11:36:11.454734     980 trace.go:205] Trace[690507413]: "Reflector ListAndWatch" name:k8s.io/client-go@v1.25.3-k3s1/tools/cache/reflector.go:169 (10-Jan-2023 11:35:41.415) (total time: 30038ms):
Jan 10 11:36:11 k8s-build001-master001.dc1 rke2[980]: Trace[690507413]: ---"Objects listed" error:<nil> 30038ms (11:36:11.454)
Jan 10 11:36:11 k8s-build001-master001.dc1 rke2[980]: Trace[690507413]: [30.038984997s] [30.038984997s] END
Jan 10 11:36:11 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:36:11Z" level=info msg="Updating TLS secret for kube-system/rke2-serving (count: 14): map[listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-10.81.23.10:10.81.23.10 listener.cattle.io/cn-10.81.23.11:10.81.23.11 listener.cattle.io/cn-10.81.23.12:10.81.23.12 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-__1-f16284:::1 listener.cattle.io/cn-k8s-build001-master001.dc1:k8s-build001-master001.dc1 listener.cattle.io/cn-k8s-build001-master002.dc1:k8s-build001-master002.dc1 listener.cattle.
Jan 10 11:37:20 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:37:20Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:37:20 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:37:20Z" level=error msg="Remotedialer proxy error" error="websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 10 11:37:21 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:37:21Z" level=info msg="Stopped tunnel to 10.81.23.11:9345"
Jan 10 11:37:43 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:37:43Z" level=info msg="Connecting to proxy" url="wss://10.81.23.11:9345/v1-rke2/connect"
Jan 10 11:37:44 k8s-build001-master001.dc1 rke2[980]: time="2023-01-10T11:37:44Z" level=info msg="Handling backend connection request [k8s-build001-master002.sto1.dc1]"
Interestingly, Cilium had been running for days before this rolling restart, but I found this in the containerd.log file:
time="2023-01-10T11:30:35.880312036Z" level=error msg="failed to reload cni configuration after receiving fs change event(\"/etc/cni/net.d/05-cilium.conf\": REMOVE)" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Config file:
---
tls-san: ['k8s-build001-master001.redacted']
etcd-snapshot-schedule-cron: "0 */12 * * *"
etcd-snapshot-retention: 5
disable: ['rke2-ingress-nginx']
node-name: k8s-build001-master001.redacted
cni: cilium
disable-kube-proxy: true
protect-kernel-defaults: true
selinux: true
resolv-conf: /etc/resolv.conf
etcd-expose-metrics: true
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"
  - "allocate-node-cidrs=true"
kube-scheduler-arg:
  - "bind-address=0.0.0.0"

creamy-pencil-82913

01/11/2023, 4:37 PM
Is re-deploying the CNI pods on a running cluster NOT expected to be disruptive?
I would probably defer to the Cilium team on what the expected behavior is here.

bitter-beard-2626

04/06/2023, 7:15 AM
Sorry for reviving this thread, but I noticed this issue https://github.com/rancher/rke2/issues/4053 and it seems to be related to what we are experiencing. Please let me know if you need any information from me or how we can help troubleshoot this!