magnificent-glass-73162
08/19/2022, 9:52 AM
$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-685d6d555d-q9pn4 0/1 CrashLoopBackOff 4 2m40s
If I look at the pod description I see the events as follows:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 62m (x2 over 62m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 62m kubelet Container image "rancher/mirrored-cluster-proportional-autoscaler:1.8.3" already present on machine
Normal Created 62m kubelet Created container autoscaler
Normal Started 62m kubelet Started container autoscaler
Normal SandboxChanged 52m (x2 over 52m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 52m kubelet Container image "rancher/mirrored-cluster-proportional-autoscaler:1.8.3" already present on machine
Normal Created 52m kubelet Created container autoscaler
Normal Started 52m kubelet Started container autoscaler
Warning Unhealthy 51m kubelet Readiness probe failed: Get "http://10.42.0.17:8080/healthz": dial tcp 10.42.0.17:8080: i/o timeout (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 51m (x3 over 52m) kubelet Readiness probe failed: Get "http://10.42.0.17:8080/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Normal SandboxChanged 37m (x2 over 37m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 37m kubelet Container image "rancher/mirrored-cluster-proportional-autoscaler:1.8.3" already present on machine
Normal Created 37m kubelet Created container autoscaler
Normal Started 37m kubelet Started container autoscaler
Warning Unhealthy 36m (x2 over 37m) kubelet Readiness probe failed: Get "http://10.42.0.93:8080/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Normal SandboxChanged 22m (x2 over 22m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 22m kubelet Container image "rancher/mirrored-cluster-proportional-autoscaler:1.8.3" already present on machine
Normal Created 22m kubelet Created container autoscaler
Normal Started 22m kubelet Started container autoscaler
Warning Unhealthy 21m (x4 over 22m) kubelet Readiness probe failed: Get "http://10.42.0.161:8080/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
It seems to be failing the health check, but I’m not sure why.
$ docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:03:54 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:49 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.18.0
GitCommit: fec3683
"Could not resolve host: git.rancher.io" in the rancher/server logs
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup www.google.com
If you don't see a command prompt, try pressing enter.
nslookup: can't resolve 'www.google.com'
pod "busybox" deleted
pod default/busybox terminated (Error)
nslookup from a busybox container in docker works, but as a rancher pod not so much
$ docker run -it busybox:1.28 nslookup www.google.com
Server: 192.168.17.178
Address 1: 192.168.17.178 legolas.inhouse-broker.org
Name: www.google.com
Address 1: 2607:f8b0:4009:807::2004 ord38s19-in-x04.1e100.net
Address 2: 172.217.5.4 lga15s49-in-f4.1e100.net
with kubectl, no good
$ kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup www.google.com
If you don't see a command prompt, try pressing enter.
nslookup: can't resolve 'www.google.com'
pod "busybox" deleted
pod default/busybox terminated (Error)
hundreds-evening-84071
08/19/2022, 8:11 PM
magnificent-glass-73162
08/19/2022, 8:44 PM
$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-685d6d555d-l4v4d 0/1 CrashLoopBackOff 205 10h
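For longer listings, a small filter can pick out the pods that are not fully ready, like the coredns one above. This is only a rough sketch: `not_ready` is a hypothetical helper name (not part of kubectl), and it assumes the default table output with the columns NAME READY STATUS RESTARTS AGE.

```shell
#!/bin/sh
# not_ready: hypothetical helper, reads `kubectl get pods` table output on stdin
# and prints the pods whose READY column (e.g. 0/1) shows fewer ready
# containers than the pod defines.
not_ready() {
  awk 'NR > 1 {                       # skip the header row
    split($2, r, "/")                 # READY column: ready/total
    if (r[1] != r[2]) print $1, $3, "restarts=" $4
  }'
}
```

It would be used as `kubectl -n kube-system get pods -l k8s-app=kube-dns | not_ready`, which prints the name, status, and restart count of each pod that is not fully ready.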
$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
kubectl -n kube-system describe pod coredns-685d6d555d-l4v4d
I see there is a potential cgroup problem
Warning Failed 74m (x4 over 75m) kubelet Error: failed to start container "coredns": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:326: applying cgroup configuration for process caused: failed to write 1 to memory.kmem.limit_in_bytes: write /sys/fs/cgroup/memory/kubepods/burstable/podc57e2d18-9421-4c84-8d9b-f97dcd3ee3f2/coredns/memory.kmem.limit_in_bytes: operation not supported: unknown
but that seems like it is just a warning and not fatal