quiet-potato-9276
05/12/2023, 10:47 AM
I have issues with coredns not running, which I think is caused by calico-kube-controllers being in CrashLoopBackOff.
great-jewelry-76121
05/12/2023, 10:59 AM
It won't be - calico-kube-controllers doesn't do anything which affects the dataplane for other pods. It's essentially a garbage collector and label updater.
Can you do a quick check for me - can pods talk to each other? On the same node? On different nodes?
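A quick way to run that check, assuming two throwaway busybox pods (the names, image, and pod IP below are only illustrative), is to ping one pod's IP from the other, then repeat with the pods scheduled on the same node and on different nodes:
kubectl run busybox1 --image=busybox --restart=Never -- sleep 3600
kubectl run busybox2 --image=busybox --restart=Never -- sleep 3600
kubectl get pods -o wide                        # note each pod's IP and which node it landed on
kubectl exec busybox1 -- ping -c 3 10.42.3.7    # replace with busybox2's pod IP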
quiet-potato-9276
05/12/2023, 11:05 AM
I can ping between busybox1 and busybox2 on different nodes:
[al@rkemaster01 ~]$ kubectl exec -ti busybox2 -- /bin/sh
E0512 12:03:20.136051 25714 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0512 12:03:20.138957 25714 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0512 12:03:20.143424 25714 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
/ #
/ # ping 10.42.3.7
PING 10.42.3.7 (10.42.3.7): 56 data bytes
64 bytes from 10.42.3.7: seq=0 ttl=62 time=1.524 ms
64 bytes from 10.42.3.7: seq=1 ttl=62 time=0.766 ms
^C
--- 10.42.3.7 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.766/1.145/1.524 ms
[al@rkemaster01 ~]$ kubectl get pods -A
E0512 12:23:31.447388 4839 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0512 12:23:31.473213 4839 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0512 12:23:31.477644 4839 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0512 12:23:31.481753 4839 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox1 1/1 Running 2 (119s ago) 22m
default busybox2 1/1 Running 1 (2m24s ago) 22m
default nginx 1/1 Running 3 (104s ago) 72m
ingress-nginx ingress-nginx-admission-create-p8z4t 0/1 Completed 0 73m
ingress-nginx nginx-ingress-controller-kqczp 0/1 Running 30 (1s ago) 73m
ingress-nginx nginx-ingress-controller-mdlxg 0/1 Running 30 (23s ago) 73m
ingress-nginx nginx-ingress-controller-ms44h 0/1 Running 31 (2m36s ago) 73m
kube-system calico-kube-controllers-85d56898c-swvqw 0/1 Running 30 (20s ago) 74m
kube-system canal-h5lcp 2/2 Running 6 (2m9s ago) 74m
kube-system canal-rwz8j 2/2 Running 6 (104s ago) 74m
kube-system canal-sdfmv 2/2 Running 6 (2m34s ago) 74m
kube-system canal-trxkx 2/2 Running 6 (3m3s ago) 74m
kube-system coredns-autoscaler-74d474f45c-knhk7 1/1 Running 3 (2m34s ago) 74m
kube-system coredns-dfb7f8fd4-7ncjq 0/1 Running 3 (99s ago) 74m
kube-system metrics-server-c47f7c9bb-g5jxw 0/1 CrashLoopBackOff 30 (52s ago) 74m
kube-system rke-coredns-addon-deploy-job-2vqwq 0/1 Completed 0 74m
kube-system rke-ingress-controller-deploy-job-6vk5x 0/1 Completed 0 74m
kube-system rke-metrics-addon-deploy-job-rv5rp 0/1 Completed 0 74m
kube-system rke-network-plugin-deploy-job-7d6zw 0/1 Completed 0 74m
great-jewelry-76121
05/12/2023, 11:33 AM
Cool, that suggests that the canal parts are working at least. Is kube-proxy up and happy? That's what takes care of converting service IPs (like this one) into "real" IPs.
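A rough way to verify that, assuming kube-proxy is running in its default iptables mode, is to check that a service has endpoints and that its ClusterIP (for example the apiserver's, 10.43.0.1 on this cluster) shows up in kube-proxy's NAT rules on a node:
kubectl get svc kubernetes              # the ClusterIP, 10.43.0.1 here
kubectl get endpoints kubernetes        # should list the apiserver address(es)
sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.43.0.1   # kube-proxy's rule for that ClusterIP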
quiet-potato-9276
05/12/2023, 11:34 AM
great-jewelry-76121
05/12/2023, 11:34 AM
quiet-potato-9276
05/12/2023, 11:35 AM
root 1852 1717 0 12:21 ? 00:00:01 kube-proxy --cluster-cidr=10.42.0.0/16 --hostname-override=192.168.0.170 --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-proxy.yaml --healthz-bind-address=127.0.0.1 --v=2
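Given the --healthz-bind-address=127.0.0.1 flag above, one way to confirm kube-proxy is healthy, assuming the default healthz port of 10256, is from the node itself; on RKE it normally runs as a Docker container named kube-proxy, so its logs are available there too:
curl http://127.0.0.1:10256/healthz     # should return HTTP 200 with a small status body
docker logs --tail 50 kube-proxy        # look for iptables sync errors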
great-jewelry-76121
05/12/2023, 11:36 AM
quiet-potato-9276
05/12/2023, 11:43 AM
{"log":"E0512 11:52:16.996066 1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request\n","stream":"stderr","time":"2023-05-12T11:52:16.996473805Z"}
{"log":"W0512 11:52:18.369195 1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]\n","stream":"stderr","time":"2023-05-12T11:52:18.370125543Z"}
$ kubectl api-resources
error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
and:
$ kubectl get apiservice
v1beta1.metrics.k8s.io kube-system/metrics-server False (MissingEndpoints) 114m
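MissingEndpoints just means the metrics-server Service has no ready pod behind it, which matches the CrashLoopBackOff in the earlier pod listing; a few commands that may narrow down why:
kubectl -n kube-system get endpoints metrics-server            # empty while the pod is not Ready
kubectl describe apiservice v1beta1.metrics.k8s.io             # shows the failing availability condition
kubectl -n kube-system logs deploy/metrics-server --previous   # logs from the last crashed container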
The coredns pod logs show:
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
[al@rkemaster01 ~]$ curl -k https://10.43.0.1:443/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
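That 401 is actually encouraging from the host's point of view: 10.43.0.1 is reachable there and it is the apiserver answering (curl simply has no credentials). The i/o timeout only appears from inside pods, as in the CoreDNS warning above, which points at something on the node path, such as host firewall rules, rather than at the apiserver itself. A rough way to reproduce the in-pod view, assuming the curlimages/curl image can be pulled:
kubectl run curltest --rm -it --restart=Never --image=curlimages/curl --command -- curl -k -m 5 https://10.43.0.1:443/version   # times out if the node path is blocked, otherwise returns version info or an auth error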
I set the default iptables policies to ACCEPT and flushed the rules:
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -F
And killed the existing coredns pod. Now everything is working:
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox1 1/1 Running 2 (72m ago) 92m
default busybox2 1/1 Running 1 (72m ago) 92m
default nginx 1/1 Running 3 (72m ago) 142m
ingress-nginx ingress-nginx-admission-create-p8z4t 0/1 Completed 0 144m
ingress-nginx nginx-ingress-controller-ck5bh 1/1 Running 0 27s
ingress-nginx nginx-ingress-controller-ct84l 1/1 Running 0 50s
ingress-nginx nginx-ingress-controller-kqczp 1/1 Running 50 (7m20s ago) 144m
kube-system calico-kube-controllers-85d56898c-swvqw 1/1 Running 52 (5m46s ago) 144m
kube-system canal-h5lcp 2/2 Running 6 (72m ago) 144m
kube-system canal-rwz8j 2/2 Running 6 (72m ago) 144m
kube-system canal-sdfmv 2/2 Running 6 (72m ago) 144m
kube-system canal-trxkx 2/2 Running 6 (73m ago) 144m
kube-system coredns-autoscaler-74d474f45c-knhk7 1/1 Running 3 (72m ago) 144m
kube-system coredns-dfb7f8fd4-9gz8j 1/1 Running 0 3m19s
kube-system coredns-dfb7f8fd4-dpdcp 1/1 Running 0 9m15s
kube-system metrics-server-c47f7c9bb-kjt8f 1/1 Running 0 2m35s
kube-system rke-coredns-addon-deploy-job-2vqwq 0/1 Completed 0 144m
kube-system rke-ingress-controller-deploy-job-6vk5x 0/1 Completed 0 144m
kube-system rke-metrics-addon-deploy-job-rv5rp 0/1 Completed 0 144m
kube-system rke-network-plugin-deploy-job-7d6zw 0/1 Completed 0 144m
Not sure what is causing this issue or whether it will return after a reboot. Never really understood iptables.
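Before flushing, it can help to capture what was actually in the filter table, since that is usually where a blocking rule (e.g. a distro firewall's REJECT/DROP entries in INPUT or FORWARD) would live:
sudo iptables -S INPUT
sudo iptables -S FORWARD
sudo iptables-save | grep -E 'REJECT|DROP'    # rules outside the KUBE-* / cali-* chains are the suspects
Note that iptables -F with no -t flag only flushes the filter table; kube-proxy and canal should re-create their own chains on their next sync, but whatever originally wrote the blocking rules will likely put them back after a reboot.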
great-jewelry-76121
05/12/2023, 12:50 PM
quiet-potato-9276
05/12/2023, 1:03 PM
These are fresh VMs.
great-jewelry-76121
05/18/2023, 9:14 AM
Sure, but some OSes enable firewalls by default which write to iptables - ufw, firewalld, etc.
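If one of those is the culprit, checking and, where acceptable, disabling it on each node looks roughly like this (the RKE docs generally recommend either disabling firewalld or explicitly opening the required ports):
sudo systemctl status firewalld          # RHEL/CentOS family
sudo systemctl disable --now firewalld
sudo ufw status                          # Ubuntu/Debian family
sudo ufw disable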