
chilly-telephone-51989

12/09/2022, 10:36 AM
I'm using k3s on Amazon. I restarted the master node and it started giving 504 Gateway Timeout. Here is the log:
k -n kube-system logs traefik-7cd4fcff68-49cn2 -f

time="2022-12-09T09:42:28Z" level=info msg="Configuration loaded from flags."
time="2022-12-09T09:42:59Z" level=error msg="Error watching kubernetes events: could not retrieve server version: Get \"https://10.43.0.1:443/version?timeout=32s\": dial tcp 10.43.0.1:443: i/o timeout" providerName=kubernetes
I1209 09:42:59.042242       1 trace.go:205] Trace[1457098512]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167 (09-Dec-2022 09:42:29.039) (total time: 30002ms):
Trace[1457098512]: [30.002607632s] [30.002607632s] END
E1209 09:42:59.042279       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1alpha1.IngressRoute: failed to list *v1alpha1.IngressRoute: Get "https://10.43.0.1:443/apis/traefik.containo.us/v1alpha1/ingressroutes?limit=500&resourceVersion=0": dial tcp 10.43.0.1:443: i/o timeout
I1209 09:42:59.042377       1 trace.go:205] Trace[1530918296]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167 (09-Dec-2022 09:42:29.041) (total time: 30000ms):
Trace[1530918296]: [30.00042368s] [30.00042368s] END
E1209 09:42:59.042389       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1alpha1.MiddlewareTCP: failed to list *v1alpha1.MiddlewareTCP: Get "https://10.43.0.1:443/apis/traefik.containo.us/v1alpha1/middlewaretcps?limit=500&resourceVersion=0": dial tcp 10.43.0.1:443: i/o timeout
I1209 09:42:59.042468       1 trace.go:205] Trace[649799410]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167 (09-Dec-2022 09:42:29.042) (total time: 30000ms):
Trace[649799410]: [30.000317588s] [30.000317588s] END
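The repeated `dial tcp 10.43.0.1:443: i/o timeout` lines mean the Traefik pod cannot reach the in-cluster `kubernetes` Service, which points at broken pod-to-service networking rather than a Traefik bug. As a sketch of why that particular IP appears (assuming k3s defaults, not read from this cluster):

```shell
# k3s defaults (assumptions from k3s docs, not confirmed on this cluster):
#   cluster-cidr  10.42.0.0/16  -> pod IPs (matches the 10.42.x.x pods below)
#   service-cidr  10.43.0.0/16  -> Service ClusterIPs
# The API server's in-cluster Service "kubernetes" takes the first usable
# IP of the service CIDR, which is why pods dial 10.43.0.1:443.
service_cidr_base="10.43.0.0"
api_service_ip="${service_cidr_base%.0}.1"
echo "$api_service_ip"   # prints 10.43.0.1
```

So every pod on every node depends on that one virtual IP being routable; if the overlay network between nodes is down, all of these watches time out at once.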
[3:00 PM] Please note that requests are not being passed to the gateway or any other pod. Here are my ingress files:
ingress.middleware.yaml

apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: strip-path
  namespace: xplorie
spec:
  stripPrefix:
    prefixes:
      - /api
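That Middleware removes the matched prefix before the request is forwarded to the backend. A minimal sketch of the rewrite it performs (illustration only; Traefik implements this internally):

```shell
# What stripPrefix does to a request path, as a plain shell function.
strip_prefix() {
  path="$1"; prefix="$2"
  # Drop the prefix when the path starts with it; otherwise pass through.
  case "$path" in
    "$prefix"*) printf '%s\n' "${path#"$prefix"}" ;;
    *)          printf '%s\n' "$path" ;;
  esac
}
strip_prefix /api/bookings /api   # prints /bookings
strip_prefix /health /api         # prints /health (unchanged)
```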
[3:00 PM] ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  namespace: xplorie
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: xplorie-strip-path@kubernetescrd
spec:
  rules:
    - http:
        paths:
          - path: "/api"
            pathType: Prefix
            backend:
              service:
                name: gateway
                port:
                  number: 80
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: portal
                port:
                  number: 80
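For reference, the annotation value follows Traefik's `<namespace>-<name>@kubernetescrd` convention for Middlewares defined via the Kubernetes CRD provider. A sketch of how the reference is composed:

```shell
# Compose a Traefik router.middlewares reference for a Middleware CRD:
# the Middleware's namespace and name, joined with "-", plus the
# "@kubernetescrd" provider suffix.
middleware_ref() { printf '%s-%s@kubernetescrd\n' "$1" "$2"; }
middleware_ref xplorie strip-path   # prints xplorie-strip-path@kubernetescrd
```

So the annotation in the Ingress above does match the Middleware (namespace `xplorie`, name `strip-path`), which suggests the 504 is not a middleware-reference problem.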
So far I think the DNS service isn't responding to requests. When I run busybox and try a lookup:
nslookup gateway
I only get the output below after deleting the coredns pod and letting it respawn; before that I got nothing but timeout errors.
nslookup gateway
Server:		10.43.0.10
Address:	10.43.0.10:53

Name:	gateway.xplorie.svc.cluster.local
Address: 10.43.156.140

** server can't find gateway.svc.cluster.local: NXDOMAIN

** server can't find gateway.svc.cluster.local: NXDOMAIN


** server can't find gateway.cluster.local: NXDOMAIN

** server can't find gateway.cluster.local: NXDOMAIN

** server can't find gateway.us-east-2.compute.internal: NXDOMAIN

** server can't find gateway.us-east-2.compute.internal: NXDOMAIN
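Those NXDOMAIN lines are expected, not a second failure: a pod's resolv.conf carries a search list (roughly `<ns>.svc.cluster.local svc.cluster.local cluster.local` plus the cloud domain), and a short name like `gateway` is tried against each entry in turn. A sketch of that expansion, with the search list assumed from the output above:

```shell
# Expand a short name against a DNS search list, the way the resolver does.
# The search domains here are inferred from the nslookup output, not read
# from an actual resolv.conf.
expand_search() {
  name="$1"; shift
  for domain in "$@"; do
    printf '%s.%s\n' "$name" "$domain"
  done
}
expand_search gateway \
  xplorie.svc.cluster.local svc.cluster.local cluster.local us-east-2.compute.internal
```

Only the first candidate (`gateway.xplorie.svc.cluster.local`) exists, so the lookup as a whole succeeded; the NXDOMAINs are just the other candidates.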
I had seen this error before, and back then the reason was that Traefik wasn't able to find the gateway.
ks get pods -o wide
NAME                                      READY   STATUS      RESTARTS      AGE    IP            NODE              NOMINATED NODE   READINESS GATES
helm-install-traefik-crd-jh2nn            0/1     Completed   0             87d    10.42.0.3     ip-172-31-46-55   <none>           <none>
helm-install-traefik-rzcq4                0/1     Completed   2             87d    10.42.0.2     ip-172-31-46-55   <none>           <none>
svclb-traefik-12c86e56-9728x              2/2     Running     2 (87d ago)   87d    10.42.1.3     ip-172-31-34-0    <none>           <none>
svclb-traefik-12c86e56-k8w7g              2/2     Running     8 (86d ago)   87d    10.42.2.19    ip-172-31-41-97   <none>           <none>
svclb-traefik-12c86e56-kz9fc              2/2     Running     4 (44h ago)   87d    10.42.0.15    ip-172-31-46-55   <none>           <none>
local-path-provisioner-7b7dc8d6f5-w7f68   1/1     Running     3 (44h ago)   87d    10.42.0.16    ip-172-31-46-55   <none>           <none>
metrics-server-668d979685-9gxfk           1/1     Running     3 (44h ago)   87d    10.42.0.19    ip-172-31-46-55   <none>           <none>
coredns-b96499967-c6827                   1/1     Running     0             30m    10.42.1.124   ip-172-31-34-0    <none>           <none>
traefik-7cd4fcff68-f8b7z                  1/1     Running     0             5m7s   10.42.0.20    ip-172-31-46-55   <none>           <none>
traefik-7cd4fcff68-pftq7                  1/1     Running     0             5m7s   10.42.2.107   ip-172-31-41-97   <none>           <none>
traefik-7cd4fcff68-l6t45                  1/1     Running     0             5m7s   10.42.1.125   ip-172-31-34-0    <none>           <none>
Please note that this cluster is k3s, 1 master and 2 agents; all pods are running on the two agent nodes. This is on an Amazon server. The Docker registry is running on master 1. The problem started when the master was rebooted. The OS is Ubuntu 20.10 LTS. All we get is 504 Gateway Timeout, whether you access over the internet or via the private IP of master 1.
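Given that DNS only came back after recreating the coredns pod, one plausible cause (an assumption, not confirmed by the logs) is that the flannel VXLAN overlay between the rebooted master and the agents broke, so pods lost reachability to Service IPs. A sketch of checks to run, using workload names from the output above:

```shell
# 1. In the AWS security group, confirm UDP 8472 (flannel VXLAN) is open
#    between all three nodes, and TCP 6443 from agents to the master.

# 2. Restart the workloads that watch the API server, so they reconnect
#    cleanly after the master is back:
kubectl -n kube-system rollout restart deployment/coredns
kubectl -n kube-system rollout restart deployment/traefik

# 3. Re-test service DNS from a throwaway pod:
kubectl run dnstest --rm -it --image=busybox --restart=Never -- \
  nslookup gateway.xplorie.svc.cluster.local
```

This is only a diagnosis sketch; if the ClusterIP 10.43.0.1 is reachable from pods on the master's node but not from the agents, that would confirm the inter-node overlay as the culprit.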

worried-judge-39444

02/11/2023, 11:55 AM