# general
a
Yeah, depends how you set it up: with a LoadBalancer or a HostPort? The official Traefik chart uses a LoadBalancer Service by default, but ServiceLB should do the trick to combine both
n
Thank you for the response! So, none of the documentation I found for installing Rancher said anything about setting Load Balancer or Host Port. It just said that after installation using Helm, I should be able to hit the DNS name for my load balancer. https://ranchermanager.docs.rancher.com/pages-for-subheaders/install-upgrade-on-a-kubernetes-cluster
a
RKE1?
n
K3s
a
k3s doesn't come with servicelb?
n
Not 100% sure how to answer that, but I have "svclb" pods.

kubectl get pods --all-namespaces
NAMESPACE                         NAME                                       READY   STATUS      RESTARTS   AGE
cattle-fleet-system               fleet-controller-56968b86b6-q4hff          1/1     Running     0          2d18h
cattle-fleet-system               gitjob-7d68454468-vj7v7                    1/1     Running     0          2d18h
cattle-provisioning-capi-system   capi-controller-manager-6f87d6bd74-wl5h9   1/1     Running     0          2d18h
cattle-system                     rancher-64cf6ddd96-4q8hr                   1/1     Running     0          2d18h
cattle-system                     rancher-64cf6ddd96-9lhpr                   1/1     Running     0          2d18h
cattle-system                     rancher-64cf6ddd96-p5rgk                   1/1     Running     0          2d18h
cattle-system                     rancher-webhook-58d68fb97d-cdg2p           1/1     Running     0          2d18h
cert-manager                      cert-manager-bf45d98f8-mqs7t               1/1     Running     0          2d18h
cert-manager                      cert-manager-cainjector-66d4789659-z7v7b   1/1     Running     0          2d18h
cert-manager                      cert-manager-webhook-7bfb8977bc-vplpl      1/1     Running     0          2d18h
kube-system                       coredns-59b4f5bbd5-668tb                   1/1     Running     0          2d18h
kube-system                       helm-install-traefik-crd-85xcg             0/1     Completed   0          2d18h
kube-system                       helm-install-traefik-gg8gh                 0/1     Completed   1          2d18h
kube-system                       local-path-provisioner-76d776f6f9-9mb78    1/1     Running     0          2d18h
kube-system                       metrics-server-68cf49699b-q9nwg            1/1     Running     0          2d18h
kube-system                       svclb-traefik-88919fa5-cg4d2               2/2     Running     0          2d18h
kube-system                       svclb-traefik-88919fa5-k9wrp               2/2     Running     0          2d18h
kube-system                       svclb-traefik-88919fa5-sshxs               2/2     Running     0          2d18h
kube-system                       traefik-dd49d7bb6-zzhnt                    1/1     Running     0          2d18h
a
there is something
now list services
n
kubectl get services --all-namespaces
NAMESPACE                         NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP                                 PORT(S)                      AGE
cattle-fleet-system               gitjob                 ClusterIP      10.43.189.156   <none>                                      80/TCP                       2d18h
cattle-provisioning-capi-system   capi-webhook-service   ClusterIP      10.43.198.81    <none>                                      443/TCP                      2d18h
cattle-system                     rancher                ClusterIP      10.43.202.96    <none>                                      80/TCP,443/TCP               2d18h
cattle-system                     rancher-webhook        ClusterIP      10.43.55.247    <none>                                      443/TCP                      2d18h
cert-manager                      cert-manager           ClusterIP      10.43.72.65     <none>                                      9402/TCP                     2d18h
cert-manager                      cert-manager-webhook   ClusterIP      10.43.205.106   <none>                                      443/TCP                      2d18h
default                           kubernetes             ClusterIP      10.43.0.1       <none>                                      443/TCP                      2d18h
kube-system                       kube-dns               ClusterIP      10.43.0.10      <none>                                      53/UDP,53/TCP,9153/TCP       2d18h
kube-system                       metrics-server         ClusterIP      10.43.251.245   <none>                                      443/TCP                      2d18h
kube-system                       traefik                LoadBalancer   10.43.223.220   172.16.160.41,172.16.160.42,172.16.160.43   80:32765/TCP,443:32253/TCP   2d18h
a
what do you want to use ?
rancher uses nginx class...
n
Are you saying that Rancher doesn't work with Traefik?
a
nope
maybe it can though
n
So ideally, I would install K3s without Traefik and install Nginx ingress instead?
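If you did want to take that route, a minimal sketch (assuming the standard k3s install script and the upstream ingress-nginx Helm chart; as the rest of the thread shows, it turns out not to be necessary here):

```shell
# Install k3s with the bundled Traefik disabled
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik" sh -

# Then add ingress-nginx from its official chart
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace
```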
a
you can check with describe rancher ingress
oh you can set ingress.ingressClassName = "traefik"
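For reference, that value can be set at install or upgrade time through the Rancher Helm chart; a sketch, assuming the chart repo is named rancher-latest and the release is rancher:

```shell
# Set the ingress class on an existing Rancher release,
# keeping all other chart values as they are
helm upgrade rancher rancher-latest/rancher \
  --namespace cattle-system \
  --reuse-values \
  --set ingress.ingressClassName=traefik
```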
n
kubectl describe service rancher ingress -n cattle-system
Name:               rancher
Namespace:          cattle-system
Labels:             app=rancher
                    app.kubernetes.io/managed-by=Helm
                    chart=rancher-2.7.9
                    heritage=Helm
                    release=rancher
Annotations:        meta.helm.sh/release-name: rancher
                    meta.helm.sh/release-namespace: cattle-system
Selector:           app=rancher
Type:               ClusterIP
IP Family Policy:   SingleStack
IP Families:        IPv4
IP:                 10.43.202.96
IPs:                10.43.202.96
Port:               http  80/TCP
TargetPort:         80/TCP
Endpoints:          10.42.0.9:80,10.42.1.9:80,10.42.2.9:80
Port:               https-internal  443/TCP
TargetPort:         444/TCP
Endpoints:          10.42.0.9:444,10.42.1.9:444,10.42.2.9:444
Session Affinity:   None
Events:             <none>
Error from server (NotFound): services "ingress" not found
a
noooo
list ingresses first 😄
n
kubectl describe ingress rancher -n cattle-system
Name:             rancher
Labels:           app=rancher
                  app.kubernetes.io/managed-by=Helm
                  chart=rancher-2.7.9
                  heritage=Helm
                  release=rancher
Namespace:        cattle-system
Address:          172.16.160.41,172.16.160.42,172.16.160.43
Ingress Class:    traefik
Default backend:  <default>
TLS:
  tls-rancher-ingress terminates rancher.lml.farm
Rules:
  Host              Path  Backends
  ----              ----  --------
  rancher.lml.farm
                    /     rancher:80 (10.42.0.9:80,10.42.1.9:80,10.42.2.9:80)
Annotations:        cert-manager.io/issuer: rancher
                    cert-manager.io/issuer-kind: Issuer
                    field.cattle.io/publicEndpoints:
                      [{"addresses":["172.16.160.41","172.16.160.42","172.16.160.43"],"port":443,"protocol":"HTTPS","serviceName":"cattle-system:rancher","ingre...
                    meta.helm.sh/release-name: rancher
                    meta.helm.sh/release-namespace: cattle-system
                    nginx.ingress.kubernetes.io/proxy-connect-timeout: 30
                    nginx.ingress.kubernetes.io/proxy-read-timeout: 1800
                    nginx.ingress.kubernetes.io/proxy-send-timeout: 1800
Events:             <none>
Looks like Ingress Class is already Traefik.
a
cool
so it has 3 IPs
rancher.lml.farm should point to one of the external IPs listed on the ingress
n
But then those IPs aren't listed on the service, and everything is 404.
a
browser has to open the hostname
exactly as it is written in ingress
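One way to check this without touching DNS or the hosts file is curl's --resolve flag, which pins the ingress hostname to a chosen node IP (IPs taken from the ingress output above):

```shell
# Connect to a node IP but send Host: rancher.lml.farm, as the ingress expects;
# -k skips verification of the cluster-issued certificate
curl -vk --resolve rancher.lml.farm:443:172.16.160.41 https://rancher.lml.farm/
```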
n
So rancher.lml.farm points to my haproxy load balancer.
a
then haproxy should hit one of those IPs
n
But the load balancer shows everything down because the health check is failing since everything is 404.
a
then fwd hostname and put them into hosts file
Host header should be propagated
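A sketch of what that could look like on the HAProxy side (http-check send syntax from HAProxy 2.2+; the backend and server names here are made up, node IPs from the outputs above):

```
backend rancher_https
    mode http
    # Health-check /healthz while sending the Host header the ingress expects
    option httpchk
    http-check send meth GET uri /healthz hdr Host rancher.lml.farm
    server k3s1 172.16.160.41:443 check ssl verify none
    server k3s2 172.16.160.42:443 check ssl verify none
    server k3s3 172.16.160.43:443 check ssl verify none
```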
n
Gimme a sec...
Ha!
Well, it works when I set the hosts file on my workstation to one of those addresses. I'll just have to figure out how to get the health checks to use that host header.
a
yup 😄
good luck
n
Thank you so much!!!!
Hope it's not a problem to pick up where we left off earlier. So Rancher was available for a while and then I wasn't able to connect to it. Then for a while, I was able to connect, but only over port 32253, not 443. Then I was able to connect to it again on 443 for a while. Then it became unavailable again. Really not sure what's going on. Any ideas?
Port 80 has been available the entire time, with healthz showing "ok"
a
logs, logs, logs
check haproxy logs and healthchecks
n
Health checks are all good. I will check the logs.
HAProxy logs didn't tell me anything. The health checks are good, and the problem persists even if I'm connecting directly to the k3s nodes. Logs for the Rancher pods have a lot of stuff like:

2023/12/28 16:16:20 [ERROR] Failed to serve peer connection 10.42.2.24: websocket: close 1006 (abnormal closure): unexpected EOF
2023/12/28 16:16:20 [INFO] error in remotedialer server [400]: read tcp 10.42.0.17:443->10.42.2.24:49498: use of closed network connection
2023/12/28 16:16:20 [ERROR] Failed to handle tunnel request from remote address 10.42.1.20:44162: response 400: cluster not found
2023/12/28 16:16:25 [INFO] Handling backend connection request [10.42.2.24]
2023/12/28 16:16:25 [ERROR] Failed to handle tunnel request from remote address 10.42.1.20:59500: response 400: cluster not found
2023/12/28 16:16:26 [INFO] Adding peer wss://10.42.1.20/v3/connect, 10.42.1.20
2023/12/28 16:16:30 [INFO] Handling backend connection request [10.42.1.20]
a
I don't know CNI in k3s
seems like issues with tunnel(flannel?)
can you check flannel daemonset logs?
n
...just started working again... Having difficulty finding flannel logs...
a
DaemonSet also consists of pods
so you have to identify the pod
n
These are all my pods, and I don't see anything about flannel:

cattle-fleet-local-system         fleet-agent-545c88bf5f-56qjc               1/1   Running     0   3h38m
cattle-fleet-system               fleet-controller-56968b86b6-q4hff          1/1   Running     0   2d22h
cattle-fleet-system               gitjob-7d68454468-vj7v7                    1/1   Running     0   2d22h
cattle-provisioning-capi-system   capi-controller-manager-6f87d6bd74-wl5h9   1/1   Running     0   2d22h
cattle-system                     rancher-6b569758cb-2gd7m                   1/1   Running     0   37m
cattle-system                     rancher-6b569758cb-t7v6s                   1/1   Running     0   37m
cattle-system                     rancher-6b569758cb-tqkhr                   1/1   Running     0   36m
cattle-system                     rancher-webhook-78dd455966-js4k6           1/1   Running     0   37m
cert-manager                      cert-manager-bf45d98f8-mqs7t               1/1   Running     0   2d22h
cert-manager                      cert-manager-cainjector-66d4789659-z7v7b   1/1   Running     0   2d22h
cert-manager                      cert-manager-webhook-7bfb8977bc-vplpl      1/1   Running     0   2d22h
kube-system                       coredns-5d7c7bf957-sw94v                   1/1   Running     0   36m
kube-system                       helm-install-traefik-crd-85xcg             0/1   Completed   0   2d22h
kube-system                       helm-install-traefik-gg8gh                 0/1   Completed   1   2d22h
kube-system                       local-path-provisioner-f64fcd499-qmmf8     1/1   Running     0   36m
kube-system                       metrics-server-7f4b5dd498-sdf74            1/1   Running     0   36m
kube-system                       svclb-traefik-88919fa5-lv6lz               2/2   Running     0   36m
kube-system                       svclb-traefik-88919fa5-pnwpk               2/2   Running     0   36m
kube-system                       svclb-traefik-88919fa5-rlrbf               2/2   Running     0   36m
kube-system                       traefik-7588f65c69-nn55b                   1/1   Running     0   36m
Flannel was configured for "host-gw" during some earlier troubleshooting steps.
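Worth noting: in k3s, flannel is embedded in the k3s binary rather than deployed as a DaemonSet, which would explain why no flannel pod shows up. Its messages end up in the k3s service logs; a way to look, assuming a systemd-managed install:

```shell
# flannel runs inside the k3s process, so grep its lines out of the unit log
journalctl -u k3s --no-pager | grep -i flannel
# on agent-only nodes the unit is k3s-agent instead
journalctl -u k3s-agent --no-pager | grep -i flannel
```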
a
are there more than 1 node?
n
That should be all three nodes.
a
Can't see any networking
so it's somehow unknown
n
Oh, so flannel being set to host gateway uses the server's routing table to direct traffic.
a
ip route
n
default via 172.16.160.1 dev ens192 proto static metric 100
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.42.1.0/24 via 172.16.160.42 dev ens192
10.42.2.0/24 via 172.16.160.43 dev ens192
172.16.160.0/24 dev ens192 proto kernel scope link src 172.16.160.41 metric 100
a
looks like it
n
Thinking it might be better / easier to troubleshoot if I switch back to VXLAN mode.
a
you can use tcpdump on cni0
or any pcap
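For example (interface and pod-subnet names taken from the ip route output above; the filters are just starting points):

```shell
# Watch pod traffic crossing the node's CNI bridge
tcpdump -ni cni0 host 10.42.1.20
# With host-gw, cross-node pod traffic is routed unencapsulated over the
# physical interface, so it can also be captured there for Wireshark
tcpdump -ni ens192 -w /tmp/flannel.pcap net 10.42.0.0/16
```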