able-traffic-85986

11/03/2022, 3:22 PM
I have some questions about k3s configuration and how some parts work together. After trying different solutions and reading a lot of tutorials and documentation over the last weeks, I hope someone here in the community can help me clarify what is going on, why, and how to get to a working solution.

The current setup is built as HA with an external datastore. Therefore 3 master/server, 3 worker/agent and 3 datastore (etcd) nodes are deployed with an HA IP and DNS name. Each node has a public and a local IP: the public IP on the first interface and the local IP on the second interface.

The first and most important problem is that the cluster overwrites headers like x-forwarded-for and x-real-ip, but we have one application that needs to know the IP of the client and currently only gets an internal cluster IP. I already tried a lot of solutions, e.g. setting
externalTrafficPolicy: Local
and many more. Is there any solution at all with my current setup? Do I have to replace Traefik and/or KlipperLB with another product like MetalLB, NGINX Ingress or something else to get this working on-prem? Do I have to disable NAT masquerading to preserve the IP, and is it then mandatory to configure routes myself without NAT, or can k3s handle that by itself? Or do I have to change something in flannel?

Another problem is that randomly some nodes rapidly consume a lot of RAM and freeze. Only a reboot can fix that. It happens with both master and worker nodes. Could that be a side effect of forwarding traffic inside the cluster? Is that a known issue and is there a solution?

For installation the following commands were used (a sketch of the externalTrafficPolicy change mentioned above is shown after the commands):
# First Master
curl -sfL https://get.k3s.io | sh -s - server --datastore-endpoint ${K3S_DATASTORE_ENDPOINT} --node-taint CriticalAddonsOnly=true:NoExecute --node-ip <LOCAL IP MASTER01> --node-external-ip <PUBLIC IP MASTER01> --tls-san <LOCAL IP MASTER01> --tls-san <LOCAL IP MASTER02> --tls-san <LOCAL IP MASTER03> --tls-san <PUBLIC IP MASTER01> --tls-san <PUBLIC IP MASTER02> --tls-san <PUBLIC IP MASTER03> --tls-san master.example.com --tls-san master01.example.com --tls-san master02.example.com --tls-san master03.example.com --flannel-iface=eth1
# Second/Third master (same command with the respective IPs, plus token and server URL)
# Worker Nodes
curl -sfL https://get.k3s.io | sh -s - agent --server ${K3S_URL} --token ${K3S_NODE_TOKEN} --node-ip <NODE LOCAL IP> --node-external-ip <NODE PUBLIC IP> --flannel-iface eth1
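For reference, the externalTrafficPolicy change mentioned above would sit on a Service of type LoadBalancer roughly like this. This is only a minimal sketch; the name, namespace and ports are placeholders, not my real manifest.
apiVersion: v1
kind: Service
metadata:
  name: my-app             # placeholder name
  namespace: default       # placeholder namespace
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # only valid on NodePort/LoadBalancer Services
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080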

bland-account-99790

11/03/2022, 5:44 PM
Let me start with the first question. If I understand correctly, you have an app running inside a pod. The app is being exposed using a Kubernetes Service and you are trying to access that Service from a client sitting outside of the cluster. You would like the IP of the client to be preserved, i.e. have the original source IP in the packets that end up in the pod, right? A couple of questions: 1 - Are you connecting to the service directly or via ingress? If ingress, are you using traefik? 2 - Are you using the external-ip to connect to the ingress service or the app's service?
Another problem is that randomly some nodes rapidly consume a lot of RAM and freeze. Only a reboot can fix that. It happens with both master and worker nodes. Could that be a side effect of forwarding traffic inside the cluster? Is that a known issue and is there a solution?
This is something I have never experienced. Maybe it rings a bell for you, @creamy-pencil-82913?

creamy-pencil-82913

11/03/2022, 5:54 PM
What is consuming a bunch of RAM, i.e. which process specifically? Why does it freeze when the RAM is consumed? Are you running with swap enabled? While that is possible, it is not recommended and my first suggestion would be to turn that off.

able-traffic-85986

11/03/2022, 6:17 PM
If I understand correctly, you have an app running inside a pod. The app is being exposed using a Kubernetes Service and you are trying to access that Service from a client sitting outside of the cluster. You would like the IP of the client to be preserved, i.e. have the original source IP in the packets that end up in the pod, right?
Yes, there is an application running in a pod and there is a related Service which is exposed via a Traefik ingress to make the application publicly available. The application should be able to see the real client IP and not one of the internal cluster IPs.
1 - Are you connecting to the service directly or via ingress? If ingress, are you using traefik?
The pod's Service is exposed via ingress and the ingress controller is Traefik; the load balancer is KlipperLB (everything is the k3s default except the datastore).
2 - Are you using the external-ip to connect to the ingress service or the app's service?
Traefik is using the external IPs with a Service of type LoadBalancer.
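Concretely, the exposure chain looks roughly like the sketch below (placeholder names and ports, not the real manifests): a ClusterIP Service for the app plus an Ingress that Traefik serves, while Traefik itself sits behind the KlipperLB (ServiceLB) LoadBalancer Service.
apiVersion: v1
kind: Service
metadata:
  name: my-app                 # placeholder
spec:
  type: ClusterIP              # cluster-internal service behind the ingress
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                 # placeholder
spec:
  ingressClassName: traefik
  rules:
    - host: app.example.com    # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80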
What is consuming a bunch of RAM, i.e. which process specifically? Why does it freeze when the RAM is consumed? Are you running with swap enabled? While that is possible, it is not recommended and my first suggestion would be to turn that off.
No, swap is disabled. I'm not sure which process, because the node was frozen, but I could see that the OOM killer had killed a lot of essential services. Since I installed some tools like atop for more details it hasn't frozen again, but I will let you know when I have more specifics about that.

creamy-pencil-82913

11/03/2022, 6:26 PM
Yes, there is an application running in a pod and there is a related Service which is exposed via a Traefik ingress to make the application publicly available. The application should be able to see the real client IP and not one of the internal cluster IPs.
That’s not going to happen for a multitude of reasons. You should read https://kubernetes.io/docs/tutorials/services/source-ip/ and decide what approach you are going to take to solve the problem.
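For example, one of the approaches described there, setting externalTrafficPolicy: Local on the LoadBalancer Service that actually receives the external traffic (in your case Traefik's), can be expressed in k3s by dropping a HelmChartConfig into /var/lib/rancher/k3s/server/manifests/ on a server node. This is only a rough sketch and assumes the bundled Traefik chart accepts service.spec overrides:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    service:
      spec:
        externalTrafficPolicy: Local   # route only to node-local endpoints and skip the extra SNAT hop
Be aware that with Local, nodes that are not running a traefik pod will not forward that traffic, so clients have to reach a node that actually hosts one.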