# general
c
this is usually controlled by the externalTrafficPolicy
https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
.spec.externalTrafficPolicy denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. There are two available options: Cluster (default) and Local. Cluster obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type Services, but risks potentially imbalanced traffic spreading.
If you set it to Local, it will only send traffic to nodes that have a pod for that Service
if you're manually configuring the load balancer, it's up to you to handle that yourself.
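for reference, a minimal sketch of setting it on a Service (the name, selector, and ports here are illustrative, not from your chart):
apiVersion: v1
kind: Service
metadata:
  name: my-app                   # hypothetical Service name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # default is Cluster
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080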
a
hmm.... ok thank you. I'm looking at the ingress that this app created (deployed via helm) and at the bottom it has
status:
  loadBalancer:
    ingress:
      <all k8s worker node IPs>
I assume that's what the above link is referencing? Pods belonging to that app could be running on any one of those nodes, although we only have about a third of those nodes in the target group for our AWS load balancer. It's worked mostly fine up until this point, but we've grown a lot and are seeing more latency issues in the app itself. I know I'm being light on the details but the rest is probably more specific to the app itself and out of scope here. The nginx ingress is created by the helm chart so it's nothing we manually configure.
c
what are you using for your LoadBalancer controller?
a
Actually yes, I see in the helm charts where we can set the externalTrafficPolicy, which is still the default Cluster. Sorry, I'm not a networking person at all, so just to summarize: if the externalTrafficPolicy is set to Cluster, then as long as the AWS external load balancer gets the traffic to any node in the k8s cluster, it will find its way to the correct node/pod with an extra hop. If it is set to Local, then presumably the nodes running the pods must be part of the target group, since traffic won't be forwarded on to the right node in the cluster
c
the policy on the service tells the LoadBalancer controller how to route traffic
in the case of the AWS LB controller, it will change whether all nodes go in the target group, or just nodes with pods for that service.
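to make that concrete, here's a hedged sketch of how a controller can tell which nodes have pods when the policy is Local: Kubernetes allocates a healthCheckNodePort on the Service, and kube-proxy only reports healthy on that port from nodes with a local endpoint (the field is real Kubernetes API; the name and port value below are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-app                   # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  # allocated automatically if unset; the load balancer health-checks
  # this port, and only nodes actually running a pod respond healthy
  healthCheckNodePort: 32000
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080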
a
Where would I find which LoadBalancer controller it is using?
c
generally there is only one per cluster. What have you deployed?
a
This is our older RKE1 cluster running in EC2. We have an NLB for Rancher itself, and an ALB for the specific app we're running. We actually have a couple of app deployments in different namespaces, each with their own ingress. This is the values file and helm chart we're using; the only difference is we set the type to ClusterIP https://github.com/coder/enterprise-helm/blob/main/values.yaml#L22
c
sounds like you're not using an LB controller and are just manually configuring the ALB to point at your service?
although you said you have IPs under status.loadBalancer.ingress, so I suppose you must have something, as that would otherwise remain pending
a
in AWS? Yes we're manually configuring that
we do not have the aws-cloud-controller configured in this cluster
c
the cloud controller is separate from the AWS lb controller
you can run one without the other, if you set things up right
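as an example of that split, the AWS LB controller can bind a Service to an existing ALB target group on its own, with no cloud controller involved, via its TargetGroupBinding CRD. A hedged sketch, with placeholder names and ARN (not something from your cluster):
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-app-tgb               # hypothetical
  namespace: my-namespace        # hypothetical
spec:
  serviceRef:
    name: my-app                 # the Service the ALB should reach
    port: 80
  targetGroupARN: arn:aws:elasticloadbalancing:...   # your existing ALB target group
  targetType: instance           # register nodes; "ip" would register pods directly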
a
we have our AWS Autoscale groups set to register the instances with the ALB target group. Although we have a couple different ASGs for nodes within that cluster, and only one of them is registering instances with the ALB target group. That's an easy fix but I'm just trying to determine if that is what's causing our issues, or if it's not a problem
yea we don't have any aws integration with this cluster
I don't know how the status.loadBalancer.ingress is updated; it's possible it just adds every k8s worker node. Or maybe the service itself adds them when a pod is spun up on a node? I have no idea how that works though. This isn't a case of something just not working, it's intermittent, which is so damn annoying to troubleshoot, so I'm trying to rule out everything I can
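for what it's worth, one thing that would explain seeing every worker node there: the nginx ingress controller publishes either the node IPs of its own pods or, if started with --publish-service, the address of the Service that flag names, and RKE1 (if I recall correctly) runs the controller as a DaemonSet on every worker. A hedged sketch of the relevant controller args (image tag and names are illustrative; check your actual deployment):
# excerpt from an ingress-nginx controller DaemonSet/Deployment
containers:
  - name: controller
    image: registry.k8s.io/ingress-nginx/controller:v1.9.4  # tag illustrative
    args:
      - /nginx-ingress-controller
      # without this flag, the controller writes the node IPs of its own
      # pods into each Ingress status; a DaemonSet on every worker would
      # therefore list every worker node IP
      - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller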