# k3s
c
have you tried the ethtool command listed here: https://github.com/flannel-io/flannel/blob/master/Documentation/troubleshooting.md#nat
/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
n
Just tried it; no change (more details added, above).
c
wait are you going through haproxy or not? this is a weird setup. Why are you running a loadbalancer outside the cluster? Why not just deploy metallb or kube-vip instead of managing a totally independent loadbalancer external to the cluster?
n
I normally use haproxy on a node on the public IP I have in front of the cluster. I wanted to eliminate my haproxy configuration as a possible cause of the problem; hence the `curl` command that uses `--haproxy-protocol` to connect. Did that make any sense?
c
what is the `externalTrafficPolicy` on your service? Have you tried setting that to local so that it’s not bouncing connections between nodes?
this is all just managed by kube-proxy, I’m not convinced that flannel is even involved here
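A quick way to check, assuming the standard ingress-nginx service name and namespace:
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.externalTrafficPolicy}{"\n"}'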
n
I may not have adequately described my situation. I have a single public IP address that is attached to a non-cluster node running haproxy. It has a backend that includes all (3) nodes in my cluster, mapping port 443 on the public IP to port 32443 on each of the nodes in the k3s cluster. It uses haproxy protocol to preserve the source IP address. On k3s I have a daemonset running ingress-nginx with a NodePort listening on port 32443 that maps to port 443 on each nginx pod. This NodePort service is what’s failing sometimes. If there’s a better way to achieve this, I’m all for it. But this setup worked great for a year until just recently, when the issue outlined above started happening.
Is there a way to debug what’s happening with the nodeport so I can try to learn why it’s failing sometimes?
c
not really. like I said it’s all just iptables rules managed by kube-proxy. There’s not really anything changing in there minute to minute, so your best bet is trying to capture the packets at various places and see what’s going on.
have you tried changing the externaltrafficpolicy to rule out issues with traffic going to pods on different nodes than the one you’re hitting the nodeport of?
n
The cluster policy is the only workable option for me, since most deployments can only run a single pod, which may/may not be on the node where ingress-nginx is running.
c
you didn’t say you were going through the ingress. you said you were going through the service nodeport to access the pod.
If you are going through the ingress then it doesn’t matter where the pod is for the workload, only matters where the ingress pod is running - since that’s what you’re hitting with the service nodeport.
So in reality your path is client -> haproxy -> ingress-nginx nodeport -> ingress-nginx pod -> workload pod
because the “service” you are hitting is the ingress-nginx service’s nodeport, NOT your workload’s service nodeport
and in that case changing the externaltrafficpolicy would only be a problem if ingress-nginx is not running on all the nodes
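if it helps, something like this lists every NodePort service so you can confirm which ports belong to ingress-nginx and which belong to your workloads:
kubectl get svc --all-namespaces | grep NodePort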
n
Your description of the path is correct; that’s what I was trying to convey.
c
ok so your logic about why you can’t change the policy is flawed then
by default the ingress-nginx nodeport -> ingress-nginx pod hop may be going between nodes. troubleshooting that will be much simpler if you change the policy so that traffic to the nodeport always hits the local ingress-nginx pod on that node. assuming you can run ingress-nginx on all nodes.
n
Ahh, now I understand. How do I do that?
c
change the policy in the ingress-nginx service spec
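for example, something like this (service name and namespace assumed; adjust to match your install):
kubectl -n ingress-nginx patch svc ingress-nginx-controller -p '{"spec":{"externalTrafficPolicy":"Local"}}'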
n
OK, let me try that.
c
you said you’re running ingress nginx as a daemonset, are you sure you’re accessing it at the service nodeport and not at a nodeport on the pod itself?
n
Here’s what I’ve tried:
• From the node, `curl` to the node’s IP address (fails as above)
• From an ingress-nginx pod, `curl` to the cluster IP address (fails)
• From an ingress-nginx pod, `curl` to the pod’s IP address (works)
Changing the externalTrafficPolicy of the nodeport's service didn’t make any difference.
Well, strike that. It does seem to be working now when I `curl` to the node’s IP address from that node.
c
but are you curling the pod nodeport or the service nodeport on the node
you may have both, and which you’re doing makes a difference
I don’t understand why you’re doing things in the ingress-nginx pod. Are you trying to troubleshoot the connection between haproxy and nginx, or between nginx and your workload?
they are completely separate TCP connections. If you are seeing something timing out at the TLS handshake level, you need to figure out where that SPECIFIC handshake is, and troubleshoot that.
n
I understand that there is a connection from the client to haproxy, another connection to ingress-nginx, and another connection to the pod. When I started, I didn’t know what was failing. By running curl from the node, I eliminated haproxy and the physical network. By running curl from inside the ingress-nginx pod to its own IP, I eliminated nginx itself and the backend pods that do the work. So what’s left is whatever networking magic is making the TCP connection from the node’s IP to ingress-nginx. Hopefully that wasn’t a misguided process.
c
right but the nginx pod nodeport or the service nodeport
n
What I would expect to happen is that iptables DNATs the TCP connection to the NodePort, routing it to the IP/port of the ingress-nginx pod running on that node. Which seems pretty simple, but afaict that’s what’s failing.
Whichever one is connected to the IP address of the node.
c
no
the question matters
if its the pod nodeport, then it will always hit the local pod
if its the service nodeport, then it may go to ANY pod in the cluster, even ones on other nodes. depending on the traffic policy.
and the path through the NAT layers ends up being VERY different
Are you hitting the pod nodeport on the node, or the service nodeport on the node?
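If you want to see what the NAT path actually looks like on a node, something like this dumps the kube-proxy rules for that port (a sketch; assumes kube-proxy is in its default iptables mode):
sudo iptables-save -t nat | grep 32443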
n
That’s a great question. When I curl to the node’s IP address, which one is it going to? Because that’s the only one I care about.
(Re-reading your question, I’m pretty sure I’m hitting the service nodeport, since that’s the only one I’ve configured to use port 32488).
c
I’m not sure why you can’t answer that question. When you’re curling the node’s IP, what port are you using? and where did you get that port from?
Is it the port of the service, or the port of the pod
n
It’s the nodeport of the service. The node has IP address 10.0.1.235, and the service configures a nodeport 32488 (see above link). I’m curling to 10.0.1.235:32488.
(there are more nodes, and the service configures a nodeport on each one).
c
no, the nodeports of that service are 32080 and 32443
Look at the pods. what ports are THEY using?
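e.g. something like this (namespace assumed):
kubectl -n ingress-nginx get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].ports}{"\n"}{end}'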
n
Sorry, dyslexia.
c
Can you show the yaml for one of the ingress-nginx daemonset pods?
n
(BTW thanks for all your time)
c
ok. I see node host ports in the pod port spec. So all you have is the service nodeport
With the external traffic policy set to local, traffic to the service nodeport will only ever go to the nginx pod on the node you’re accessing. it will not go to a pod on a different node.
So then you can focus on identifying if it is just one node that does this, or all nodes, or what.
n
I’ve tried the curl command on each of the nodes, to each of the node’s own IP addresses. They all fail.
c
You can’t `curl -vks https://NODE-IP:32443` from the node itself?
n
I need to use `--connect-to $WEBSITE:443:NODE-IP:32443 --haproxy-protocol` so it actually works.
Without --haproxy-protocol there’s no hope, and if I use the --connect-to I get the right certificate.
c
ok but you said you’re just trying to diagnose a TLS handshake failure. or a connection timeout even.
n
Which is caused by a dropped connection that only happens when I connect through the nodeport. When I connect through the pod’s own IP address it works great.
c
yes but that bypasses lots and lots of things so it doesn’t really help any
n
I know; it just confirms that nginx isn’t the problem.
Can you help me understand what the flow is when I connect to the nodeport?
c
If you’re troubleshooting basic connectivity I would probably disable all the haproxy stuff for now and try to just troubleshoot the TCP connection itself
See if you can get the connection to hang with
echo QUIT | openssl s_client -connect NODE-IP:32443 -servername WEBSITE
that way you are not doing ANYTHING other than testing the TLS connection
n
One time usually works. But here’s what happens when I try multiple times: https://bin.koehn.com/?d89850be142363f8#4gLrP81gWigd7QMQBi1iEvaiZAEaJT17JLw83Vj1FKpx
Again, `openssl` doesn’t speak proxy protocol, so that’s not going to work very well. In any case, I know proxy protocol isn’t the problem because I can use `curl` to successfully complete the connection from the pod itself.
c
ok so start doing some packet captures and see where you stop seeing traffic. is it not making it into the pod? are the responses not making it out?
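something along these lines as a starting point (interface names assumed; flannel’s bridge is usually cni0 on k3s):
# traffic arriving at the nodeport on the node's main interface
sudo tcpdump -ni eth0 -w nodeport.pcap 'tcp port 32443'
# after the DNAT the destination is the pod IP on port 443, so watch the pod side too
sudo tcpdump -ni cni0 -w podside.pcap 'tcp port 443'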
n
OK, I captured this tcpdump when I ran this command from the node to itself (the node is 10.0.1.236):
WEBSITE=diaspora.koehn.com ; for i in $(seq 1 20) ; do curl -v --connect-to $WEBSITE:443:10.0.1.236:32443 --haproxy-protocol https://$WEBSITE/ > /dev/null ; done
I stopped it when the `for` loop hung.
Ah, poop. I need to remove it from haproxy. Stand by and I’ll run it again.
c
10.0.1.236.32443 > 10.0.1.1.12811: Flags [P.], cksum 0x21b3 (incorrect -> 0x784c)
you’re sure you disabled checksum offload on all the nodes?
n
Yes, I ran the ethtool command you sent earlier on all the nodes. I just checked my command history:
/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
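And to double-check it actually took (the setting doesn’t survive the flannel.1 interface being recreated, e.g. after a reboot):
/usr/sbin/ethtool -k flannel.1 | grep tx-checksum-ip-generic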
c
I’d probably continue on down this path, trying to figure out where the packets are getting dropped. I can’t really tell from that text output what’s missing, I usually grab a capture file off all the interfaces I’m interested in and then pull it up in wireshark or something
but this seems pretty odd, I’ve not seen anything like this before so I suspect whatever is going on with your node is not going to be particularly easy to diagnose
I take it you’ve already upgraded everything to the greatest extent possible?
n
Yeah. Latest everything I can find.
alright, I’ll keep at it. Thanks again for all the help.
I eventually figured it out, BTW. The NodePort issue was masking the real problem because of the `externalTrafficPolicy`, as you correctly pointed out. The actual problem was that one of the nodes had a too-low `kernel.threads-max` setting, which sent nginx into a tailspin. So many thanks again; I think it would have taken much, much longer without all your help!
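For anyone who finds this later, the check and fix were along these lines (the value shown is illustrative, not necessarily what I used):
# see the current limit on the affected node
sysctl kernel.threads-max
# raise it at runtime
sudo sysctl -w kernel.threads-max=4194304
# persist it across reboots
echo 'kernel.threads-max = 4194304' | sudo tee /etc/sysctl.d/99-threads-max.conf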
c
oh so after all that it was just nginx hanging on one of the nodes?
n
Yeah. The NodePort turned out to be a wild goose chase. As soon as I tweaked the threads of the one node and restarted nginx, it worked perfectly.
Still, I’m glad to have fixed the externalTrafficPolicy; I assumed that a NodePort in front of a DaemonSet would use Local by default.
One last aside… the incorrect checksums were not because of the ethtool setting. They’re because checksumming is offloaded to the NIC, and tcpdump cannot see it.
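You can see that from the NIC itself (interface name assumed): if tx checksumming is on, outgoing packets captured locally will show “incorrect” checksums because the hardware fills them in after tcpdump copies the packet.
ethtool -k eth0 | grep -i checksumming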
c
yeah that’s fair. there are bugs with vxlan packets getting dropped due to invalid checksums, that’s why I asked