famous-accountant-54532
02/16/2024, 5:41 AM
12:08:04.657660 IP (tos 0x0, ttl 64, id 6305, offset 0, flags [DF], proto UDP (17), length 154)
ccperf1-ch-s1-z3-0.openstack.na-us-1.cloud.sap.8301 > ccperf1-ch-s1-z3-1.node.perf1-ms-cc.consul.mu.devlab.cc.int.ariba.com.8301: [bad udp cksum 0x93a0 -> 0xf7a3!] UDP, length 126
12:08:05.790492 IP (tos 0x0, ttl 64, id 6466, offset 0, flags [none], proto UDP (17), length 110)
ccperf1-ch-s1-z3-0.openstack.na-us-1.cloud.sap.40809 > ccperf1-ch-s1-z3-1.node.perf1-ms-cc.consul.mu.devlab.cc.int.ariba.com.8472: [bad udp cksum 0x2f9f -> 0xf321!] OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 63, id 10042, offset 0, flags [DF], proto TCP (6), length 60)
10.1.2.0.57732 > sampleapp-java-v-192ed3e-17-1707307092.service.default.perf1-ms-cc.consul.mu.devlab.cc.int.ariba.com.http: Flags [S], cksum 0x6432 (incorrect -> 0x27b5), seq 1690897648, win 64390, options [mss 1370,sackOK,TS val 2711604204 ecr 0,nop,wscale 7], length 0
• We followed this comment https://github.com/kubernetes/kubernetes/pull/92035#issuecomment-1329950771 and, after disabling TX checksum offloading with the commands below, containers are able to communicate between hosts.
ethtool -K flannel.1 tx-checksum-ip-generic off
ethtool -K ens192 tx-checksum-ip-generic off
• However, we see performance degradation after disabling checksum offloading, so we are trying to find an alternative solution for this issue.
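To confirm what each interface actually reports after the `ethtool -K` commands above, the feature state can be read back with `ethtool -k` (lowercase). This is a minimal sketch; `feature_state` is a hypothetical helper name, and it assumes the usual `name: on|off` layout of `ethtool -k` output.

```shell
# Hypothetical helper: read `ethtool -k <iface>` output on stdin and print
# the state ("on" or "off") reported for the feature named in $1.
feature_state() {
    awk -v feat="$1" '$1 == feat ":" { print $2 }'
}

# Usage (needs ethtool and the interfaces from this thread on the host):
# ethtool -k flannel.1 | feature_state tx-checksum-ip-generic
# ethtool -k ens192    | feature_state tx-checksum-ip-generic
```

Comparing the two interfaces this way may help when testing whether disabling the offload on only one of them is enough, which is something we have not verified.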
• OS & Kernel version
ccloud@ccperf1-ch-s1-z2-0:~$ uname -r
5.15.0-76-generic
ccloud@ccperf1-ch-s1-z2-0:~$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
• This comment https://github.com/flannel-io/flannel/issues/1279#issuecomment-647715533 describes another workaround that does not require turning off checksum offload. UDP port 8472 is the default port for flannel's encapsulating packets; the rule clears the mark to avoid doing SNAT on the encapsulating packet, and thus avoids double SNAT. This workaround didn't help us. Do we need to tweak anything here?
sudo iptables -A OUTPUT -p udp -m udp --dport 8472 -j MARK --set-xmark 0x0
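Because `-A` appends a new copy of the rule every time it is run, repeated attempts can leave duplicates in the chain. A minimal sketch of a check-then-append wrapper follows; `ensure_rule` and the `IPT` variable are hypothetical names introduced here, and the only `iptables` options relied on are `-C` (check) and `-A` (append).

```shell
# Command used to invoke iptables; override with IPT="sudo iptables" if needed.
IPT="${IPT:-iptables}"

# Hypothetical wrapper: append the rule only when an identical rule is not
# already present in the chain.
ensure_rule() {
    # -C exits non-zero when no matching rule exists, so -A runs only then.
    $IPT -C "$@" 2>/dev/null || $IPT -A "$@"
}

# Usage (requires root):
# IPT="sudo iptables" ensure_rule OUTPUT -p udp -m udp --dport 8472 -j MARK --set-xmark 0x0
```

Afterwards, `sudo iptables -vnL OUTPUT` shows the rule's packet counter, which indicates whether the encapsulated traffic is matching it at all.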
Current iptables rules for flannel:
# iptables -t nat -vnL | grep flannel
629K 65M FLANNEL-POSTRTG all -- * * 0.0.0.0/0 0.0.0.0/0 /* flanneld masq */
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x4000/0x4000 /* flanneld masq */
19457 1167K RETURN all -- * * 10.1.43.0/24 10.1.0.0/16 /* flanneld masq */
0 0 RETURN all -- * * 10.1.0.0/16 10.1.43.0/24 /* flanneld masq */
195K 12M RETURN all -- * * !10.1.0.0/16 10.1.43.0/24 /* flanneld masq */
0 0 MASQUERADE all -- * * 10.1.0.0/16 !224.0.0.0/4 /* flanneld masq */ random-fully
0 0 MASQUERADE all -- * * !10.1.0.0/16 10.1.0.0/16 /* flanneld masq */ random-fully
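One way to read the listing above is to keep only the rules whose packet counter is non-zero. In the output as pasted, both MASQUERADE rules and the mark-match RETURN rule show 0 packets. A minimal sketch, where `nonzero_rules` is a hypothetical helper and the column order (pkts, bytes, target) is assumed from `iptables -vnL`:

```shell
# Hypothetical helper: read `iptables -t nat -vnL` output on stdin and print
# the packet counter and target of each flannel rule that has matched traffic.
nonzero_rules() {
    awk '/flanneld masq/ && $1 != "0" { print $1, $3 }'
}

# Usage (requires root):
# sudo iptables -t nat -vnL | nonzero_rules
```

For the listing above this leaves three lines: the FLANNEL-POSTRTG jump (629K) and the two RETURN rules with 19457 and 195K packets.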
• Any suggestions or ideas for fixing container communication between hosts without disabling checksum offload?