# rke2
b
@creamy-pencil-82913 have you seen these panics from Felix? This is v1.33.1-rke2r1.
2025-09-30 09:12:59.719 [INFO][67177] felix/masq_mgr.go 145: IPAM pools updated, refreshing iptables rule ipVersion=0x4
2025-09-30 09:12:59.723 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.723 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.723 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.724 [INFO][67177] felix/wireguard.go 1763: Trying to connect to linkClient ipVersion=0x4
2025-09-30 09:12:59.724 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.726 [INFO][67177] felix/wireguard.go 648: Public key out of sync or updated ipVersion=0x4 ourPublicKey=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
2025-09-30 09:12:59.726 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.726 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.726 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.728 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.730 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.730 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.730 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.734 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.735 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.735 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.735 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.743 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.745 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.745 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.745 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.761 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.763 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.763 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.763 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.796 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.797 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.797 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.797 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.862 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.864 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.864 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.864 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:12:59.993 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:12:59.994 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:12:59.994 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:12:59.994 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:13:00.251 [INFO][67177] felix/ipsets.go 344: Retrying after an ipsets update failure... family="inet"
2025-09-30 09:13:00.253 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:13:00.253 [ERROR][67177] felix/ipsets.go 409: Failed to get the list of ipsets error=exit status 1 family="inet"
2025-09-30 09:13:00.253 [WARNING][67177] felix/ipsets.go 353: Failed to resync with dataplane error=exit status 1 family="inet"
2025-09-30 09:13:00.536 [INFO][67177] felix/calc_graph.go 568: Local endpoint updated id=WorkloadEndpoint(node=brolin, orchestrator=k8s, workload=kube-system/svclb-rke2-ingress-nginx-controller-33bb93e8-4g67c, name=eth0)
2025-09-30 09:13:00.616 [INFO][67177] felix/calc_graph.go 566: Local endpoint deleted id=WorkloadEndpoint(node=brolin, orchestrator=k8s, workload=kube-system/svclb-rke2-ingress-nginx-controller-33bb93e8-4g67c, name=eth0)
2025-09-30 09:13:00.624 [INFO][67177] felix/int_dataplane.go 1590: Linux interface state changed. ifIndex=1813 ifaceName="calia7d88453e90" state="down"
2025-09-30 09:13:00.624 [INFO][67177] felix/int_dataplane.go 1634: Linux interface addrs changed. addrs=set.Set{} ifaceName="calia7d88453e90"
2025-09-30 09:13:00.626 [INFO][67177] felix/int_dataplane.go 1590: Linux interface state changed. ifIndex=1813 ifaceName="calia7d88453e90" state="up"
2025-09-30 09:13:00.718 [INFO][67177] felix/int_dataplane.go 1590: Linux interface state changed. ifIndex=1813 ifaceName="calia7d88453e90" state="down"
2025-09-30 09:13:00.720 [INFO][67177] felix/int_dataplane.go 1590: Linux interface state changed. ifIndex=1813 ifaceName="calia7d88453e90" state=""
2025-09-30 09:13:00.720 [INFO][67177] felix/int_dataplane.go 1634: Linux interface addrs changed. addrs=<nil> ifaceName="calia7d88453e90"
2025-09-30 09:13:00.767 [ERROR][67177] felix/ipsets.go 671: Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
2025-09-30 09:13:00.767 [ERROR][67177] felix/ipsets.go 1036: Failed to get the list of IP sets. error=exit status 1 family="inet"
2025-09-30 09:13:00.767 [PANIC][67177] felix/ipsets.go 379: Failed to update IP sets after multiple retries. family="inet"
panic: (*logrus.Entry) 0xc000aa77a0

goroutine 478 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc00052e540, 0x0, {0xc00191a570, 0x30})
        /go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x485
github.com/sirupsen/logrus.(*Entry).Log(0xc00052e540, 0x0, {0xc001499988?, 0x5?, 0x0?})
        /go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x48
github.com/sirupsen/logrus.(*Entry).Panic(...)
        /go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:342
github.com/projectcalico/calico/felix/ipsets.(*IPSets).ApplyUpdates(0xc000450600)
        /go/src/github.com/projectcalico/calico/felix/ipsets/ipsets.go:379 +0x5b9
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func1({0x52f4df0?, 0xc000450600?})
        /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2457 +0x37
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply in goroutine 523
        /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2456 +0xb4c

# restarted by runit

2025-09-30 09:13:00.829 [INFO][67272] felix/logutils.go 78: Early screen log level set to info
2025-09-30 09:13:00.829 [INFO][67272] felix/daemon.go 132: Felix starting up GOMAXPROCS=255 builddate="2025-07-31T07:32:47+0000" gitcommit="cf50b562271b7c2ad896af0488d48eddabbb74eb" version="v3.30.2"
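A quick way to read the retry pattern in the log above is to measure the gaps between the "Retrying after an ipsets update failure..." lines. This is a minimal sketch: the timestamps are copied from the excerpt, the helper name `retry_gaps_ms` is made up for illustration, and the backoff interpretation is inferred from the timestamps alone, not from the Felix source.

```python
from datetime import datetime, timedelta

# Timestamps of the "Retrying after an ipsets update failure..." lines,
# copied from the log excerpt in this thread.
RETRY_TIMESTAMPS = [
    "2025-09-30 09:12:59.724",
    "2025-09-30 09:12:59.728",
    "2025-09-30 09:12:59.734",
    "2025-09-30 09:12:59.743",
    "2025-09-30 09:12:59.761",
    "2025-09-30 09:12:59.796",
    "2025-09-30 09:12:59.862",
    "2025-09-30 09:12:59.993",
    "2025-09-30 09:13:00.251",
]


def retry_gaps_ms(timestamps):
    """Return the whole-millisecond gap between consecutive timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]
    one_ms = timedelta(milliseconds=1)
    return [(later - earlier) // one_ms for earlier, later in zip(parsed, parsed[1:])]


gaps = retry_gaps_ms(RETRY_TIMESTAMPS)
print(gaps)
```

The gaps come out as 4, 6, 9, 18, 35, 66, 131 and 258 ms, so the wait roughly doubles each time. That looks like an exponential backoff that gives up in about a second, which would explain why the PANIC follows so quickly once `ipset` starts failing.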
c
Bad return code from 'ipset list -name'. error=exit status 1 family="inet" stderr="ipset v7.21: Cannot open session to kernel.\n"
That would seem to indicate a problem with the kernel or ipset binary. But not one that I’ve seen before, no.
What kind of node is this running on?
Is this a real node or some sort of paravirtualized container thing?
b
real node
single node
c
can you run that same command on the base OS?
what kind of kernel are you running?
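For the first question, a minimal sketch of running the same `ipset list -name` command on the host and sorting the result into rough buckets. The function names and bucket labels are hypothetical; only the command and the stderr text come from the Felix log in this thread. It most likely needs to be run as root to be meaningful.

```python
import shutil
import subprocess


def classify_ipset_result(returncode, stderr):
    """Map an ipset exit code and stderr text to a rough bucket (labels are made up)."""
    if returncode == 0:
        return "ok"
    if "Cannot open session to kernel" in stderr:
        # Same failure text that Felix logged.
        return "kernel-session-failure"
    return "other-failure"


def check_ipset_on_host():
    """Run 'ipset list -name' directly on the host and classify the outcome."""
    if shutil.which("ipset") is None:
        return "ipset-not-installed"
    result = subprocess.run(
        ["ipset", "list", "-name"],
        capture_output=True,
        text=True,
        check=False,
    )
    return classify_ipset_result(result.returncode, result.stderr)


if __name__ == "__main__":
    print(check_ipset_on_host())
```

If the host reports "ok" while Felix inside the container still fails, the problem is more likely in the container's view of the kernel than in the kernel itself. That is an assumption to verify, not a conclusion.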
b
Linux node Thu Sep 4 20:28:16 EDT 2025 x86_64 GNU/Linux
c
where's the rest of it?
uname -a?
b
Linux {{ hostname }} 5.4.297-1-1 #1 SMP Thu Sep 4 20:28:16 EDT 2025 x86_64 GNU/Linux
c
and what distro?
b
uh, "other"
c
This looks like a custom kernel config, as opposed to something maintained by a distro that we might have tested on. Did you forget to include ipset support?
b
CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
CONFIG_IP_SET_HASH_IPMARK=m
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
CONFIG_IP_SET_HASH_IPMAC=m
CONFIG_IP_SET_HASH_MAC=m
CONFIG_IP_SET_HASH_NETPORTNET=m
CONFIG_IP_SET_HASH_NET=m
CONFIG_IP_SET_HASH_NETNET=m
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_NET_EMATCH_IPSET=m
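In a kernel config, `=m` means the option is built as a loadable module and `=y` means it is built into the kernel image, so the lines above show ipset support is compiled but still has to be loaded at runtime. A small sketch for reading that state out of a config dump; the helper name `ipset_config_state` and its return labels are made up for illustration.

```python
def ipset_config_state(config_text, option="CONFIG_IP_SET"):
    """Report how a kernel config option is set: built-in, module, disabled, or a raw value."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "disabled"
        if line.startswith(option + "="):
            value = line.split("=", 1)[1]
            # y and m are the tristate values; anything else is returned as-is.
            return {"y": "built-in", "m": "module"}.get(value, value)
    return "absent"


# A few lines taken from the config pasted in this thread.
sample = """CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_HASH_NET=m
"""

print(ipset_config_state(sample))
print(ipset_config_state(sample, "CONFIG_IP_SET_MAX"))
```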
c
is it loaded?
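One way to answer that without eyeballing `lsmod` is to read `/proc/modules`, where the first field of each line is the module name. A minimal sketch follows; the `WANTED` list is illustrative only and is not a statement of which modules Calico needs on any given setup.

```python
from pathlib import Path


def loaded_modules(modules_text):
    """Return the set of module names from /proc/modules (or lsmod-style) text."""
    return {line.split()[0] for line in modules_text.splitlines() if line.strip()}


def missing_modules(modules_text, wanted):
    """Return the wanted module names that do not appear in the loaded set."""
    loaded = loaded_modules(modules_text)
    return [name for name in wanted if name not in loaded]


# Illustrative list only; adjust to the modules you care about.
WANTED = ["ip_set", "ip_set_hash_ip", "ip_set_hash_net", "xt_set"]

if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        print(missing_modules(proc_modules.read_text(), WANTED))
```

Options built with `=y` never show up in `/proc/modules`, so an empty result is only meaningful for options built as modules.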
b
ip_set_hash_net        36864  3
ip_set_hash_ip         36864  1
ip_set                 45056  3 ip_set_hash_ip,xt_set,ip_set_hash_net
nfnetlink              16384  7 nfnetlink_acct,nf_conntrack_netlink,ip_se
oh looks like it unwedged here:
2025-09-30 09:18:26.933 [PANIC][98123] felix/ipsets.go 379: Failed to update IP sets after multiple retries. family="inet"
2025-09-30 09:18:26.993 [INFO][98245] felix/daemon.go 132: Felix starting up GOMAXPROCS=255 builddate="2025-07-31T07:32:47+0000" gitcommit="cf50b562271b7c2ad896af0488d48eddabbb74eb" version="v3.30.2"
2025-09-30 09:18:28.175 [PANIC][98245] felix/ipsets.go 379: Failed to update IP sets after multiple retries. family="inet"
2025-09-30 09:18:28.236 [INFO][98400] felix/daemon.go 132: Felix starting up GOMAXPROCS=255 builddate="2025-07-31T07:32:47+0000" gitcommit="cf50b562271b7c2ad896af0488d48eddabbb74eb" version="v3.30.2"
2025-09-30 09:23:51.848 [INFO][81] felix/daemon.go 132: Felix starting up GOMAXPROCS=255 builddate="2025-07-31T07:32:47+0000" gitcommit="cf50b562271b7c2ad896af0488d48eddabbb74eb" version="v3.30.2"
2025-09-30 09:26:30.211 [INFO][764] felix/daemon.go 132: Felix starting up GOMAXPROCS=255 builddate="2025-07-31T07:32:47+0000" gitcommit="cf50b562271b7c2ad896af0488d48eddabbb74eb" version="v3.30.2"
this may be a red herring. Not sure what causes it.
c
Something weird with your kernel, I suspect.
b
could be
but I just noticed I was looking at rotated logs
the problem I am looking at is that nginx ingress does not come up
l
Our issue of calico-node -felix-ready failing was resolved by switching to Ubuntu 22.04. I haven't looked further into why 24.04 would start having this issue in 1.30 when it was fine in 1.29 (again, perhaps vxlan).
c
hmm, interesting. we do a lot of testing on 24.04, my main dev workstation is 24.04, and I’ve never seen that.
but then I’m not running a custom kernel build either, as Ben is.