# general
s
Hi all, since updating to Rancher 2.12 the UI sometimes hangs on "Loading" forever when selecting "Pods" or other resources. Other times it works. kubectl works as usual. It now seems to have something to do with Calico's tigera-operator, as there are 1000+ tigera-operator pods in Evicted state. Also, the deployment often says "Deployment does not have minimum availability". Has anyone experienced the same and found a solution?
maybe Calico runs out of IPs to hand out with so many evicted pods?
b
There are certainly IP addresses to consider if you're using a /24 (although, if you were, you'd have come across the problem long ago), but what about maximum pods per worker? Could that be an issue for you?
s
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  creationTimestamp: "2025-05-20T12:39:58Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: tigera-operator
  name: default-ipv4-ippool
  resourceVersion: "1109"
  uid: e6b64a88-c974-4dcf-8a8b-a0830ad221c6
spec:
  allowedUses:
  - Workload
  - Tunnel
  blockSize: 26
  cidr: 10.42.0.0/16
  ipipMode: Never
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Always
this is the default IP pool
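For anyone wondering how much room that pool actually has, here is a rough back-of-the-envelope sketch in Python. It only uses the `cidr` and `blockSize` values from the IPPool above; the variable names are made up for illustration, and it ignores any addresses Calico reserves for tunnels.

```python
import ipaddress

# Values copied from the IPPool spec posted above.
pool = ipaddress.ip_network("10.42.0.0/16")
block_size = 26  # prefix length of each allocation block

# A /26 block holds 2^(32-26) addresses.
addresses_per_block = 2 ** (32 - block_size)
# A /16 pool split into /26 blocks yields 2^(26-16) blocks.
total_blocks = 2 ** (block_size - pool.prefixlen)

print(f"addresses in pool:   {pool.num_addresses}")
print(f"blocks in pool:      {total_blocks}")
print(f"addresses per block: {addresses_per_block}")
```

That works out to 65536 addresses in 1024 blocks of 64, so a thousand or so pods would not exhaust the pool by address count alone, even if they were using pod IPs.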
g
> as there are 1000+ tigera-operator pods in Evicted state
> maybe calico runs out of IPs to hand out with so many evicted pods?
tigera-operator runs host-networked (because it is the thing that installs Calico). So it won't be running out of pod IPs, because it's not using them 🙂 That said, it shouldn't be getting Evicted. Eviction is something the kubelet does when a node doesn't have enough resources. What is the state of your nodes? What is causing the kubelet to evict it?
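One way to see what is behind the evictions is to group the evicted pods by node and by the message recorded in their status. The Python sketch below is only an illustration: `summarize_evictions` is a hypothetical helper, and the sample data (pod entries, node name, message text) is made up to stand in for the output of `kubectl get pods -A -o json`. It relies on evicted pods being reported with `status.phase` of `Failed` and `status.reason` of `Evicted`.

```python
from collections import Counter


def summarize_evictions(pod_list: dict) -> Counter:
    """Count evicted pods per (node name, eviction message)."""
    counts = Counter()
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            node = pod.get("spec", {}).get("nodeName", "<unknown>")
            message = status.get("message", "<no message>")
            counts[(node, message)] += 1
    return counts


# Made-up sample in the shape of `kubectl get pods -A -o json` output.
sample = {
    "items": [
        {"spec": {"nodeName": "worker-1"},
         "status": {"phase": "Failed", "reason": "Evicted",
                    "message": "The node was low on resource: ephemeral-storage."}},
        {"spec": {"nodeName": "worker-1"},
         "status": {"phase": "Failed", "reason": "Evicted",
                    "message": "The node was low on resource: ephemeral-storage."}},
        {"spec": {"nodeName": "worker-2"},
         "status": {"phase": "Running"}},
    ]
}

summary = summarize_evictions(sample)
for (node, message), count in summary.most_common():
    print(f"{count:5d}  {node}  {message}")
```

Against a real cluster, the dict would come from `json.load` on the saved `kubectl get pods -A -o json` output rather than from the inline sample.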
s
big thanks, will share here after finding the cause
It is working again after changing the config of the machine pools so that they are recreated and all pods start fresh. Upgrading the Kubernetes version in Rancher does not do that.
Specifically, the config of the control plane machine pool.
s
Would you be able to remember how many pods in total were in that cluster (pre-fix)? Also, how many are there now (with the fix), and does the Pods list in the UI load correctly and is it usable?
s
with the Evicted tigera-operator pods there were over a thousand, but I don't remember how many were left after deleting those. Right now there are 85 pods total with
k get po -A
s
thanks, and the Pods list is operational again?
s
yeah and everything lists pretty quick in the UI again
s
Great news. In Rancher 2.12 there were some big updates to improve performance as resources scale. I'm wondering if this time around the UI failed to load lists due to a general node performance issue impacting Rancher performance. If you do hit a similar scenario where the UI fails to load lists (or is slow to), would you be able to confirm whether the node is under pressure and, if not, check for network errors in the browser or suspicious log lines in the rancher pod logs? That would greatly help us make sure those improvements really do help.
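As a rough sketch of the "is the node under pressure" check, the Python below looks at the standard node conditions (`MemoryPressure`, `DiskPressure`, `PIDPressure`) and reports any that are `True`. `nodes_under_pressure` is a hypothetical helper and the sample nodes are made up; real input would be the parsed output of `kubectl get nodes -o json`.

```python
PRESSURE_TYPES = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def nodes_under_pressure(node_list: dict) -> dict:
    """Map node name -> sorted list of pressure conditions that are True."""
    result = {}
    for node in node_list.get("items", []):
        name = node.get("metadata", {}).get("name", "<unknown>")
        conditions = node.get("status", {}).get("conditions", [])
        active = sorted(
            c["type"]
            for c in conditions
            if c.get("type") in PRESSURE_TYPES and c.get("status") == "True"
        )
        if active:
            result[name] = active
    return result


# Made-up sample in the shape of `kubectl get nodes -o json` output.
sample_nodes = {
    "items": [
        {"metadata": {"name": "cp-1"},
         "status": {"conditions": [
             {"type": "DiskPressure", "status": "True"},
             {"type": "MemoryPressure", "status": "False"},
             {"type": "Ready", "status": "True"},
         ]}},
        {"metadata": {"name": "worker-1"},
         "status": {"conditions": [
             {"type": "DiskPressure", "status": "False"},
             {"type": "Ready", "status": "True"},
         ]}},
    ]
}

pressured = nodes_under_pressure(sample_nodes)
print(pressured)
```

An empty result would suggest the nodes are not reporting pressure, in which case the browser network errors and rancher pod logs are the next place to look.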