Anyone else getting lots of CPU Soft Lockups on Az...
# rke2
Anyone else getting lots of CPU Soft Lockups on Azure? I'm on the versions below. I've been having issues since 20.04 and 1.22, and upgrades have not helped:
VERSION            OS-IMAGE             KERNEL-VERSION
v1.26.10+rke2r2    Ubuntu 22.04.3 LTS   6.2.0-1017-azur
I get multiple nodes daily in `NotReady` state, pods all get stuck `Terminating`, and the serial console shows CPU Soft Lockups. I'm not seeing this issue on AWS clusters, but the workloads are somewhat different. I've scoured Grafana and basically just have monitoring gaps during these periods, and I cannot find any indicator of a particular pod spiking CPU before Prometheus stops reporting in. So right now I'm just rebooting nodes like a moron.