Going down a bit of a rabbit hole right now trying...
# longhorn-storage
a
Going down a bit of a rabbit hole right now trying to diagnose some longhorn issues (1.5.3). I'm not sure if it's a problem with laggy ec2 instances causing longhorn problems, or longhorn causing laggy ec2 instances. Using top, CPU and RAM seem fine. Looking in the Longhorn Dashboard, I'll randomly see nodes show up as down for maybe 20 or 30 seconds. Looking at pods, the longhorn-manager pods become unresponsive sometimes, and the logs are spammed with messages like
DeltaFIFO Pop Process ID: longhorn-system/<node-name>,Depth:61, Reason,: slow event handlers blocking the queue (total time : 373ms