# longhorn-storage
a
This message was deleted.
c
You seem to have cut off the actual error. Did it time out? Did it fail to resolve? Did it return an HTTP error response?
b
You're correct... ": context deadline exceeded"
I was definitely hit by the checksum issue for flannel. Nothing would resolve DNS initially (thanks RHEL), but implementing a service to auto-disable the checksumming seems to have remedied that, and now I'm here. Lol
(Bear in mind this springs into life perfectly fine on a Rocky setup, I'm just forced into using RHEL here)
That said, this is 32 nodes and not 7 (not sure if that makes a difference or not).
root@dnsutils:/# host longhorn-admission-webhook.longhorn-system.svc
longhorn-admission-webhook.longhorn-system.svc.cluster.local has address 10.44.172.173
DNS looks good...
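A name that resolves can still be unreachable, so it may help to separate the failure modes asked about above. This is only a rough sketch using the Python standard library: `probe` is a made-up helper name, and it only tests a plain TCP connect, not the HTTPS request the webhook actually serves.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify one TCP connection attempt to host:port."""
    # Step 1: name resolution. A failure here is a DNS problem.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns failure: {exc}"

    # Step 2: TCP connect to the first resolved address.
    family, socktype, proto, _, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except socket.timeout:
            # Packets silently dropped, e.g. on the overlay network.
            return "timeout"
        except ConnectionRefusedError:
            # Host reachable, but nothing is listening on that port.
            return "refused"
        except OSError as exc:
            return f"network error: {exc}"
    return "connected"
```

Run from a pod, something like `probe("longhorn-admission-webhook.longhorn-system.svc", port)` with `port` taken from the Service definition; a "timeout" with working DNS would point at the overlay rather than at name resolution.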
c
The TX checksum issue is likely if you're using VXLAN.
b
Yeah... I might just create some systemd udev stuff to get round it, so the moment an interface comes up it auto-disables it.
Should I have to do this? Absolutely not. Lol
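For reference, a minimal sketch of what such a service or udev hook could run. It assumes the VXLAN interface is named `flannel.1` (flannel's default name) and that `ethtool` is installed; the function names are made up for illustration.

```python
import subprocess


def checksum_off_command(interface="flannel.1"):
    """Build the ethtool command that turns off TX checksum offload."""
    # Assumption: flannel.1 is the VXLAN device; adjust if yours differs.
    return ["ethtool", "-K", interface, "tx-checksum-ip-generic", "off"]


def disable_tx_checksum(interface="flannel.1", runner=subprocess.run):
    """Run the command; raises CalledProcessError if ethtool fails."""
    # runner is injectable so the logic can be exercised without root.
    return runner(checksum_off_command(interface), check=True)
```

Calling `disable_tx_checksum()` as root once the interface exists applies the setting. It does not persist across interface re-creation, which is why it needs a hook that fires whenever the interface comes up.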
It's weird because it's clearly communicating... I get leader election messages from the managers...
longhorn-manager time="2022-11-25T21:34:06Z" level=info msg="New upgrade leader elected: node-k98l-020"
longhorn-manager time="2022-11-25T21:34:13Z" level=info msg="New upgrade leader elected: node-k98l-033"
longhorn-manager time="2022-11-25T21:34:23Z" level=info msg="New upgrade leader elected: node-k98l-021"
longhorn-manager time="2022-11-25T21:34:35Z" level=info msg="New upgrade leader elected: node-k98l-022"
Ok so the plot thickens
I can fire up a smaller cluster and... All fine. Manager picks up, longhorn starts. This looks to be some sort of scalability issue
How many nodes does Longhorn support scaling-wise?