# harvester
h
Unfortunately, I don’t have a switch to check an 802.3ad config. But I don’t see this with other bond modes. Have you tried another bond mode to see if the issue also happens? Are you certain the switches are running the latest firmware? And what does the vendor of the switches say about this?
b
Will get the Juniper switches updated to the latest firmware. I’m required to use 802.3ad for this, but I’ll try setting up another system with a different bonding mode after verifying whether it also suffers from the same issue 👍
@happy-cat-90847 is there a mode you’d recommend for testing? The switch software is fully updated now, but bonds managed by the cluster controller still go down every 10 hours.
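(A hedged sketch of how one might gather data on the drops directly on an affected node; the interface name `bond-x` is a placeholder for whichever bond the network-controller actually created, not a real Harvester name.)

```sh
# Hedged sketch: inspect the bond state on an affected node.
# "bond-x" is a placeholder; substitute the bond interface created by the
# network-controller (see `ip -br link` for the real name).
cat /proc/net/bonding/bond-x                           # mode, MII status, LACP actor/partner state
ip -d link show type bond                              # bond details, including 802.3ad actor/partner info
journalctl -k --since "12 hours ago" | grep -i bond    # kernel bonding messages around the ~10h drop window
```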
h
You can use balance-alb - hopefully it will show you whether the same issue continues.
b
Switched to alb, but the problem remains, so I filed https://github.com/harvester/harvester/issues/3744 about it 👍
(that is, set up a completely fresh cluster first with 802.3ad, then switched)
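(For reference, a hedged sketch of where the bond mode lives for controller-managed networks; the VlanConfig resource and the `spec.uplink.bondOptions.mode` field path are recalled from recent Harvester releases and may differ by version, so verify against the cluster before editing.)

```sh
# Hedged sketch: inspect/change the uplink bond mode on a controller-managed
# network. Resource and field names (VlanConfig, spec.uplink.bondOptions.mode)
# are assumptions; confirm with `kubectl explain` on your cluster first.
kubectl get vlanconfigs.network.harvesterhci.io
kubectl explain vlanconfig.spec.uplink.bondOptions
kubectl edit vlanconfig <name>   # e.g. change bondOptions.mode from 802.3ad to balance-alb
```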
h
I’d argue this might be a problem with the switch or the NICs + switch. We don’t see this in our hardware. Have you tried installing SLES or another Linux with a similar kernel version on that hardware?
I’ll poke around and see if anyone can comment.
b
Yeah, that’s a possibility, though we see this across two types of switches and two types of NICs - the management bonds on the same machines are unaffected, but they’re also handled differently (by wickedd)
I guess a good test in that case is running management on the bonds we have issues with
Seems we’re seeing the same behaviour regardless of switch/NIC, as long as it’s a bond managed by the network-controller pods in Harvester
Updated the issue, hopefully someone will be able to help us experience that eureka moment 🙂
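(A hedged sketch of the comparison suggested above: check the wicked-managed management bond against a controller-managed bond on the same node, and pull the network-controller logs around a drop window. The `mgmt-bo` name and the `harvester-system` namespace are assumptions based on default Harvester naming; `bond-x` is a placeholder.)

```sh
# Hedged sketch: compare the two bonds on one node and collect controller logs.
# "mgmt-bo" (wicked-managed management bond) and the harvester-system namespace
# are assumed defaults; "bond-x" stands in for the controller-managed bond.
diff <(cat /proc/net/bonding/mgmt-bo) <(cat /proc/net/bonding/bond-x)
kubectl -n harvester-system get pods | grep -i network       # find the network-controller pods
kubectl -n harvester-system logs <network-controller-pod> --since=12h
```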