This message was deleted Rancher Users #longhorn-storage


# longhorn-storage

adamant-kite-43734

06/29/2023, 6:20 PM

This message was deleted.

famous-shampoo-18483

06/30/2023, 3:39 AM

Longhorn does not care whether the worker nodes come from the same vendor, but network conditions matter. When the network fluctuates, some replicas are marked as failed because the connection between the volume engine and its replicas times out. I guess this is one reason you are seeing replica failures in your cluster. If the fluctuation is unavoidable, you can look into why rebuilding is not triggered automatically; the longhorn manager logs should have the details. My thinking is that if Longhorn is waiting to reuse the failed replica, the rebuild will not start immediately.

quiet-area-89381

06/30/2023, 3:42 AM

But the replicas are all on-premises, in the same rack. It’s a strange correlation that the issue happens when the connection to the master nodes is down. I am puzzled, since it’s designed not to depend on those nodes. I’ll keep looking at the logs. Thanks for sharing your thoughts!
