This message was deleted Rancher Users #longhorn-storage



# longhorn-storage

adamant-kite-43734

12/28/2020, 9:08 AM

This message was deleted.

rhythmic-energy-29295

01/11/2021, 1:15 PM

Hey @kind-alarm-73406 and @famous-journalist-11332, my name is Snir and I'm working with @gifted-dentist-60252 on the highly available storage solution. I followed your instructions and recommendations from this conversation, along with the documentation. I created a k8s cluster with 3 masters, a Longhorn StorageClass, and a simple nginx Deployment with a Longhorn PVC mounted to it. I changed these settings:


Volume Attachment Recovery Policy: immediate
Pod Deletion Policy When Node is Down: delete-both-statefulset-and-deployment-pod

I also changed the default tolerations of the nginx pod to:


tolerations:
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 0
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 0
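For context, here is a minimal sketch of where these tolerations sit in a complete manifest (under `spec.template.spec` of the Deployment). It is not the actual manifest used in this test. The names `nginx-pvc` and `nginx`, the `1Gi` size, the image tag, and the mount path are hypothetical placeholders, and `longhorn` is assumed to be the StorageClass name. Only the volume name `nginx-storage` comes from the error output in this message.

```yaml
# Sketch only: nginx-pvc, nginx, 1Gi, and the mount path are placeholders,
# and "longhorn" is an assumed StorageClass name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  accessModes:
    - ReadWriteOnce          # exclusive attachment to a single node
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # Evict the pod as soon as the node is tainted not-ready/unreachable,
      # instead of waiting for the default toleration period.
      tolerations:
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 0
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 0
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: nginx-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: nginx-storage
          persistentVolumeClaim:
            claimName: nginx-pvc
```

With `tolerationSeconds: 0`, the pod is evicted as soon as the taint is applied, but the node still has to be detected as NotReady first.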

So k8s won't tolerate a node shutdown at all. Then I forced a shutdown of the machine currently running the nginx pod and followed the process from that moment until nginx was up and running with the Longhorn PVC again. The process looks like this: poweroff machine -> approx. 37 secs -> Node marked NotReady -> ContainerCreating immediately on new node for approx. 2 mins. It seems we're getting two common errors that delay the pod creation:


Multi-Attach error for volume "pvc-879f95fd-e69a-4c67-8b8c-d9fbca183edd" Volume is already exclusively attached to one node and can't be attached to another

Unable to attach or mount volumes: unmounted volumes=[nginx-storage], unattached volumes=[nginx-storage default-token-swdnp]: timed out waiting for the condition

I also noticed that two Longhorn workloads are sometimes unavailable, but I'm assuming they aren't relevant to my case:


longhorn-driver-deployer
longhorn-ui

This is what the Rancher events look like:
