
clean-lawyer-76009

02/03/2023, 4:52 PM
Hey, I installed Longhorn v1.4.0 on a bare-metal Kubernetes cluster running RHEL 8. When the cluster creates a PV from the Longhorn StorageClass, the PV state is Bound, but in Longhorn the volume state is Faulted:
Not ready for workload, Volume Faulted
The output of the csi-attacher log is:
I0203 10:10:21.981136       1 csi_handler.go:231] Error processing "csi-": failed to attach: rpc error: code = Aborted desc = volume pvc- is not ready for workloads
The folder on the host belongs to user root in group root. How can I drill down into this issue further? Could this be a permission issue?

quick-sandwich-76600

02/03/2023, 9:33 PM
Hi @clean-lawyer-76009. Could you please share the volume status from its YAML definition, so we can check whether it contains more information?
kubectl get volumes.longhorn.io pvc-xxxxxx -n longhorn-system -o yaml

clean-lawyer-76009

02/06/2023, 6:52 AM
Hey, here is the YAML of the PVC for the monitoring system that I deployed. It picked up the default StorageClass created by Longhorn, and the PVC and PV are bound. Still, the pod itself is stuck in status Init.
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-provisioner: driver.longhorn.io
      volume.kubernetes.io/storage-provisioner: driver.longhorn.io
    creationTimestamp: "2023-02-02T18:50:12Z"
    finalizers:
    - kubernetes.io/pvc-protection
    labels:
      app.kubernetes.io/instance: kube-prometheus-stack-prometheus
      app.kubernetes.io/managed-by: prometheus-operator
      app.kubernetes.io/name: prometheus
      operator.prometheus.io/name: kube-prometheus-stack-prometheus
      operator.prometheus.io/shard: "0"
      prometheus: kube-prometheus-stack-prometheus
    name: prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0
    namespace: monitoring
    resourceVersion: "18457791"
    uid: 7707d28d-d4ac-4120-b570-bc4fc50ed95a
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
    storageClassName: longhorn
    volumeMode: Filesystem
    volumeName: pvc-7707d28d-d4ac-4120-b570-bc4fc50ed95a
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 20Gi
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
So I reinstalled Longhorn and configured the nodes to schedule onto another disk on the VM. I can attach volumes to that disk, and that works. But when I deploy a StatefulSet, the pod fails with:
Warning  FailedMount             59s (x10 over 5m10s)   kubelet
  MountVolume.MountDevice failed for volume "pvc--6b51e214a520" : rpc error: code = Internal desc = format of disk "/dev/longhorn/pvc-" failed: type:("xfs") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io//globalmount") options:("defaults") errcode:(exit status 1) output:(mkfs.xfs: cannot open /dev/longhorn/pvc-: Device or resource busy
How do /var/lib/kubelet/.../globalmount and /dev/longhorn/pvc- relate?

narrow-egg-98197

02/07/2023, 6:53 AM
I think it might be caused by a multipath device being created on top of the Longhorn device. Could you take a look at this KB article (troubleshooting-volume-with-multipath) to see whether it resolves your issue?
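About the relation between /var/lib/kubelet/.../globalmount and /dev/longhorn/pvc-: /dev/longhorn/pvc- is the block device that Longhorn exposes on the node. In CSI (Container Storage Interface), the NodeStageVolume step formats that device if needed and mounts it at the global mount directory, and the NodePublishVolume step then bind mounts the global directory onto the container's volume directory. (ref. https://github.com/kubernetes-csi/docs/issues/24#issuecomment-408342071)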

clean-lawyer-76009

02/07/2023, 1:44 PM
Yes, the problem was this: https://longhorn.io/kb/troubleshooting-volume-with-multipath/ Configuring the blacklist in /etc/multipath.conf and restarting the service solved the issue. Thanks for your help, guys 👍🏻
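For anyone landing here later, the blacklist step can be sketched as a small shell helper. This is a minimal sketch, not the exact change made on this cluster: the function name `add_multipath_blacklist` is hypothetical, and the `devnode` pattern is the one I recall from the linked KB article, so check it against the article and narrow it if the host has real multipath devices that also match `sd*`.

```shell
#!/bin/sh
# Hypothetical helper: append a multipath blacklist stanza to a config file,
# unless the file already has a blacklist section.
# Assumption: the devnode pattern below matches the Longhorn KB article.
add_multipath_blacklist() {
    conf="${1:-/etc/multipath.conf}"

    # Create the file if it is missing so grep and >> behave predictably.
    [ -e "$conf" ] || : > "$conf"

    # An existing blacklist section should be merged by hand, so leave it alone.
    if grep -Eq '^[[:space:]]*blacklist[[:space:]]*\{' "$conf"; then
        echo "blacklist section already present in $conf, not modified"
        return 0
    fi

    cat >> "$conf" <<'EOF'
blacklist {
    devnode "^sd[a-z0-9]+"
}
EOF
    echo "blacklist section appended to $conf"
}
```

Run as root against /etc/multipath.conf, then restart the daemon (on RHEL 8 that should be `systemctl restart multipathd.service`) so the blacklist takes effect. The function can also be pointed at a scratch file first to preview the result.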