
red-printer-88208

11/17/2022, 2:47 PM
Hello everyone. We installed Longhorn as a storage class using kubectl, and everything was working well while we used the root disk for storing all the data. Now I created an additional disk and attached it to the VM via LVM. This was done specifically for the /var/lib/longhorn folder, as this contains the PVs.
-> df -h
Filesystem                 Size  Used Avail Use% Mounted on
/dev/sda3                  6.0G  1.5G  4.5G  24% /
devtmpfs                   7.9G     0  7.9G   0% /dev
/dev/sda2                   93M   32M   55M  37% /media/hdCnf
/dev/sda8                   43G  6.7G   34G  17% /media/hdVar
/dev/sda5                   93M   28K   86M   1% /media/hdUser
/dev/loop0                 2.0G   13M  1.9G   1% /media/loopfs
cgroup                     7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sda7                  100M   69M   31M  69% /boot
/dev/sda1                  200M   69M  131M  35% /run/media/sda1
/dev/mapper/vg01-longhorn  177G   29M  168G   1% /var/lib/longhorn
But now the PVs are not getting attached to the pods, and it keeps giving the error message "failed to attach the volume". Any idea why this might be happening?
Just to add, I can see these error messages in the kubelet log:
I1117 14:38:18.901278   20823 operation_generator.go:631] "MountVolume.WaitForAttach entering for volume \"pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6\") pod \"cassandra-0\" (UID: \"79de5cf0-e647-42d8-ab6b-f6ad546ce257\") DevicePath \"\"" pod="databases/cassandra-0"
E1117 14:38:18.922341   20823 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 podName: nodeName:}" failed. No retries permitted until 2022-11-17 14:39:22.922309835 +0000 UTC m=+4133.541601983 (durationBeforeRetry 1m4s). Error: MountVolume.WaitForAttach failed for volume "pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6" (UniqueName: "kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6") pod "cassandra-0" (UID: "79de5cf0-e647-42d8-ab6b-f6ad546ce257") : watch error:unknown (get volumeattachments.storage.k8s.io) for volume pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6
E1117 14:38:25.128335   20823 kubelet.go:1761] "Unable to attach or mount volumes for pod; skipping pod" err="unmounted volumes=[data], unattached volumes=[data kube-api-access-hxfkw]: timed out waiting for the condition" pod="databases/cassandra-0"
E1117 14:38:25.128379   20823 pod_workers.go:951] "Error syncing pod, skipping" err="unmounted volumes=[data], unattached volumes=[data kube-api-access-hxfkw]: timed out waiting for the condition" pod="databases/cassandra-0" podUID=79de5cf0-e647-42d8-ab6b-f6ad546ce257
I1117 14:39:22.995714   20823 operation_generator.go:631] "MountVolume.WaitForAttach entering for volume \"pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6\") pod \"cassandra-0\" (UID: \"79de5cf0-e647-42d8-ab6b-f6ad546ce257\") DevicePath \"\"" pod="databases/cassandra-0"
E1117 14:39:23.005036   20823 csi_attacher.go:712] kubernetes.io/csi: attachment for pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 failed: rpc error: code = DeadlineExceeded desc = volume pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 failed to attach to node nodegrid-kangie1i
E1117 14:39:23.005127   20823 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 podName: nodeName:}" failed. No retries permitted until 2022-11-17 14:41:25.005106365 +0000 UTC m=+4255.624398487 (durationBeforeRetry 2m2s). Error: MountVolume.WaitForAttach failed for volume "pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6" (UniqueName: "kubernetes.io/csi/driver.longhorn.io^pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6") pod "cassandra-0" (UID: "79de5cf0-e647-42d8-ab6b-f6ad546ce257") : rpc error: code = DeadlineExceeded desc = volume pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 failed to attach to node nodegrid-kangie1i
E1117 14:40:41.566329   20823 kubelet.go:1761] "Unable to attach or mount volumes for pod; skipping pod" err="unmounted volumes=[data], unattached volumes=[data kube-api-access-hxfkw]: timed out waiting for the condition" pod="databases/cassandra-0"
E1117 14:40:41.566372   20823 pod_workers.go:951] "Error syncing pod, skipping" err="unmounted volumes=[data], unattached volumes=[data kube-api-access-hxfkw]: timed out waiting for the condition" pod="databases/cassandra-0" podUID=79de5cf0-e647-42d8-ab6b-f6ad546ce257
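An aside on the durationBeforeRetry values in the kubelet log (first 1m4s, then 2m2s): they come from the volume manager's exponential backoff. A minimal sketch, assuming kubelet's usual defaults of a 500 ms initial delay, a doubling factor, and a 2m2s cap (verify these against your kubelet version):

```python
def retry_delays(initial=0.5, factor=2.0, cap=122.0, attempts=10):
    """Reproduce kubelet-style nestedpendingoperations backoff:
    each failed attach doubles the wait, capped at `cap` seconds."""
    delays, d = [], initial
    for _ in range(attempts):
        delays.append(d)
        d = min(d * factor, cap)
    return delays

print(retry_delays())
# the 8th delay is 64 s ("1m4s"); after that the 122 s cap ("2m2s") is hit
```

This is why the pod sits idle for minutes between attempts: the underlying attach keeps failing, and kubelet backs off all the way to the cap.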

high-butcher-71851

11/17/2022, 4:10 PM
Can you look at the logs on longhorn containers running on that host?

red-printer-88208

11/17/2022, 7:56 PM
Any specific pods you want to see? I am not getting much in the logs there. Getting this in the longhorn-admission-webhook pods:
longhorn-admission-webhook E1117 19:55:33.572066       1 gvks.go:69] failed to sync schemas: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
longhorn-admission-webhook E1117 19:55:39.717452       1 gvks.go:69] failed to sync schemas: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Getting this in the csi-attacher pods:
I1117 20:01:48.295114       1 csi_handler.go:231] Error processing "csi-33c686b932de2bfeb4b47b87e2228f8d67d8ddf5b464ec5b2694bda03c934c27": failed to attach: rpc error: code = DeadlineExceeded desc = volume pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6 failed to attach to node nodegrid-dihu0boh
I1117 20:01:48.295178       1 csi_handler.go:248] Attaching "csi-33c686b932de2bfeb4b47b87e2228f8d67d8ddf5b464ec5b2694bda03c934c27"
I1117 20:02:21.596999       1 csi_handler.go:231] Error processing "csi-ac21fe8af72459d262f3eb456b3d28955c97b45c52dd389bfe36960bf8ebc1d8": failed to attach: rpc error: code = DeadlineExceeded desc = volume pvc-ecb8de0a-16fc-4f41-b33f-74ce555b513f failed to attach to node nodegrid-dihu0boh
I1117 20:02:21.597066       1 csi_handler.go:248] Attaching "csi-ac21fe8af72459d262f3eb456b3d28955c97b45c52dd389bfe36960bf8ebc1d8"
Logs for the instance-manager-e pods:
[longhorn-instance-manager] time="2022-11-17T20:03:19Z" level=info msg="wait for gRPC service of process pvc-ecb8de0a-16fc-4f41-b33f-74ce555b513f-e-6a9e056c to start at localhost:10001"
[pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2] go-iscsi-helper: command failed: exit status 1
[pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2] panic: exit status 1

goroutine 170 [running]:
github.com/longhorn/go-iscsi-helper/iscsi.startDaemon(0xc000690000, 0x100)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/go-iscsi-helper/iscsi/target.go:232 +0x449
created by github.com/longhorn/go-iscsi-helper/iscsi.StartDaemon
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/go-iscsi-helper/iscsi/target.go:196 +0xad
[longhorn-instance-manager] time="2022-11-17T20:03:19Z" level=info msg="Process Manager: process pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2 error out, error msg: exit status 2"
[longhorn-instance-manager] time="2022-11-17T20:03:19Z" level=debug msg="Process update: pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2: state error: Error: exit status 2"
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=info msg="wait for gRPC service of process pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2 to start at localhost:10000"
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=info msg="stop waiting for gRPC service of process pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2 to start at localhos
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=info msg="wait for gRPC service of process pvc-ecb8de0a-16fc-4f41-b33f-74ce555b513f-e-6a9e056c to start at localhost:10001"
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=debug msg="Process update: pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2: state error: Error: exit status 2"
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=debug msg="Process Manager: start getting logs for process pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2"
[longhorn-instance-manager] time="2022-11-17T20:03:20Z" level=debug msg="Process Manager: got logs for process pvc-0cbe60b1-45a9-48c4-b31a-a4a97d8eecd6-e-3086ddb2"

high-butcher-71851

11/17/2022, 9:13 PM
go-iscsi-helper: command failed: exit status 1
Is that service running on the machine?
https://github.com/longhorn/go-iscsi-helper/blob/master/iscsi/target.go#L231 seems like the start daemon is throwing that error
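Since startDaemon shells out to the tgt userspace tools, a quick way to vet a node is to check that the iSCSI binaries are on PATH. A sketch of such a check; the exact binary set here is an assumption (Longhorn's engine needs the tgt tools on the host, and open-iscsi's iscsiadm is also commonly required), so confirm it against the Longhorn docs for your version:

```python
import shutil

# Assumed prerequisite binaries; verify the list for your Longhorn version.
REQUIRED = ("tgtd", "tgtadm", "iscsiadm")

def missing_binaries(names=REQUIRED, path=None):
    """Return the binaries from `names` that cannot be found on PATH
    (or on the explicit `path` string, if given)."""
    return [n for n in names if shutil.which(n, path=path) is None]

if missing_binaries():
    print("missing:", ", ".join(missing_binaries()))
```

Running this on the node would have flagged the absent tgtd before the instance manager panicked.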

red-printer-88208

11/17/2022, 10:03 PM
You're right, it's not there. I'll try to install it and see if this gets resolved. I'll post the results here. Thank you for the support.

high-butcher-71851

11/18/2022, 3:13 AM
https://staging--longhornio.netlify.app/docs/0.8.1/deploy/install/ there's a script there called environment_check.sh that could help too

red-printer-88208

11/18/2022, 2:08 PM
I got tgt installed:
root@nodegrid-dihu0boh:~# tgtd -V
1.0.63
root@nodegrid-dihu0boh:~# tgtadm -V
1.0.63
root@nodegrid-dihu0boh:~# /etc/init.d/tgtd status
tgtd is running
but I am still getting the error:
[longhorn-instance-manager] time="2022-11-18T13:59:55Z" level=info msg="wait for gRPC service of process pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e to start at localhost:10000"
[pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e] go-iscsi-helper: command failed: exit status 1
[pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e] panic: exit status 1

goroutine 166 [running]:
github.com/longhorn/go-iscsi-helper/iscsi.startDaemon(0xc000010008, 0x100)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/go-iscsi-helper/iscsi/target.go:232 +0x449
created by github.com/longhorn/go-iscsi-helper/iscsi.StartDaemon
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/go-iscsi-helper/iscsi/target.go:196 +0xad
[longhorn-instance-manager] time="2022-11-18T13:59:55Z" level=info msg="Process Manager: process pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e error out, error msg: exit status 2"
[longhorn-instance-manager] time="2022-11-18T13:59:55Z" level=debug msg="Process update: pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e: state error: Error: exit status 2"
[longhorn-instance-manager] time="2022-11-18T13:59:56Z" level=info msg="wait for gRPC service of process pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e to start at localhost:10000"
[longhorn-instance-manager] time="2022-11-18T13:59:56Z" level=debug msg="Process update: pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e: state error: Error: exit status 2"
[longhorn-instance-manager] time="2022-11-18T13:59:56Z" level=debug msg="Process Manager: start getting logs for process pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e"
[longhorn-instance-manager] time="2022-11-18T13:59:56Z" level=debug msg="Process Manager: got logs for process pvc-d2297f7c-99be-4040-9c1b-d3d5fe13b522-e-61ca102e"

high-butcher-71851

11/18/2022, 6:55 PM
What is tgt? What does the output of that environment_check.sh script say?

red-printer-88208

11/18/2022, 7:09 PM
The script doesn't work on our Linux distribution; it's not supported. We are running a Yocto Linux system.

high-butcher-71851

11/18/2022, 8:11 PM
Cool, I'd never heard of that Linux. Not sure if it's supported by Longhorn. I wonder if Longhorn is expecting that executable anyway. You could try searching the Longhorn issues; not sure if this one is relevant to you: https://github.com/longhorn/longhorn/issues/1984

red-printer-88208

11/18/2022, 8:16 PM
Longhorn works fine there as long as everything is running on the boot disk. I have installed all the dependencies manually with .ipk files. This only happens when we try to mount an additional disk for Longhorn data storage.
👍 1
I was able to add the additional disk to Longhorn by using the Longhorn UI, so at least all the required packages are working. Now the question is: how do I tell Longhorn through configuration to use the additional disk I mounted? Are there any values I can update in the Helm values.yaml file to achieve this? I am not able to figure this out. The docs say that we can set defaultSettings.defaultDataPath. I set it to the mount point of the additional disk, but it does nothing.
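For reference, the chart value mentioned above looks like this in values.yaml (the path below is a hypothetical example). One likely explanation for it "doing nothing": defaultSettings generally only take effect on a fresh install or for nodes/disks created afterwards, so changing it on an existing cluster does not move or re-register disks on existing nodes; those have to be edited through the UI or the node API.

```yaml
defaultSettings:
  # Path Longhorn uses when creating the default disk for a new node.
  # Example mount point; does not retroactively change existing nodes.
  defaultDataPath: /media/longhorn-extra
```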

high-butcher-71851

11/21/2022, 5:45 PM
I’m not sure if you can do it outside of the UI

red-printer-88208

11/21/2022, 5:47 PM
Yeah, I was hoping there would be an option, but I'll probably work on writing a Go CLI that can talk to the Longhorn API to do that. I'll have to check whether something like that works.
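For what it's worth, the Longhorn UI itself drives the longhorn-manager REST API, so a CLI can do the same thing the UI's "edit node disks" dialog does. A rough sketch of building the request body; the endpoint shape (POST /v1/nodes/<node>?action=diskUpdate) and the field names here are assumptions inferred from how the UI talks to the manager, so verify them against your Longhorn version before relying on this:

```python
import json

def disk_update_payload(existing_disks, name, path, storage_reserved=0, tags=None):
    """Build the body for a hypothetical diskUpdate call against the
    longhorn-manager API: keep the node's existing disks and add one more."""
    disks = dict(existing_disks)
    disks[name] = {
        "path": path,                  # mount point of the new disk
        "allowScheduling": True,       # let Longhorn place replicas here
        "storageReserved": storage_reserved,
        "tags": tags or [],
    }
    return {"disks": disks}

body = disk_update_payload({}, "extra-disk", "/media/longhorn-extra")
print(json.dumps(body, indent=2))
```

A CLI would fetch the node's current disks first, merge in the new entry as above, and post the result back, since the API replaces the whole disk map rather than appending to it.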