# longhorn-storage
I read that it may mean we have a hardware issue. So maybe the disks are not doing well.
oh, seeing that kind of stuff in dmesg is not good
```
[202325.420295] blk_update_request: critical medium error, dev sdd, sector 290816 op 0x1:(WRITE) flags 0x800 phys_seg 5 prio class 0
[202325.423010] EXT4-fs warning (device sdd): ext4_end_bio:344: I/O error 7 writing to inode 13 starting block 36357)
[202325.423099] Buffer I/O error on device sdd, logical block 36352
[202325.424493] Buffer I/O error on device sdd, logical block 36353
[202325.425952] Buffer I/O error on device sdd, logical block 36354
[202325.427393] Buffer I/O error on device sdd, logical block 36355
[202325.429003] Buffer I/O error on device sdd, logical block 36356
```
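Before blaming the disk, it might be worth a SMART check. A minimal sketch, assuming smartmontools is installed on the node; if sdd turns out to be a Longhorn/iSCSI device rather than a physical disk, smartctl will just error out, which is itself a clue:
```sh
# Overall SMART health verdict for the disk (requires smartmontools).
smartctl -H /dev/sdd

# Pull the attributes that usually signal failing media.
smartctl -a /dev/sdd | grep -Ei 'reallocated|pending|uncorrect'
```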
All the error messages I see are related to SCSI devices created by CSI, not the actual hardware device (sda on this server).
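To double-check which sdX devices are Longhorn attachments versus real disks, something like this should work (a sketch; the iscsi transport and the /dev/longhorn symlinks are how Longhorn's iSCSI frontend normally surfaces volumes, so treat the exact output as an assumption):
```sh
# Longhorn volumes appear as SCSI disks with transport "iscsi";
# physical disks show sata/nvme/sas instead.
lsblk -S -o NAME,VENDOR,MODEL,TRAN

# Longhorn also symlinks each attached volume to its sdX device,
# which maps volume/PVC names to kernel device names.
ls -l /dev/longhorn/
```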
I see
```
[Thu Aug 10 09:40:45 2023] EXT4-fs (sdb): I/O error while writing superblock
[Thu Aug 10 09:40:45 2023] EXT4-fs (sdb): Remounting filesystem read-only
```
and also
```
sd 4:0:0:1: Power-on or device reset occurred
```
many times on different CSI devices.
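A rough way to count how often that happens and which devices are affected (just greps over the kernel messages quoted above):
```sh
# Tally reset and I/O error events per device from the kernel log.
dmesg -T \
  | grep -E 'Power-on or device reset occurred|I/O error|critical medium error' \
  | grep -oE 'sd[a-z]+|[0-9]+:[0-9]+:[0-9]+:[0-9]+' \
  | sort | uniq -c | sort -rn
```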
In the longhorn-csi-plugin container on that node's kubelet I see this, related to the same volume:
```
time="2023-08-10T09:44:12Z" level=error msg="NodeGetVolumeStats: req: {\"volume_id\":\"pvc-2e1be23b-4021-45d2-878b-d37cd71679dc\",\"volume_path\":\"/var/snap/microk8s/common/var/lib/kubelet/pods/3a29992b-070e-4072-9e54-1411274dfbff/volumes/kubernetes.io~csi/pvc-2e1be23b-4021-45d2-878b-d37cd71679dc/mount\"} err: rpc error: code = Internal desc = Get \"http://longhorn-backend:9500/v1/volumes/pvc-2e1be23b-4021-45d2-878b-d37cd71679dc\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
```
I also see a ton of debug/info-level logs. Is that because this cluster uses Longhorn version 1.4.0-dev?
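If you want to quantify that, something like this gives a per-level breakdown from one manager pod (longhorn-manager runs as a DaemonSet, so ds/ picks a single pod):
```sh
# Rough count of log lines per level from a longhorn-manager pod.
kubectl -n longhorn-system logs ds/longhorn-manager --tail=5000 \
  | grep -oE 'level=[a-z]+' | sort | uniq -c | sort -rn
```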
The workaround for now is to redeploy the deployment / terminate the faulty pod.
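For anyone else hitting this, the workaround as commands; the names are placeholders for your own workload:
```sh
# Restart the whole deployment (recreates every pod)...
kubectl -n <app-namespace> rollout restart deployment/<deployment-name>

# ...or just delete the faulty pod and let its controller recreate it.
kubectl -n <app-namespace> delete pod <faulty-pod-name>
```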