# longhorn-storage
b
i'm in a state where i can mount volume-head-000.img just fine, but the UI says `State: Detached`, `Health: Faulted`, `Ready for workload: Not Ready`. how do i reset this volume?
c
b
so do i want revision counter or not? in this case (i have a few faults), there is only one img file
c
What do you mean by a few faults and only one image file? Do you mean you have some failed volumes, but can only see one volume-head-xxx.img in the directory `/var/lib/longhorn/replicas/` on the worker node?
b
after a 2h net outage between k8s nodes, i have 3 new faulted vols out of 20
considering the ollama fault, i indeed have only 1 volume-head-*.img and it is mountable (outside of LH)
```
/d4/longhorn/replicas/pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61-20194b66# head *meta
==> volume-head-000.img.meta <==
{"Name":"volume-head-000.img","Parent":"","Removed":false,"UserCreated":false,"Created":"2025-02-20T01:22:20Z","Labels":null}
==> volume.meta <==
{"Size":53687091200,"Head":"volume-head-000.img","Dirty":true,"Rebuilding":false,"Error":"","Parent":"","SectorSize":512,"BackingFilePath":""}
```
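(Side note for anyone reading along: the `Dirty` flag lives in that `volume.meta` JSON, one object per file. A minimal sketch of reading it programmatically, with the metadata contents copied from above and `python3` assumed available on the node:)

```shell
# Recreate the volume.meta shown above, then read Head and Dirty from it.
cat > volume.meta <<'EOF'
{"Size":53687091200,"Head":"volume-head-000.img","Dirty":true,"Rebuilding":false,"Error":"","Parent":"","SectorSize":512,"BackingFilePath":""}
EOF
python3 - <<'EOF'
import json
with open("volume.meta") as f:
    meta = json.load(f)
# Prints the active head image name and whether Longhorn considers it dirty.
print(meta["Head"], meta["Dirty"])
EOF
```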
i e2fsck'd the img myself, so that `"Dirty":true` field is probably stale
clicking salvage in the UI fails with this: "unable to salvage volume pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61: disk with UUID 8a3b890b-5111-4821-ba57-cbcc99669852 on node dash is unschedulable for replica pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61-r-76a68e05"
c
> /d4/longhorn/replicas/pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61-20194b6

That is the directory for the volume pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61; there should be other directories for the other volumes in `/d4/longhorn/replicas/`.

> unable to salvage volume pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61: disk with UUID 8a3b890b-5111-4821-ba57-cbcc99669852 on node dash is unschedulable for replica pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61-r-76a68e05

Do you have enough storage to schedule the replica? Could you provide the support bundle for investigation?
b
correct dir; diskfree looks ok
c
Is node `dash` available or ready?
b
yes - GH issue coming up
support bundle has been going for many minutes: 33%
๐Ÿ‘ 1
c
Could you provide the information from the command

```
kubectl -n longhorn-system get nodes.longhorn.io dash -oyaml
```

and

```
kubectl -n longhorn-system get lhr pvc-ec1d7d67-ab8f-416d-b1b2-f876add59c61-r-76a68e05 -oyaml
```
b
dashnode.yaml
ollama-pvc.yaml
c
```yaml
d4:
      conditions:
      - lastProbeTime: ""
        lastTransitionTime: "2024-04-22T07:25:14Z"
        message: Disk d4(/d4/longhorn) on node dash is ready
        reason: ""
        status: "True"
        type: Ready
      - lastProbeTime: ""
        lastTransitionTime: "2025-04-06T05:02:41Z"
        message: Disk d4 (/d4/longhorn) on the node dash has 247463936000 available,
          but requires reserved 0, minimal 25% to schedule more replicas
        reason: DiskPressure
        status: "False"
        type: Schedulable
      diskDriver: ""
      diskName: ""
      diskPath: /d4/longhorn
      diskType: filesystem
      diskUUID: 8a3b890b-5111-4821-ba57-cbcc99669852
```
This disk `d4` on the node `dash` cannot be scheduled, so the salvage returned that error.
b
```
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb2       938G  708G  183G  80% /d4
```

> message: Disk d4 (/d4/longhorn) on the node dash has 247463936000 available, but requires reserved 0, minimal 25% to schedule more replicas

^ you're seeing that?
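(For what it's worth, the figures do add up to "unschedulable" under Longhorn's default 25% minimal-available setting. A rough shell check, with byte totals approximated from the df output above, so treat the exact numbers as illustrative:)

```shell
# Rough re-check of Longhorn's schedulability condition for disk d4.
total=1007169830912   # 938 GiB (df "Size") in bytes, approximate
avail=247463936000    # storageAvailable reported in the node condition
min_pct=25            # default storageMinimalAvailablePercentage
need=$(( total * min_pct / 100 ))
if [ "$avail" -ge "$need" ]; then
  echo "schedulable"
else
  echo "unschedulable (need >= $need bytes free)"
fi
```

With these numbers, `need` comes out to about 251.8 GB while only about 247.5 GB is available, which matches the `DiskPressure` condition.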
c
Yes. The `status` needs to be "True":

```yaml
status: "False"
type: Schedulable
```
b
80% used needs to be under 75%?
and would you agree this is a UI bug? 🙂
c
> and would you agree this is a UI bug?

Do you mean the UI should not allow users to do the salvage if the disk cannot be scheduled?
b
the status of the volume should include "my disk is unsched because it is 80% full; need 75% or below"
c
Yes, that could be an improvement.
b
i'll add it to the bug
thx for looking at this - the data was just LLM caches, but we got to make a good bug report 🙂
๐Ÿ‘ 1
"failed to create SupportBundle: admission webhook \"validator.longhorn.io\" denied the request: please try again later. Another support-bundle-2025-04-10t07-19-12z is in Generating phase" <-- this is kind of a mess