# harvester
f
Since it mounts via the label, I am not sure that this is the issue here. But Longhorn sure doesn't like the disk.
e
The device nodes `/dev/sdX` exist mostly for backwards compatibility. They are convenient if you only have one disk in your system, but they have problems otherwise. You can use the `/dev/disk/by-id/$ID` or `/dev/disk/by-path/$PATH` naming schemes during installation to configure the disks. The ID and the path are fixed attributes that don't change on reboot; they only change if you change the disk hardware: https://docs.harvesterhci.io/v1.4/install/harvester-configuration#installdata_disk

The device nodes `/dev/sda`, `/dev/sdb`, `/dev/sdc` can change which disk they refer to because they are populated during boot in the order the devices show up. When booting, the Linux kernel queries the device buses of your motherboard to find out what hardware is plugged in. This happens in parallel to speed things up, but it is not guaranteed to yield repeatable results, so the order in which device nodes are populated can switch around. To get around this you have to use device naming based on physical properties. For disks, the device nodes with stable names are the aforementioned `/dev/disk/by-*`. This is the same reason why network devices aren't named `eth0`, `eth1`, `eth2`... anymore, but `enp3s0f7` or `wlp3s0`: those names are generated from physical properties, and it's guaranteed that the same interface gets the same name regardless of when it shows up.
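For example, a quick way to find the stable names to put into the install config (the actual IDs are hardware-specific, so treat these as illustrative commands, not exact values):

```
# List the stable, hardware-derived names and see which /dev/sdX node
# each one currently points to.
ls -l /dev/disk/by-id/
ls -l /dev/disk/by-path/

# Cross-check serial numbers and current mount points per device.
lsblk -o NAME,SERIAL,WWN,MOUNTPOINT
```

One of the by-id or by-path entries can then be used for the data disk setting described in the Harvester configuration docs linked above.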
f
I agree, the disk is correct and is mounted via the label. But for some reason it says:

Message: Disk default-disk-e863a01910a9739(/var/lib/harvester/defaultdisk) on node harvester-cnf2j is not ready: record diskUUID doesn't match the one on the disk

But the disk is mounted and the UUID matches the longhorn-disk.cfg file.
Is there any other place I need to verify that the UUID matches?
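For reference, this is how I read the value recorded on the disk itself (path taken from the error above; the exact JSON layout is illustrative):

```
# On the node, read the UUID that Longhorn wrote onto the mounted disk.
cat /var/lib/harvester/defaultdisk/longhorn-disk.cfg
# Typically a small JSON blob along the lines of {"diskUUID":"<uuid>"}.
```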
I am looking into the Longhorn logs:

```
kubectl logs -n longhorn-system -l app=longhorn-manager --max-log-requests 6 --follow
time="2025-03-05T20:29:37Z" level=error msg="Dropping Longhorn replica out of the queue" func=controller.handleReconcileErrorLogging file="utils.go:79" Replica=longhorn-system/pvc-623160fe-c5d5-412f-9683-f07f221d4c24-r-87452d70 controller=longhorn-replica error="failed to sync replica for longhorn-system/pvc-623160fe-c5d5-412f-9683-f07f221d4c24-r-87452d70: Failed to get instance process pvc-623160fe-c5d5-412f-9683-f07f221d4c24-r-87452d70: invalid Instance Manager instance-manager-e90d2bfef72ec289d8bcea1eb1e3262c, state: error, IP: 10.52.3.97" node=harvester-cnf2j
time="2025-03-05T20:29:37Z" level=error msg="Dropping Longhorn replica out of the queue" func=controller.handleReconcileErrorLogging file="utils.go:79" Replica=longhorn-system/pvc-b0bc2fae-00b8-4f80-a666-cc9c84520969-r-17eb71b2 controller=longhorn-replica error="failed to sync replica for longhorn-system/pvc-b0bc2fae-00b8-4f80-a666-cc9c84520969-r-17eb71b2: Failed to get instance process pvc-b0bc2fae-00b8-4f80-a666-cc9c84520969-r-17eb71b2: invalid Instance Manager instance-manager-e90d2bfef72ec289d8bcea1eb1e3262c, state: error, IP: 10.52.3.97" node=harvester-cnf2j
time="2025-03-05T20:29:37Z" level=error msg="Dropping Longhorn replica out of the queue" func=controller.handleReconcileErrorLogging file="utils.go:79" Replica=longhorn-system/pvc-14a02af3-dbc1-4df1-b37c-2431b474fe87-r-129c1a86 controller=longhorn-replica error="failed to sync replica for longhorn-system/pvc-14a02af3-dbc1-4df1-b37c-2431b474fe87-r-129c1a86: Failed to get instance process pvc-14a02af3-dbc1-4df1-b37c-2431b474fe87-r-129c1a86: invalid Instance Manager instance-manager-e90d2bfef72ec289d8bcea1eb1e3262c, state: error, IP: 10.52.3.97" node=harvester-cnf2j
```
I am not really sure where to go from here.
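Since the errors point at an instance manager in an error state, one way to dig further might be something like this (pod name copied from the log above; adjust to your cluster):

```
# Check whether the instance manager pods on the node are healthy.
kubectl -n longhorn-system get pods -o wide | grep instance-manager

# Inspect the specific instance manager named in the errors.
kubectl -n longhorn-system describe pod instance-manager-e90d2bfef72ec289d8bcea1eb1e3262c
kubectl -n longhorn-system logs instance-manager-e90d2bfef72ec289d8bcea1eb1e3262c --tail=100
```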
e
Hi John, is this the same issue as Christian's or a different one? It would probably be better to ask in the Longhorn channel #CC2UQM49Y, since there are more knowledgeable people there with regard to Longhorn. In any case, first I'd make sure that you won't have the problem of device nodes switching around by using one of the stable references. Longhorn also keeps information about which disks it expects on which node in the corresponding node.longhorn.io/v1beta2 objects. Check that these match what's actually there. That being said, without a more detailed description of your environment it's hard to tell what's going on.
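A minimal sketch of that comparison, assuming the node name and disk path from the messages above (field layout may differ slightly between Longhorn versions):

```
# What Longhorn has recorded for the disks on this node; look under
# status.diskStatus for the diskUUID it expects to find on each disk.
kubectl -n longhorn-system get nodes.longhorn.io harvester-cnf2j -o yaml

# The UUID actually stored on the mounted disk; the two should match.
cat /var/lib/harvester/defaultdisk/longhorn-disk.cfg
```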
w
@flat-librarian-14243 and I are referring to the same cluster - so yes, it is the same issue 🙂 Thanks for pointing us to the correct channel. I’ll repost this thread there.
r
Same cluster, more information: we are using the stable references for everything except the extra disk for storage. When adding that disk, it is listed with "legacy" names only, and then when the server boots things move around. But I will search in the storage channel.
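If it helps, a way to map a "legacy" name shown in the UI back to its stable aliases (the device node /dev/sdb is hypothetical; substitute the one the UI shows):

```
# Print every symlink (by-id, by-path, ...) that currently points at this device.
udevadm info --query=symlink --name=/dev/sdb
```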