# harvester
s
Hi @quaint-alarm-7893, could you provide the support bundle? We can check it for you. BTW, are the problematic disks extra ones?
q
@salmon-city-57654 thanks! attached is the bundle. and yes, these are all extra disks.
s
Thanks @quaint-alarm-7893, I will take a look at this issue.
Hi @quaint-alarm-7893, could you try deleting your node-disk-manager pod on node-1? It should respawn automatically. Then check the disk status on node-1 again.
q
@salmon-city-57654 i ran:
Copy code
k delete po harvester-node-disk-manager-vv7qw -n harvester-system
the pod was removed and a new one spun up. i still can't add any disks back using the edit storage / add disk option, and no disks show up in longhorn.
s
Hi @quaint-alarm-7893, could you generate a support bundle again? I would like to check the failure after the node-disk-manager respawned.
q
hi @salmon-city-57654 any update on this?
s
Hi @quaint-alarm-7893, sorry for the late update here. I checked your first SB and found the issue [ENHANCEMENT] Should respawn the udev monitor if any errors. However, the StorageReserved in your environment is 0. This value should be updated by the longhorn-manager. I will check at the code level to find the root cause. So, is your environment still in the same situation?
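For context, StorageReserved is set per-disk on the Longhorn node CR; a minimal sketch of where to look, with an illustrative disk name and path (view yours with kubectl get nodes.longhorn.io <node> -n longhorn-system -o yaml):
Copy code
spec:
  disks:
    default-disk-abc123:                      # illustrative disk name
      path: /var/lib/harvester/defaultdisk    # illustrative path
      allowScheduling: true
      storageReserved: 0                      # <- the value longhorn-manager is expected to update
      tags: []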
q
yeah, same issue. basically i have a bunch of disks i can't reconfigure and start using again.
@salmon-city-57654 i tried formatting them and everything. i can't seem to figure out how to get longhorn or harvester to let me use them
s
Hi @quaint-alarm-7893, I checked your SB. I think it is because the instance-manager-r on node2 is missing, so no replicas can be scheduled to node2. Did you perform any operations on node2?
The last delete timestamp is 2023-12-19.
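A quick way to check this, assuming Longhorn runs in the default longhorn-system namespace:
Copy code
# list instance-manager pods and the node each one runs on
kubectl get pods -n longhorn-system -o wide | grep instance-manager
# or look at the Longhorn InstanceManager CRs directly
kubectl get instancemanagers -n longhorn-system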
q
@salmon-city-57654 nothing i recall. but it looks like it's disabled in l.h. no disks show up.
@salmon-city-57654 anything?
s
Hi @quaint-alarm-7893, do you remember any operations on node2? It seems all disk statuses on harvester-node2 have been removed
Copy code
spec:
    allowScheduling: true
    disks: {} # <-- should not be empty
    engineManagerCPURequest: 0
    evictionRequested: false
    name: harvester-02
    replicaManagerCPURequest: 0
    tags: []
I can still see some blockdevices CRs for harvester-node-2. Could you try deleting the node-disk-manager pod on harvester-node-2?
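A sketch of that step (the pod name below is illustrative; the real name ends in a random suffix):
Copy code
# find the node-disk-manager pod scheduled on harvester-node-2
kubectl get pods -n harvester-system -o wide | grep node-disk-manager
# delete the one running on harvester-node-2 (illustrative name)
kubectl delete pod harvester-node-disk-manager-xxxxx -n harvester-system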
q
i've deleted the pod, where can i check the disk statuses again?
s
check the nodes.longhorn.io CR
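For example, assuming the default longhorn-system namespace:
Copy code
kubectl get nodes.longhorn.io harvester-node-2 -n longhorn-system -o yaml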
q
@salmon-city-57654 still shows the same, empty. i think when i was initially trying to get it to work, i deleted the disks in lh. maybe that's why? any way i can bring them back?
s
Can you try to check with the following command?
Copy code
# kubectl get replicas -n longhorn-system | grep harvester-node-2
if there are no replicas on node-2, you can remove all blockdevice CRs on node-2 and re-provision the disks again.
q
there are no replicas. can you tell me the command to remove blockdevice?
s
Copy code
# kubectl get blockdevices -A | grep harvester-node-2
that should list all the blockdevices on node-2
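A minimal sketch of the removal step, assuming the blockdevice CRs live in the longhorn-system namespace as in a default Harvester install (replace <blockdevice-name> with a name from the list above):
Copy code
kubectl delete blockdevice <blockdevice-name> -n longhorn-system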
q
okay, so i got the devices available to "add" now in the storage tab in harvester. quick question: it looks like when i cleared everything out, i also cleared the "default-disk" storage. is that going to be an issue, or is it fine? i just want to make sure, because at this point i could just nuke the node and make a new one if that's a better solution.
@salmon-city-57654 ^^
s
the default disk is not involved in the blockdevices CRs (if you mean the longhorn default disk created during installation).
Did you see the default disk in the "add" drop-down menu on the harvester -> storage page?
q
just /dev/sda - /dev/sdq
when i'm talking about the default-disk, i'm talking about the one automatically provisioned as part of the harvester install.
@salmon-city-57654
@salmon-city-57654 ^^
s
Hi @quaint-alarm-7893, I cannot find the related default disk on your node2. Do you remember what operations happened after the last time you saw replicas on node2? Alternatively, you could move the VMs away from node2 and then kick out -> rejoin node2; that would be cleaner. Or you can skip the default disk, since there are still other disks for the replicas.
q
i deleted all disks in both longhorn and harvester when i was having the issues. i likely deleted the default disk too. is it needed for things to operate properly, or is it not really necessary?
s
If you have other disks, the default disk can be ignored
👍 1