# harvester
s
Hi @quaint-alarm-7893, could you provide the support bundle? We can check it for you. BTW, are the problematic disks extra ones?
q
@salmon-city-57654 thanks! attached is the bundle. and yes, these are all extra disks.
s
Thanks @quaint-alarm-7893, I will take a look at this issue.
Hi @quaint-alarm-7893, could you try deleting your node-disk-manager pod on node-1? It should respawn automatically. Then check the disk status on node-1 again.
q
@salmon-city-57654 i ran:
Copy code
k delete po harvester-node-disk-manager-vv7qw -n harvester-system
the pod was removed and a new one spun up. i still can't add any disks back using the edit storage / add disk option, and no disks show up in longhorn.
s
Hi @quaint-alarm-7893, could you generate a support bundle again? I would like to check the failure after the node-disk-manager respawned.
q
hi @salmon-city-57654 any update on this?
s
Hi @quaint-alarm-7893, sorry for the late update here. I checked your first SB and found the issue [ENHANCEMENT] Should respawn the udev monitor if any errors. However, the StorageReserved in your environment is 0. This value should be updated by the longhorn-manager. I will check at the code level to find the root cause. So, is your environment still in the same situation?
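For context, StorageReserved is set per-disk on the Longhorn node CR; a minimal sketch of where to look, with an illustrative disk name and path (view yours with kubectl get nodes.longhorn.io <node> -n longhorn-system -o yaml):
Copy code
spec:
  disks:
    default-disk-abc123:                      # illustrative disk name
      path: /var/lib/harvester/defaultdisk    # illustrative path
      allowScheduling: true
      storageReserved: 0                      # <- the value longhorn-manager is expected to update
      tags: []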
q
yeah, same issue. basically i have a bunch of disks i can't reconfigure and start using again.
@salmon-city-57654 i tried formatting them and everything. i can't seem to figure out how to get longhorn or harvester to let me use them
s
Hi @quaint-alarm-7893, I checked your SB. I think it is because the instance-manager-r on node2 is missing, so no replicas can be scheduled to node2. Did you perform any operations on node2?
The last delete timestamp is 2023-12-19.
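A quick way to check this, assuming Longhorn runs in the default longhorn-system namespace:
Copy code
# list instance-manager pods and the node each one runs on
kubectl get pods -n longhorn-system -o wide | grep instance-manager
# or look at the Longhorn InstanceManager CRs directly
kubectl get instancemanagers -n longhorn-system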
q
@salmon-city-57654 nothing i recall. but it looks like it's disabled in l.h. no disks show up.
@salmon-city-57654 anything?
s
Hi @quaint-alarm-7893, do you remember any operations on node2? It seems all disk statuses on harvester-node2 have been removed
Copy code
spec:
    allowScheduling: true
    disks: {} # <-- should not be empty
    engineManagerCPURequest: 0
    evictionRequested: false
    name: harvester-02
    replicaManagerCPURequest: 0
    tags: []
I can still see some blockdevices CRs for harvester-node-2. Could you try deleting the node-disk-manager pod on harvester-node-2?
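A sketch of that step (the pod name below is illustrative; the real name ends in a random suffix):
Copy code
# find the node-disk-manager pod scheduled on harvester-node-2
kubectl get pods -n harvester-system -o wide | grep node-disk-manager
# delete the one running on harvester-node-2 (illustrative name)
kubectl delete pod harvester-node-disk-manager-xxxxx -n harvester-system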
q
i've deleted the pod, where can i check the disk statuses again?
s
check the nodes.longhorn.io CR
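For example, assuming the default longhorn-system namespace:
Copy code
kubectl get nodes.longhorn.io harvester-node-2 -n longhorn-system -o yaml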
q
@salmon-city-57654 still shows the same, empty. i think when i was initially trying to get it to work, i deleted the disks in lh. maybe that's why? any way i can bring them back?
s
Can you try to check with the following command?
Copy code
# kubectl get replicas -n longhorn-system | grep harvester-node-2
if there are no replicas on node-2, you can remove all blockdevice CRs on node-2 and re-provision the disks again.
q
there are no replicas. can you tell me the command to remove blockdevice?
s
Copy code
# kubectl get blockdevices -A | grep harvester-node-2
that should list all the blockdevices on node-2
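A minimal sketch of the removal step, assuming the blockdevice CRs live in the longhorn-system namespace as in a default Harvester install (replace <blockdevice-name> with a name from the list above):
Copy code
kubectl delete blockdevice <blockdevice-name> -n longhorn-system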
q
okay, so i got the devices available to "add" now in the storage tab in harvester. quick question: it looks like when i cleared everything out, i also cleared the "default-disk" storage. is that going to be an issue, or is it fine? i just want to make sure, because at this point i could just nuke the node and make a new one if that's a better solution.
@salmon-city-57654 ^^
s
the default disk is not involved in the blockdevices CRs (if you mean the longhorn default disk created during installation).
Did you see the default disk in the "add" drop-down menu on the harvester -> storage page?
q
just /dev/sda - /dev/sdq
when i'm talking about the default-disk, i'm talking about the one automatically provisioned as part of the harvester install.
@salmon-city-57654
@salmon-city-57654 ^^
s
Hi @quaint-alarm-7893, I cannot find the related default disk on your node2. Do you remember what operations happened after the last time you saw replicas on node2? Alternatively, you could move the VMs away from node2 and then kick out -> rejoin node2; that would be cleaner. Or you can skip the default disk, since there are still other disks for the replicas.
q
i deleted all disks in both longhorn and harvester when i was having the issues. i likely deleted the default disk too. is it needed for things to operate properly, or is it not really necessary?
s
If you have other disks, the default disk can be ignored
👍 1