While launching `test-vm001`, the VM is stuck in t...
# harvester
m
While launching `test-vm001`, the VM is stuck in the "Starting" state and is not reported as running. The error shown in the events section suggests that the volume attachment to node `orion1` is timing out. Requesting help in debugging this issue; it could be related to Longhorn volume scheduling or node availability.
s
Can you describe your cluster? How many nodes are there, what are the node names, and which nodes have Longhorn disks? Also, could you check what the Longhorn UI shows?
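(A minimal CLI sketch for gathering that, assuming you have a kubeconfig pointed at the Harvester cluster:)
```
# List every node with its roles, version, and IP addresses
kubectl get nodes -o wide

# Quick node count
kubectl get nodes --no-headers | wc -l
```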
m
Here’s my cluster setup:
• Total nodes: 3
• Node names: `orion`, `orion0`, and `orion1`
• Longhorn disks: all three nodes have storage allocated for Longhorn, as seen in the screenshot.
Also, I’d like to know how to access the Longhorn UI and view/manage the Longhorn volumes. Could you please guide me on that?
Please take a look @sparse-vr-27407.
I want to know what the exact issue is.
t
You can access the Longhorn UI by enabling the Extension Developer feature;
the link is then available on the support page, which you reach by clicking "Support" in the lower left.
m
Thanks @thousands-advantage-10804
s
This "Pereferences" setting is for your user profile, click on the top right user icon and there will it be. It took me a while to find it and after i had to reload the UI (F5) to make it work. In the left menu on the bottom there is a blue text "Support" and after clicking, on the right side there will be 2 extra menu entries, one of them leading you to the Longhorn UI. A screenshot of the dashboard would be nice.
m
@sparse-vr-27407 I'm encountering problems creating VMs when all nodes in the 3-node Harvester cluster are active.
• When only the master node is online, VMs are created and start successfully.
• But when the full cluster is active, VM creation fails with volume attach errors.
• The VM pod gets scheduled, but the `virt-launcher` pod fails.
s
You say "the" master -> does that mean you have only one, despite 3 nodes, and no HA on your control plane? Please post a `kubectl get nodes` against your cluster. Still, a look at the Longhorn dashboard and the Longhorn node list would be my first step if this were my problem. I'd also check the volume that is in the error state under Longhorn UI / Volumes by clicking on it. From the events it looks to me like a PVC/PV binding issue, and I'd debug it as such, not focusing on the purpose of the volume. Alternatively, I'd change the VM's config to prefer scheduling to the master and see if that works out.
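(A rough sketch of that PVC/PV check from the CLI; the PVC name below is hypothetical and the namespaces assume a stock Harvester install, so adjust to your setup:)
```
# Find the PVCs backing the VM and confirm whether they are Bound
kubectl get pvc -A

# Inspect events on a specific PVC (hypothetical name shown)
kubectl -n default describe pvc test-vm001-disk-0

# Longhorn's view of the volumes: state, robustness, owning node
kubectl -n longhorn-system get volumes.longhorn.io
```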
m
Yeah, we can set the option to prefer scheduling to the master node, but that will not sort out our issue.
Attached is the volume section from the Longhorn UI.
s
Sure, but as you said everything is fine when only the master is running, so I would try running all 3 nodes but scheduling or migrating the VM to the master. If that works (3 nodes running, VM fine on the master, failing on the others), then it is a strong clue.
Your Longhorn dashboard shows that some volumes are degraded. A screenshot of the Nodes view in the same Longhorn UI would be interesting.
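(For the degraded volumes, a hedged CLI equivalent, again assuming the stock `longhorn-system` namespace, is to list volume robustness and the replicas behind each volume:)
```
# ROBUSTNESS should read "healthy"; "degraded" means replicas are missing
kubectl -n longhorn-system get volumes.longhorn.io

# See where each replica is (or is not) scheduled and whether it is running
kubectl -n longhorn-system get replicas.longhorn.io -o wide
```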
m
I'm facing a problem where the VM fails before even reaching the scheduling stage due to PVC-related issues.
• PVC attachment fails, leading to `AttachVolume.Attach failed` errors.
• This happens only when all nodes in the 3-node cluster are online.
• When I keep only the master node active, VM creation works fine and PVCs are attached successfully.
AttachVolume.Attach failed for volume "<pvc-id>" : rpc error: code = Aborted desc = volume <pvc-id> is not ready for workloads
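(One possible sketch for digging into that attach error from the CLI; `<pvc-id>` is the placeholder from the error above, and the Longhorn namespace is assumed to be `longhorn-system`:)
```
# Kubernetes-side view: which node the CSI layer is trying to attach the volume to
kubectl get volumeattachments | grep <pvc-id>

# Longhorn-side view: full state of the volume that is "not ready for workloads"
kubectl -n longhorn-system get volumes.longhorn.io <pvc-id> -o yaml

# Recent cluster events, including those on the failing virt-launcher pod
kubectl get events -A --sort-by=.lastTimestamp | tail -n 30
```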
s
'This happens only when all nodes in the 3-node cluster are online.' -> understood; my next step would be to check whether it is specific to a node or not. Configuring the VM to consider only the master node can be achieved by changing 'Node Scheduling' under the VM's settings. Do you want to try this test?
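(If you prefer the CLI, here is a hedged sketch of the same idea, pinning the VM to the master via a nodeSelector on the underlying KubeVirt VirtualMachine; the `default` namespace is an assumption and `orion` is the master name from this thread:)
```
# Stop the VM first, then pin its launcher pods to node "orion" (assumed master)
kubectl -n default patch vm test-vm001 --type merge -p \
  '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"orion"}}}}}'

# Verify the change landed
kubectl -n default get vm test-vm001 -o jsonpath='{.spec.template.spec.nodeSelector}'
```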
m
Yeah, can you provide the steps for doing that?
s
Actually, it is just a click under 'Node Scheduling', which can be found in the VM's settings. Can you post a screenshot of the "Nodes" tab in the same Longhorn UI?
m
Yes, I understand. That option under 'Node Scheduling' becomes configurable after the VM is created. What I'm trying to highlight is this: if we could enable this setting during VM creation, it would benefit HCI setups using Harvester by allowing better control over node selection for VM scheduling. Right now, I'm observing that once all three nodes in the cluster are up and joined, the master node avoids scheduling VMs on itself, and the scheduler tries to place the VM on the other worker nodes. It is during this placement, particularly at the volume attachment stage, that the issue arises, preventing the VM from being created.
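(One hedged way to confirm the "master avoids scheduling VMs on itself" observation is to check that node's taints and labels; `orion` is the node name from this thread, and interpretation of the output is left to the cluster admin:)
```
# Any control-plane taints listed here would explain why the scheduler avoids this node
kubectl describe node orion | grep -i -A 5 taints

# Labels are what a nodeSelector would match against
kubectl get node orion --show-labels
```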
s
To keep debugging that volume issue, can you post a screenshot of the "Nodes" tab in the same Longhorn UI?
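(If a screenshot is a hassle, roughly the same information can be read from the Longhorn node CRs, assuming the `longhorn-system` namespace:)
```
# Per-node readiness and schedulability as Longhorn sees it
kubectl -n longhorn-system get nodes.longhorn.io

# Disk details (paths, allowScheduling, reserved/available storage) for one node
kubectl -n longhorn-system get nodes.longhorn.io orion1 -o yaml
```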