# harvester
a
This message was deleted.
w
Yes it should evict all VMs from that node - see https://docs.harvesterhci.io/v1.0/host/host/#node-maintenance
Are all 3 nodes management nodes? Was your cluster installed as v1.0.2 or was it upgraded from v1.0.0 or v1.0.1?
w
Do you know of any way to debug this behavior?
• Yes, all three nodes are management nodes
• Cluster was a clean install of v1.0.2
FWIW: Cordoning has the same effect (no logs, `SchedulingDisabled` set on the node, no other events).
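For context, a rough way to see what cordoning actually changes on the node (the node name is a placeholder; the unschedulable flag and taint are standard Kubernetes behaviour, not Harvester-specific):
```
# "harvester-node-1" is a placeholder node name
kubectl cordon harvester-node-1
kubectl get node harvester-node-1 -o jsonpath='{.spec.unschedulable}'   # prints "true" once cordoned
kubectl describe node harvester-node-1 | grep -A2 Taints                # shows node.kubernetes.io/unschedulable:NoSchedule
```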
w
I reported an issue with the Disable Maintenance Mode option not doing anything, and one of the comments might help check the status: "under the hood, it adds/removes the following content to the node manifest
metadata:
  annotations:
    harvesterhci.io/maintain-status: running
spec:
  taints:
  - effect: NoSchedule
    key: kubevirt.io/drain
    value: scheduling
" from https://github.com/harvester/harvester/issues/1731#issuecomment-1015171613
Just found this nugget in https://github.com/harvester/harvester/issues/573#issuecomment-851969763: "Must have at least 4 nodes to support live migration", though I wonder if that's correct?
w
Ah nice, thank you! I'm gonna check for those changes to the node manifest in my next test.
"Must have at least 4 nodes to support live migration"
This would be very weird 🤔 I've never seen this statement before. When I kill a node, the VMs do get migrated, even in my 3-node setup, so I also wonder whether this is true or if it's maybe only a requirement for maintenance mode?
w
There was some confusion as to whether a minimum of 3 or 4 nodes is required to have more than one management node (it's 3), so perhaps the 4 nodes mentioned here is a holdover from that? If you can't get this working, I'd log an issue (I don't yet have 3 nodes installed).
w
According to the KubeVirt docs, the value should be `Draining` for it to take effect:
"Note: Prior to v0.34 the drain process with live migrations was detached from the `kubectl drain` itself and required in addition specifying a special taint on the nodes: `kubectl taint nodes foo kubevirt.io/drain=draining:NoSchedule`. This is no longer needed. The taint will still be respected if provided but is obsolete."
But since v1.0.2 is using KubeVirt v0.49.0, it can even be disregarded.
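If it helps, the KubeVirt version the cluster is actually running can be read from the KubeVirt CR status; the harvester-system namespace is an assumption based on a default Harvester install:
```
# namespace is an assumption for a stock Harvester install
kubectl get kubevirt -n harvester-system -o jsonpath='{.items[0].status.observedKubeVirtVersion}'
```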
@ancient-pilot-51731 maybe you have a second to expand on your comment and clear up the confusion? 🙂
a
Live migration now works even with only 2 nodes since v0.3.0 with the #798 fix. In your case, can you please simply refresh the browser to check whether the cert is updated? If so, the websocket is disconnected and cannot push the latest data to your browser.
If not, also feel free to submit a GH issue with a support bundle file; it will be easier for us to debug, thanks.
w
Thanks for your answer @ancient-pilot-51731!
"If so, the websocket is disconnected and cannot push the latest data to your browser."
Doesn't seem to be the case. I'm also watching the cluster in parallel to see if there's something happening there after I press the button. I can see the node being set to `SchedulingDisabled`, but VMs are not being migrated even after like 30 minutes. And the UI does not change from "Entering Maintenance Mode". Yeah, I think I'll open an issue with more debug info 👍
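For what it's worth, one way to see whether any migration is even being attempted during the drain is to watch KubeVirt's migration and VMI objects (standard KubeVirt short names):
```
# run these in separate terminals; both watch for changes
kubectl get vmim -A -w          # VirtualMachineInstanceMigration objects appear when a live migration is triggered
kubectl get vmi -A -o wide -w   # the VMIs show node changes as they migrate
```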
Hmm.. I just noticed that all created VMs, no matter the settings, do not have `evictionStrategy: LiveMigrate` set, which is why nothing happens and the node never finishes entering maintenance mode. I edited the VM objects manually and recreated the VMIs to get the strategy set, and then the maintenance mode works 🤔
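A minimal sketch of that manual edit, assuming a VM named "example-vm" in the default namespace (the field path spec.template.spec.evictionStrategy comes from the KubeVirt VirtualMachine API; the VMI has to be recreated afterwards because the template only applies to newly created instances):
```
# "example-vm" and the namespace are placeholders
kubectl patch vm example-vm -n default --type merge \
  -p '{"spec":{"template":{"spec":{"evictionStrategy":"LiveMigrate"}}}}'
# then recreate the VMI, e.g. with "virtctl restart example-vm", so the updated template is picked up
```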
As far as I understand, the only things that could prevent this from being set are:
• the number of nodes in the cluster (though this seems to be fixed as per @ancient-pilot-51731’s comment above - it's still mentioned in the docs)
• Scheduling Rules - `Run VM on any available node` is set on all of them
• Using a management network in bridge mode - `masquerade` is set for all of them
Now I'm lost again 😅
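To see at a glance which running VMIs are missing the strategy, a custom-columns query like this should work (column names are arbitrary; an empty EVICTION column means the field is not set):
```
kubectl get vmi -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,EVICTION:.spec.evictionStrategy
```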
g
@wide-garage-9465 did you create a GH issue?
And may I please also have a support bundle if possible?
w
Hi @great-bear-19718, thanks for chiming in. I'm still investigating... I guess the problem must be on my side. I just created a new VM and, sure enough, the eviction strategy is set properly there...
OK, I've figured out that VMs created from a template don't have it set. I'll create an issue now.
@ancient-pilot-51731 & @great-bear-19718 I noticed that it only happens for VMs created from a template, and while creating the issue I saw that there's no `evictionStrategy` set in the template YAML. I guess that may be because the node scheduling options are only set later when creating the VM, and those would influence the evictionStrategy. Issue is here: https://github.com/harvester/harvester/issues/2357
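And a quick, hedged way to confirm it on the templates themselves, assuming Harvester's VirtualMachineTemplateVersion CRD (harvesterhci.io) is what holds the embedded VM spec:
```
# if this prints 0, none of the stored template versions carry an evictionStrategy
kubectl get virtualmachinetemplateversions -A -o yaml | grep -c evictionStrategy
```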
g
Thanks.. I have already highlighted this internally.
w
@wide-garage-9465 good work!
w
Thanks for your support!