# harvester
a
This message was deleted.
w
Yes it should evict all VMs from that node - see https://docs.harvesterhci.io/v1.0/host/host/#node-maintenance
Are all 3 nodes management nodes? Was your cluster installed as v1.0.2 or was it upgraded from v1.0.0 or v1.0.1?
w
Do you know of any way to debug this behavior?
• Yes, all three nodes are management nodes
• Cluster was a clean install of v1.0.2
FWIW: Cordoning has the same effect (no logs, `SchedulingDisabled` set on the node, no other events).
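For context, a rough way to see what cordoning actually changes on the node (the node name is a placeholder; the unschedulable flag and taint are standard Kubernetes behaviour, not Harvester-specific):
```
# "harvester-node-1" is a placeholder node name
kubectl cordon harvester-node-1
kubectl get node harvester-node-1 -o jsonpath='{.spec.unschedulable}'   # prints "true" once cordoned
kubectl describe node harvester-node-1 | grep -A2 Taints                # shows node.kubernetes.io/unschedulable:NoSchedule
```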
w
I reported an issue with the Disable Maintenance Mode option not doing anything, and one of the comments might help check the status: "under the hood, it adds/removes the following content to the node manifest
metadata:
  annotations:
    harvesterhci.io/maintain-status: running
spec:
  taints:
  - effect: NoSchedule
    key: kubevirt.io/drain
    value: scheduling
" from https://github.com/harvester/harvester/issues/1731#issuecomment-1015171613
Just found this nugget in https://github.com/harvester/harvester/issues/573#issuecomment-851969763: "Must have at least 4 nodes to support live migration", though I wonder if that's correct?
w
Ah nice, thank you! I'm gonna check for those changes to the node manifest in my next test.
"Must have at least 4 nodes to support live migration"
This would be very weird 🤔 I've never seen this statement before. When I kill a node, the VMs do get migrated, even in my 3-node setup, so I also wonder whether this is true or if it's maybe only a requirement for maintenance mode?
w
There was some confusion as to whether a minimum of 3 or 4 nodes is required to have more than one management node (it's 3), so perhaps the 4 nodes mentioned here is a holdover from that? If you can't get this working, I'd log an issue (I don't yet have 3 nodes installed).
w
According to the KubeVirt docs, the value should be `Draining` for it to take effect:
"Note: Prior to v0.34 the drain process with live migrations was detached from the `kubectl drain` itself and required in addition specifying a special taint on the nodes: `kubectl taint nodes foo kubevirt.io/drain=draining:NoSchedule`. This is no longer needed. The taint will still be respected if provided but is obsolete."
But since v1.0.2 is using KubeVirt v0.49.0, it can even be disregarded.
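If it helps, the KubeVirt version the cluster is actually running can be read from the KubeVirt CR status; the harvester-system namespace is an assumption based on a default Harvester install:
```
# namespace is an assumption for a stock Harvester install
kubectl get kubevirt -n harvester-system -o jsonpath='{.items[0].status.observedKubeVirtVersion}'
```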
@ancient-pilot-51731 maybe you have a second to expand on your comment and clear up the confusion? 🙂
a
Live migration now works even with only 2 nodes since v0.3.0 with the #798 fix. In your case, can you please simply refresh the browser to check whether the cert is updated? If so, the websocket is disconnected and cannot push the latest data to your browser.
If not, also feel free to submit a GH issue with a support bundle file; it will be easier for us to debug, thanks.
w
Thanks for your answer @ancient-pilot-51731!
"If so, the websocket is disconnected and cannot push the latest data to your browser."
Doesn't seem to be the case. I'm also watching the cluster in parallel to see if there's something happening there after I press the button. I can see the node being set to `SchedulingDisabled`, but VMs are not being migrated even after like 30 minutes. And the UI does not change from "Entering Maintenance Mode". Yeah, I think I'll open an issue with more debug info 👍
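For what it's worth, one way to see whether any migration is even being attempted during the drain is to watch KubeVirt's migration and VMI objects (standard KubeVirt short names):
```
# run these in separate terminals; both watch for changes
kubectl get vmim -A -w          # VirtualMachineInstanceMigration objects appear when a live migration is triggered
kubectl get vmi -A -o wide -w   # the VMIs show node changes as they migrate
```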
Hmm.. I just noticed that all created VMs, no matter the settings, do not have `evictionStrategy: LiveMigrate` set, which is why nothing happens and the node never finishes entering maintenance mode. I edited the VM objects manually and recreated the VMIs to get the strategy set, and then the maintenance mode works 🤔
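A minimal sketch of that manual edit, assuming a VM named "example-vm" in the default namespace (the field path spec.template.spec.evictionStrategy comes from the KubeVirt VirtualMachine API; the VMI has to be recreated afterwards because the template only applies to newly created instances):
```
# "example-vm" and the namespace are placeholders
kubectl patch vm example-vm -n default --type merge \
  -p '{"spec":{"template":{"spec":{"evictionStrategy":"LiveMigrate"}}}}'
# then recreate the VMI, e.g. with "virtctl restart example-vm", so the updated template is picked up
```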
As far as I understand, the only things that could prevent this from being set are:
• the number of nodes in the cluster (though this seems to be fixed as per @ancient-pilot-51731’s comment above - it's still mentioned in the docs)
• Scheduling Rules - `Run VM on any available node` is set on all of them
• Using a management network in bridge mode - `masquerade` is set for all of them
Now I'm lost again 😅
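To see at a glance which running VMIs are missing the strategy, a custom-columns query like this should work (column names are arbitrary; an empty EVICTION column means the field is not set):
```
kubectl get vmi -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,EVICTION:.spec.evictionStrategy
```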
g
@wide-garage-9465 did you create a GH issue?
And may I please also have a support bundle if possible?
w
Hi @great-bear-19718, thanks for chiming in. I'm still investigating... I guess the problem must be on my side. I just created a new VM and, sure enough, the eviction strategy is set properly there...
OK, I've figured out that VMs created from a template don't have it set. I'll create an issue now.
@ancient-pilot-51731 & @great-bear-19718 I noticed that it only happens for VMs created from a template, and while creating the issue I saw that there's no `evictionStrategy` set in the template YAML. I guess that may be because the node scheduling options are only set later when creating the VM, and those would influence the evictionStrategy. Issue is here: https://github.com/harvester/harvester/issues/2357
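And a quick, hedged way to confirm it on the templates themselves, assuming Harvester's VirtualMachineTemplateVersion CRD (harvesterhci.io) is what holds the embedded VM spec:
```
# if this prints 0, none of the stored template versions carry an evictionStrategy
kubectl get virtualmachinetemplateversions -A -o yaml | grep -c evictionStrategy
```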
g
Thanks.. I have already highlighted this internally.
w
@wide-garage-9465 good work!
w
Thanks for your support!