Rancher Users #harvester

# harvester

adamant-kite-43734

07/25/2024, 1:52 PM

This message was deleted.

salmon-city-57654

07/25/2024, 3:06 PM

Hi @quaint-alarm-7893, could you create a support bundle for us to check the current status of your cluster?

quaint-alarm-7893

07/25/2024, 3:10 PM

generating it now, sorry, it takes FOREVER! 🙂

quaint-alarm-7893

07/25/2024, 3:19 PM

@salmon-city-57654 sorry, i didn't mean to dm you. i got this error: "failed to generate supportbundle: timeout". could that be because of the node that's messed up?

quaint-alarm-7893

07/25/2024, 3:46 PM

@salmon-city-57654 support bundle and supportconfig collection for the problem node attached.

salmon-city-57654

07/25/2024, 4:02 PM

Thanks! I will check it

quaint-alarm-7893

07/26/2024, 4:21 PM

so fyi, ultimately the server came back online. we're talking like 4hrs later it came back into the cluster... 🤷‍♂️ the big issue i had was that with the node offline, all volumes that were primary to that node were not accessible, and the replicas disappeared from longhorn, which is a huge risk.

salmon-city-57654

07/29/2024, 10:21 AM

Hi @quaint-alarm-7893, sorry, I did not really understand your situation. Did you mean the node somehow came back but some replicas disappeared?

quaint-alarm-7893

07/29/2024, 1:21 PM

@salmon-city-57654 yes. basically the node was rebooted. when it came back up, it got stuck on an etcd error and wouldn't come online (harvester showed that the kubelet was not running and the rke2-server service was not running on the node). at that point, one of my vms had an odd issue: its volumes were attached to that node but had replicas on other nodes, and those replicas were dropped, leaving only one replica. because that replica was on the offline node, i couldn't get the vm up.
