# general
i
Urgent need for help, production cluster down. I was trying to update a Rancher downstream cluster from the GUI, and somehow one of the etcd nodes got stuck draining (after the first two nodes were upgraded without issues). In my attempt to get it back up I restored a snapshot from the GUI, but now nothing is running 😞 and I don't know what to do.
a
Hi @important-architect-36609, I'm certain this is not the most efficient venue for P1 production issues. That said, if you are still having trouble getting your cluster back online, you will need to provide more information in order for anyone to be able to help. One question I have is, did you take backups just prior to the upgrade; and did you take a backup just before your attempted restore?
i
@adamant-traffic-5372 nope, it was not.. but desperate times call for desperate measures 🙂 We had to "restore" from our IaC. No production-critical data was lost, only so, so many hours of work (and sleep). Looking on the bright side, we got to test our DR plan and NIS2 compliance.
a
Sounds like it was a nightmare. I’m glad at least a good DR exercise came out of it. Hopefully also some lessons learned.