Hi Team, I've deployed a 3-node Harvester cluster ...
# harvester
m
Hi Team, I've deployed a 3-node Harvester cluster on my local server. I’d like to know if there’s a recommended method to take a snapshot or backup of the node servers, so that I can restore the server data easily in case I need to redeploy Orion on the same servers.
b
They're not really meant for backup/snapshots like VMs are.
The basic thing is that you have an etcd backup, and all of your VMs backed up to an external source (S3 or NFS).
I don't know what Orion is.
w
The only way I can think of doing this is to shut all 3 nodes down, then dd each disk. If you need to restore, you'd dd each disk's image back to the respective server.
t
Harvester is meant for bare metal. Running it as a VM is not recommended.
b
And if you have a single host, there's no real benefit to running 3 nodes vs 1
t
thanks I was going to add that too!
m
Hi Team,
I had a 3-node Harvester cluster that encountered some network and backend issues. As a result, the nodes didn't return to a "Ready" state. I ended up redeploying the cluster. Going forward, is there any recommended method or process—such as using a backup or configuration file—that I can follow to recover the cluster in such situations without needing a full redeployment?
b
Yeah etcd
m
Could you please elaborate on how I can take an etcd backup and use it to restore a Harvester node, so I can bring the cluster back from a faulty state to a healthy state? I’m looking for a clear understanding of the recovery process using etcd, so I can avoid a full redeployment in the future.
b
One way is to use the rke2 CLI tool.
For example, for a backup to an S3 bucket:
/opt/rke2/bin/rke2 etcd-snapshot save --etcd-s3 --etcd-s3-bucket=<YourBucket> --etcd-s3-access-key=<YourAccessKey> --etcd-s3-secret-key=<YourSecret> --etcd-s3-endpoint=s3.endpoint.example.com --name=clustername-etcd-backup
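If you want to script that, here is a minimal dry-run sketch. It only assembles and prints the command so you can review it before running anything. The function name `snapshot_cmd`, the `RKE2_BIN` variable, and the example bucket name are hypothetical; the flags are the ones from the command above. The access and secret keys are left out of the printed line on purpose and would have to be passed as shown above.

```shell
#!/bin/sh
# Hypothetical helper: builds the snapshot command as text, runs nothing.
# RKE2_BIN defaults to the binary path used in the command above.
RKE2_BIN="${RKE2_BIN:-/opt/rke2/bin/rke2}"

snapshot_cmd() {
    # $1 = snapshot name, $2 = S3 bucket, $3 = S3 endpoint (host only)
    printf '%s etcd-snapshot save --etcd-s3 --etcd-s3-bucket=%s --etcd-s3-endpoint=%s --name=%s' \
        "$RKE2_BIN" "$2" "$3" "$1"
}

# Print a dated snapshot command for review (placeholder bucket and endpoint).
snapshot_cmd "clustername-etcd-backup-$(date +%Y%m%d)" "my-bucket" "s3.endpoint.example.com"
echo
```

Adding the date to the name is just one way to keep several snapshots from looking identical in the bucket.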
m
Thanks, now I understand how to back up etcd. But I have a doubt: which command should I use to restore the etcd file and make Harvester healthy again?
b
There are lots of things that can make a Harvester cluster unhealthy. Many of them revolve around networking and storage. Etcd only helps with cluster state: where things are running, who is running what, and certs and secrets.
Also, if you don't have a healthy backup, restoring the etcd snapshot isn't going to fix anything.
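On the restore command itself: upstream RKE2 documents a cluster-reset restore from a snapshot file. The sketch below only prints that sequence and executes nothing. Whether this is the right procedure for a given Harvester version is an assumption, so check the Harvester docs before trying it on a real cluster. The function name `restore_steps` and the `SNAPSHOT_PATH` placeholder are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the upstream RKE2 restore sequence, runs nothing.
# SNAPSHOT_PATH is a placeholder; point it at a real snapshot file.
SNAPSHOT_PATH="${SNAPSHOT_PATH:-<path-to-snapshot-file>}"

restore_steps() {
    # $1 = path to the snapshot file on the node doing the restore
    echo "systemctl stop rke2-server    # on every server node"
    echo "rke2 server --cluster-reset --cluster-reset-restore-path=$1    # on one server node"
    echo "systemctl start rke2-server   # on that same node, once the reset finishes"
}

restore_steps "$SNAPSHOT_PATH"
```

This only brings back what etcd holds, so it will not repair networking or storage problems.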