# k3s
g
If you do that option you mentioned, you’ll end up with the cluster losing quorum and having to do a cluster-reset. I actually recommend going straight to a cluster-reset, but restoring it to the new machine. Essentially, your steps would be this:
1. Take a snapshot. I generally use S3 because that makes it easy, but you can take it locally if you need to and then scp the file over. (A rough sketch of the snapshot command is after this list.)
2. Make sure you know your join token. If you don’t know it, grab the value from `/var/lib/rancher/k3s/server/node-token` on the master node. Then stop all nodes in the cluster.
3. On your new machine, restore the snapshot. For example, if using S3, your commands might look something like this:
```
$ curl -sfL https://get.k3s.io | sudo INSTALL_K3S_VERSION=$VERSION INSTALL_K3S_SKIP_ENABLE=true sh -

$ sudo k3s server --cluster-reset --etcd-s3 \
   --cluster-reset-restore-path=SNAPSHOT_NAME \
   --etcd-s3-bucket=YOUR_BUCKET_NAME --etcd-s3-folder=OPTIONAL_FOLDER --etcd-s3-region=YOUR_REGION \
   --etcd-s3-access-key=YOURKEY --etcd-s3-secret-key="YOURSECRET" \
   --token=TOKEN
```
4. Ensure you have whatever settings and volumes and whatnot present on your new machine, and make sure your k3s config.yaml is set up as you’d like it. Then you can start the k3s service, and after a little while (5-10 minutes) you should have your cluster running just as it was. This new node will show Ready and the other nodes will show NotReady. Then you should be able to do any additional migrations you’d like within the cluster and terminate the old stuff!
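For steps 1 and 2, here’s a rough sketch of what that might look like on the existing master. The exact `--s3` flag names can vary a bit between k3s versions, so double-check `k3s etcd-snapshot --help`, and the bucket/key values below are just placeholders:

```
# Step 1: take an etcd snapshot and push it straight to S3
$ sudo k3s etcd-snapshot save --s3 \
   --s3-bucket=YOUR_BUCKET_NAME --s3-folder=OPTIONAL_FOLDER --s3-region=YOUR_REGION \
   --s3-access-key=YOURKEY --s3-secret-key="YOURSECRET"

# Step 2: note the join token you will pass as --token during the restore
$ sudo cat /var/lib/rancher/k3s/server/node-token
```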
I do always recommend doing this in isolation in a test cluster first just so you understand the full steps and workflow before going straight to prod of course!
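And for step 4, since the install above used `INSTALL_K3S_SKIP_ENABLE=true`, bringing the restored server up and watching it settle might look something like this (assuming the systemd install from the get.k3s.io script):

```
# Start the restored server and enable it on boot
$ sudo systemctl enable --now k3s

# The restored node should eventually show Ready; the old nodes will show NotReady
$ sudo k3s kubectl get nodes -w
```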
p
@gray-lawyer-73831 thanks for the help! I forgot to mention that we are using Longhorn to manage volumes; would this setup allow us to transfer the volumes to the new nodes during step 4, with the other nodes in a `NotReady` state? I have another question: since this is a somewhat old cluster, it started out with k3s 1.19 in sqlite mode and got updated to 1.24. From what I understand it should still be using sqlite; would you suggest converting to etcd first and then migrating?
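For context, a quick way to confirm which datastore the server is actually using (assuming the default k3s data dir) is to check whether the sqlite file or the embedded etcd directory exists:

```
# sqlite datastore lives here by default
$ ls -l /var/lib/rancher/k3s/server/db/state.db

# an embedded etcd datastore would live here instead
$ ls -ld /var/lib/rancher/k3s/server/db/etcd
```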
g
I don’t know enough about longhorn to be able to answer your first question, I’m sorry. But yeah actually you can’t do any of the restore stuff unless you’re using etcd!
p
Longhorn uses the local storage of the nodes and exposes it as block storage, so volumes are managed locally and made available by Longhorn pods. To transfer volumes I would then need the other nodes "online" with at least the Longhorn pods available, so replicas can be moved to the new machine. From what I understand the cluster-reset would cut off the old machines, making this impossible; is that correct?
At this point, would you suggest I just start from a new cluster and restore from Velero, with the volumes restored from backup? It looks like I would need to do a similar thing anyway, but with added steps and uncertainty. In such a case, would you still suggest using etcd, or just sticking with sqlite? I see that etcd would use a bit more RAM; would performance otherwise be better/similar?
g
Hey sorry for the delay! I generally recommend using etcd because it makes things easier in the future if you ever want to add more server nodes or do any snapshot/cluster restores, since you can’t do that with sqlite in its current state. As for the restoration, using Velero, or waiting until next month for Longhorn’s new backup/restore feature, are probably good ideas! I do think the new cluster is the easiest/safest at this point for what you want 🙂
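If you do go the etcd route on the existing server, my understanding is that recent k3s releases will migrate the sqlite data to embedded etcd when restarted with cluster-init enabled, but definitely confirm that against the docs for your exact version before trying it in prod. A minimal sketch:

```
# Enable embedded etcd on a single sqlite-backed server
$ echo "cluster-init: true" | sudo tee -a /etc/rancher/k3s/config.yaml
$ sudo systemctl restart k3s

# Once it's back up, the etcd data dir should exist
$ sudo ls /var/lib/rancher/k3s/server/db/etcd
```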
p
No problem. Thank you for the help! This new Longhorn feature looks fantastic! If the people higher up give me more time, I think I might just do that.