# rke2
c
Hey all, I have an airgapped 3-node rke2 cluster that was running 1.30.2+rke2r1, and I've started upgrading it to the latest 1.34 release. My understanding is that I have to step through 1.31, then 1.32, then 1.33, and finally 1.34. I downloaded the artifacts for the upgrade, drained the first node, copied the images into /var/lib/rancher/rke2/agent/images, and then started the rke2-server service. However, I mistakenly grabbed rke2-images.linux-arm64.tar.gz instead of rke2-images.linux-amd64.tar.gz, and I was getting an exec format error. I have since grabbed the correct architecture, placed it in the /var/lib/rancher/rke2/agent/images folder, and started the rke2-server service again, but it's still getting the same error. I think it's skipping some step that it believes it already completed with the arm64 image. Is there a way for me to resolve this?
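For anyone planning the same kind of stepwise upgrade, here is a minimal sketch of listing the minor versions to pass through. The function name `minor_upgrade_path` is made up for illustration; it only does the arithmetic on version strings and does not look up which patch releases actually exist.

```python
def minor_upgrade_path(current: str, target: str) -> list[str]:
    """List each minor version to step through, one at a time.

    Only the "major.minor" prefix is read, so "1.30.2+rke2r1" counts as 1.30.
    """
    cur_major, cur_minor = (int(p) for p in current.split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than the current version")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    print(minor_upgrade_path("1.30.2+rke2r1", "1.34"))
```

For the versions in this thread, that gives 1.31, 1.32, 1.33, 1.34, matching the path described above.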
c
What's the specific error? It'll import all the images from the tarballs on startup. Did you restart the service?
c
I did restart the service, yes: "containerd exited: fork/exec /var/lib/rancher/rke2/data/v1.31.13-rke2r1-21cbaff92f/bin/containerd: exec format error"
What I think happened is that when I started it with the wrong-architecture image tar.gz in place, it extracted the containerd binary from somewhere, and that ended up being the arm64 version, which led to that error.
I tried removing the files from /var/lib/rancher/rke2/agent/bin, hoping it would extract the correct architecture, but then it complains it can't find the containerd executable. I then tried copying /var/lib/rancher/rke2/agent/bin/containerd from one of the other nodes in the cluster to see if I could get past the error, but it exits after the log message "Running containerd -c /var/lib/rancher/rke2/agent/etc/containerd/config.toml". So I ran that command myself to see the actual error message, and it says that "version 3" is not allowed in the toml; it only supports up to "version 2". That makes sense, because containerd would have been updated going from 1.30.x to 1.31.x, and the newer one probably supports version 3.
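One way to confirm a suspicion like this is to read the architecture straight from the binary's ELF header. The sketch below is not an rke2 tool; `elf_arch` is a hypothetical helper that reads the `e_machine` field (offset 18 in the ELF header) and only knows the two architectures mentioned in this thread.

```python
import struct

# e_machine values from the ELF specification.
ELF_MACHINES = {0x3E: "amd64", 0xB7: "arm64"}


def elf_arch(path: str) -> str:
    """Return the target architecture recorded in an ELF binary's header."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError(f"{path} is not an ELF binary")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (e_machine={machine})")
```

Calling it on the containerd path from the error message would show whether the extracted binary is arm64 rather than amd64, assuming the diagnosis above is right.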
c
Just delete the rke2/data dir
c
ok
c
The bit that extracts from the rke2-runtime image doesn't actually check the image platform because it's not running it, just copying things out of it. And then it doesn't extract again if the files exist.
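To illustrate why deleting the data dir helps, here is a toy model of "extract only if the files are missing". This is not rke2's actual code; `extract_runtime` and `demo` are hypothetical names, and the "binary" is just a byte string standing in for whichever architecture was extracted.

```python
import shutil
import tempfile
from pathlib import Path


def extract_runtime(data_dir: Path, version: str, payload: bytes) -> bool:
    """Toy stand-in for the runtime extraction step.

    Copies the payload into <data_dir>/<version>/bin/containerd, but only if
    that bin directory does not exist yet. Returns True if it wrote files.
    """
    bin_dir = data_dir / version / "bin"
    if bin_dir.exists():
        return False  # already extracted once, so nothing is checked or replaced
    bin_dir.mkdir(parents=True)
    (bin_dir / "containerd").write_bytes(payload)
    return True


def demo() -> list[str]:
    """Replay the thread: wrong arch, right arch, then delete and retry."""
    data_dir = Path(tempfile.mkdtemp())
    version = "v1.31.13-rke2r1-21cbaff92f"
    binary = data_dir / version / "bin" / "containerd"
    seen = []

    extract_runtime(data_dir, version, b"arm64")  # first start, wrong tarball
    seen.append(binary.read_bytes().decode())

    extract_runtime(data_dir, version, b"amd64")  # restart with right tarball
    seen.append(binary.read_bytes().decode())     # skipped: still the old file

    shutil.rmtree(data_dir / version)             # the "delete the data dir" fix
    extract_runtime(data_dir, version, b"amd64")
    seen.append(binary.read_bytes().decode())

    shutil.rmtree(data_dir)
    return seen
```

In this model the second start keeps the stale arm64 file, and only removing the extracted directory lets the correct one land.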
c
ok that's looking much better
thank you 🙂