# k3d
g
(To be clear, I did simply give it a try, but with mixed results. Depending on the K8s version upgraded to, the Rancher-initiated upgrade sometimes ended in unresponsive nodes, which then sometimes worked again after restarting the node-container.)
w
Hey 👋 To be honest: I've never tried it, so I have no idea. There's definitely nothing built into k3d to specifically support this (if it could do anything there).
If you can figure out specific problems there that may be due to k3d handling the cluster (i.e. having containerized nodes), I'd be happy to look into this.
r
@gorgeous-pizza-36569 - My guess is that if you tried having Rancher upgrade a K3D-hosted K3S, it might work. But since K3D will restart the old version of the K3S containers if you do something like reboot, it wouldn't necessarily be a stable upgrade, and you might get weird behavior with some of your mounted-in volumes and other persistent data. So I'd test that with a cluster you don't care about before trying it with one you do care about.
w
@rough-farmer-49135 I tried this:
1. Create a k3d cluster (v1.26.5 by default)
2. Download the K3s binary v1.27.3 to my host
3. `docker cp` it into the server container as `/bin/k3s` and `chmod a+x /bin/k3s` inside the container
4. `docker restart`
5. Profit.

That actually worked flawlessly.
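Spelled out as shell commands, those steps look roughly like this. The cluster name `mycluster` (and therefore the container name, following k3d's `k3d-<cluster>-server-0` naming scheme) and the pre-downloaded `./k3s` binary are assumptions, not part of the original message:

```bash
# Copy the newer K3s binary into the (single) server node container,
# make it executable there, and restart the container so it picks it up.
docker cp ./k3s k3d-mycluster-server-0:/bin/k3s
docker exec k3d-mycluster-server-0 chmod a+x /bin/k3s
docker restart k3d-mycluster-server-0
```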
> So I'd test that with a cluster you don't care about before trying it with one you do care about.
k3d is not made to be running stuff you really care about (i.e. production)
Update: also survived a `service docker restart`, so I assume it should also survive a reboot.
r
The docker service restart won't tell you as much as a stop/start or reboot. Default on Docker daemons for several years now is a live reload so it keeps containers running when you restart the daemon, so unless you turned that off in /etc/docker/daemon.json you might want to try a stop/start or reboot the host system to be more certain.
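For reference, the behaviour described here is Docker's live-restore option; a quick way to check how your own daemon is configured (the `grep` pattern matches the human-readable `docker info` output):

```bash
# docker info prints a "Live Restore Enabled" line reflecting the
# "live-restore" key in /etc/docker/daemon.json (e.g. {"live-restore": true}).
docker info | grep -i 'live restore'
```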
> k3d is not made to be running stuff you really care about (i.e. production)
Oh how I wish the people who made decisions listened to whether or not things are intended for running things they care about...
w
Oh really? I thought it'd restart the containers as well (at least the status output suggests that). `docker stop`/`docker start` also worked fine. Didn't try a reboot yet.
> Oh how I wish the people who made decisions listened to whether or not things are intended for running things they care about...
🥲
r
In a previous job we were using the mesos-based DC/OS with Docker, and we'd run into problems where restarting a docker daemon would also result in Marathon thinking it needed to restart all those containers, so we ended up with a bunch of zombie services. That was Docker 18 or 19 CE, as I recall, and we had to explicitly turn off the option that keeps containers alive, since that was the default. It might've changed since then and now kill them, but from what I've seen my containers have survived any Docker daemon restarts I've done.
While `docker ps` might just register the daemon time, the logs might indicate if it was alive for longer (either by not showing the beginning, or by going back past the start of the docker service)? Though the default may have changed too.
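A more direct check than reading logs is to compare the container's recorded start time with the daemon's last start (the container name is reused from the earlier sketch, so an assumption):

```bash
# If StartedAt predates the daemon's last ActiveEnterTimestamp, the container
# survived the daemon restart rather than being restarted by it.
docker inspect --format '{{.State.StartedAt}}' k3d-mycluster-server-0
systemctl show docker --property=ActiveEnterTimestamp
```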
g
So far I’ve had two issues, though I must mention that I’m using Rancher’s cluster upgrade mechanism and also … well, it’s a 1.20.x cluster. Yeah. The first issue is that after Rancher has replaced the k3s binaries, k3s is not restarted and the k3s service/process is not running. After a container restart it correctly starts with the new binary (whether via Docker or k3d, either works). So restarts of existing nodes work fine. However, after again upgrading from 1.21 to 1.22, the master/“server” is stuck in a Restarting loop (the container itself), mentioning the entrypoint script (`No help topic for '/bin/k3d-entrypoint.sh'`). My first guess is that just repeatedly replacing the k3s binary will sooner or later lead to issues, due to the rest of the container / its config (entrypoint and so on) being outdated wrt the k3s binary.
At this point I’m of course questioning not how I can do this, but why I’m doing it, and my sanity. Pretty sure it’s better to tear it down.
Through some trial and error I’ve narrowed down the entrypoint issue to v1.22, all others (1.21, 1.23, 1.24) worked (in that regard; in all cases I needed to manually restart the node container). Obviously I don’t expect a 1.22-specific issue to be fixed/investigated — just a point of curiosity.
The “manual node restart required” thing otoh might be worth looking into, perhaps I’ll open an issue. Though the existing issue on node upgrades is probably the better one to pursue.
In case anyone is curious, the in-place upgrades work mostly fine (see caveats above). One more thing: you may have to edit the k3s CLI flags in case a newer version no longer supports one; doing so is possible by editing the Docker container config JSON (this too is definitely hacky, but it does work just fine).
w
Editing the container config.json is pretty dope. Thanks for the input!
g
note that Docker also (over)writes that file on stopping (daemon stop? container stop? – not sure), so I had to fiddle with the order to get that change to apply
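For anyone following along, a rough sketch of that flag-editing procedure with the ordering caveat applied. The file path assumes a stock Docker install on Linux, and the container name and ID are placeholders:

```bash
# Hacky, as noted above. Stop the container and then the daemon, so Docker
# doesn't rewrite the config file and clobber the edit.
docker stop k3d-mycluster-server-0
systemctl stop docker
# The k3s CLI flags typically appear in the "Args" field of this file.
vi /var/lib/docker/containers/<container-id>/config.v2.json
systemctl start docker
docker start k3d-mycluster-server-0
```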