# rke2
i
did you check the output from systemd process? You might not have the images pre-loaded for containerd to import.
m
I have defined a registries.yaml which should rewrite the images correctly. In the docs it says preload the archives into the images folder OR provide a registries config
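For reference, a minimal sketch of what such a mirror rewrite might look like. The hostname `registry.example.internal` is a placeholder, not a value from this thread, and `write_registries_yaml` is just a hypothetical helper name. RKE2 reads the file from `/etc/rancher/rke2/registries.yaml`; a `configs:` section with TLS or auth settings can be added if the registry needs it.

```shell
#!/bin/sh
# Sketch only: writes a registries.yaml that redirects docker.io pulls
# to a private mirror. The registry hostname is an assumed placeholder.
write_registries_yaml() {
    target="${1:-/etc/rancher/rke2/registries.yaml}"
    registry="${2:-registry.example.internal}"
    cat > "$target" <<EOF
mirrors:
  docker.io:
    endpoint:
      - "https://${registry}"
EOF
}

# Example (as root, then restart rke2-server): write_registries_yaml
```

Passing the target path as the first argument makes it easy to write the file somewhere else first and review it before copying it into place.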
i
I have not used the registry option myself, so I cannot comment on how it actually downloads the images. Can you try putting the images in the folder to figure out whether that is what is missing? If that works, you can investigate the registry; perhaps it cannot download the images for some reason.
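A rough sketch of the "put the images in the folder" step, assuming the default RKE2 data directory, where the images folder is `/var/lib/rancher/rke2/agent/images`. `stage_images` is a hypothetical helper name and the archive path in the example is a placeholder.

```shell
#!/bin/sh
# Sketch only: copy an image archive into the folder RKE2 imports from
# at startup. The default destination assumes the standard data dir.
stage_images() {
    src="$1"
    dest="${2:-/var/lib/rancher/rke2/agent/images}"
    [ -f "$src" ] || { echo "no such archive: $src" >&2; return 1; }
    mkdir -p "$dest" && cp "$src" "$dest/"
}

# Example (as root): stage_images /tmp/rke2-images.linux-amd64.tar.zst
```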
m
https://gist.github.com/tboerger/2522f21ea61acc4617365905d144dc49 I also don't fully understand the errors... I think it should start containerd, etcd and so on, but it just prints errors saying etcd isn't starting and the server doesn't get ready, with messages like
Waiting to retrieve agent configuration; server is not ready: yaml: line 11: did not find expected key
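The `yaml: line 11` part suggests that one of the YAML files RKE2 reads fails to parse. The message does not say which file, so checking both `/etc/rancher/rke2/config.yaml` and `/etc/rancher/rke2/registries.yaml` is an assumption on my side. A small sketch to print the lines around the reported number (`show_lines_around` is a hypothetical helper):

```shell
#!/bin/sh
# Sketch only: print a file's lines around a given line number, with
# line numbers, to spot indentation or quoting mistakes.
show_lines_around() {
    file="$1"
    line="$2"
    ctx="${3:-2}"   # number of context lines before and after
    awk -v n="$line" -v c="$ctx" \
        'NR >= n - c && NR <= n + c { printf "%4d: %s\n", NR, $0 }' "$file"
}

# Example: show_lines_around /etc/rancher/rke2/registries.yaml 11
```

Typical causes of "did not find expected key" are inconsistent indentation or a tab character in the reported area.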
i
I assume this is a fresh installed cluster
m
Yes it's fresh installed. I can't easily get the images now, I need to talk to the engineer to add it to the mirror since I can't download the archives on my own.
i
In my experience, it is normal that you get lots of errors at startup. Things start and keep failing because of unmet dependencies until everything is up and running.
Hard to troubleshoot
Check if the containerd backend is running and if it has images loaded already
m
I can't find any containerd process and there are also no binaries (crictl, kubectl etc) created so far.
i
For example with the following command:
# /var/lib/rancher/rke2/bin/crictl images
RKE2, when started, installs them in /var/lib/rancher/rke2/bin
# ls /var/lib/rancher/rke2/bin
containerd  containerd-shim  containerd-shim-runc-v1  containerd-shim-runc-v2  crictl  ctr  kubectl  kubelet  runc
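To compare a node against that listing, a quick sketch that reports which of those binaries are missing. The names are taken from the `ls` output above; `missing_binaries` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: print the names of expected RKE2 binaries that are not
# present (or not executable) in the given directory.
missing_binaries() {
    dir="${1:-/var/lib/rancher/rke2/bin}"
    for name in containerd containerd-shim-runc-v2 crictl ctr kubectl kubelet runc; do
        [ -x "$dir/$name" ] || echo "$name"
    done
}

# Example: missing_binaries
```

No output means every listed binary was found. If everything is listed, the rke2 process probably never got as far as unpacking them.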
m
But where should these come from? Does RKE2 download them from somewhere? Or are they really embedded into the rke2 binary?
i
In my case bin is a symlink to /var/lib/rancher/rke2/data/v1.29.3-rke2r1-106ea6e3639d/bin
I believe those are installed when you run the installer.sh
But I might be wrong and they are actually installed when the rke2 service is started. The binary contains everything
m
You are talking about https://get.rke2.io? This doesn't install kubectl etc
i
It does install everything needed, including kubectl
Try to run it in a stand-alone VM just for testing
When I run the installer, I find the binary in /usr/local/bin
And the startup scripts in /usr/local/lib/systemd/system/. Then I install the startup script and run it - it will create the /var/lib/rancher/rke2/bin for me (among other things)
m
The
curl -sfL https://get.rke2.io | sh -
script just installs rke2, rke2-killall.sh and rke2-uninstall.sh, nothing else
Oh yeah and the systemd service
i
There are other files in /usr/local as well
But the magic happens when rke2 binary is run
m
Yeah but looks like it's not doing it for me on the airgapped environment 😞
i
It needs the images to proceed, so you have to preinstall them or have a local repository which can download them from the internet
m
I have a local registry where I can get them
But to fetch them I need containerd 😄
i
I have installed it in one air-gapped environment so far, so not much experience, but it worked
containerd is part of the rke2 binary. Maybe something prevented it from installing the binaries
m
I have never installed Kubernetes on air-gapped environments before
i
I would suggest trying it on a local VM (on your PC) running Linux without a network connection. When that works, you can figure out why it does not work in the more restricted environment
It could be some other security tool causing trouble (selinux, apparmor, virus scanning, whatever)
You need the rke2-images.linux-amd64.tar.zst file
Perhaps the problem is cilium. I have used the default (whatever it is). In that case you need a different file: rke2-images-cilium.linux-amd64.tar.zst
I guess, without networking CNI, the etcd won't start
m
I guess the images tarball doesn't matter as long as there is not any containerd process running at all?
i
What do you see with
systemctl status rke2-server.service
?
m
# systemctl status rke2-server.service
● rke2-server.service - Rancher Kubernetes Engine v2 (server)
     Loaded: loaded (/etc/systemd/system/rke2-server.service; enabled; vendor preset: enabled)
     Active: activating (start) since Mon 2024-04-29 10:19:03 UTC; 39s ago
       Docs: https://github.com/rancher/rke2#readme
    Process: 80027 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
   Main PID: 80029 (rke2)
      Tasks: 10
     Memory: 115.2M
        CPU: 45.389s
     CGroup: /system.slice/rke2-server.service
             └─80029 "/usr/local/bin/rke2 server"
+ logs
i
I see lots of containerd-shim-runc-v2 (the pods), but also containerd, kubelet
m
I see mostly the errors related to not running etcd
i
Ok, this looks like only the rke2 process is started, and it did not start containerd
You have many options configured in config. I have only write-kubeconfig-mode and tls-san
and token and server
I do not see a server line there (token is optional)
m
The initial master doesn't get a server url?
i
something like "server: https://fqdn:9345"
It does, because it needs to talk to itself
m
At least that's what I have read
i
OK, maybe I am wrong
I just tell you what I see in my config 😉
Perhaps you need to add your private CA to the node configuration. I see some certificate errors in your log
m
Which private CA? 😄
There is only the generated CA from RKE2
And the Registry CA is part of the ca-certificates.
i
Ah, sorry, it is the internal CA from Kubernetes which is causing this error. This is automatically generated.
Do you have a containerd.sock file here: # ls -l /run/k3s/containerd/
If not, containerd is not starting.
It's rke2 not k3s, but they did not change the name.
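As a scriptable version of that check, a minimal sketch that tests for the socket. The default path is the one mentioned above; `containerd_socket_present` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed if the containerd socket exists and really is
# a socket (not a regular file or a missing path).
containerd_socket_present() {
    sock="${1:-/run/k3s/containerd/containerd.sock}"
    [ -S "$sock" ]
}

# Example:
#   containerd_socket_present && echo "containerd is up" || echo "no socket"
```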
What Linux are you installing it in?
m
Just tried to add server and token, then it tries to fetch the certs from the server URL. So it really has to be undefined for the initial master. Everything on Ubuntu 22.04.
Not even /run/k3s exists.
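To summarize the server/token observation as a sketch: the first server gets a config without `server:`, and only nodes joining later get `server:` and `token:`. The `tls-san` hostname, the example URL and the token value are placeholders, and `write_server_config` is a hypothetical helper, not something from the RKE2 tooling.

```shell
#!/bin/sh
# Sketch only: write a minimal RKE2 config.yaml. With no server URL the
# result is meant for the initial server; with a URL and token it is
# meant for a node joining an existing cluster.
write_server_config() {
    target="$1"
    server_url="${2:-}"
    token="${3:-}"
    {
        echo 'write-kubeconfig-mode: "0644"'
        echo 'tls-san:'
        echo '  - rke2.example.internal'   # placeholder hostname
        if [ -n "$server_url" ]; then
            echo "server: ${server_url}"
            echo "token: ${token}"
        fi
    } > "$target"
}

# Initial server:  write_server_config /etc/rancher/rke2/config.yaml
# Joining server:  write_server_config /etc/rancher/rke2/config.yaml \
#                      https://rke2.example.internal:9345 my-shared-token
```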