I am trying to install rke2 on rhel9 clone. I have...
# general
a
I am trying to install rke2 on a rhel9 clone. I have tried with 1.31 and with 1.30. I have also tried different versions of klipper-helm, using something like
--helm-job-image=rancher/klipper-helm:v0.9.2-build20240828
I run
rke2 server \
 --debug \
 --selinux
but canal and coredns do not start: https://paste.opensuse.org/pastes/c20ad01078ed I also get some errors: https://paste.opensuse.org/pastes/bc1f72e68bb4
m
To not spam the chat, can you please use https://paste.opensuse.org/ for log sharing? Can you verify that the rke2-selinux and container-selinux packages are installed? Also please share the logs from the helm canal pod. Thanks
a
I will edit the post and move output to paste. I just reinstalled rke2, after running
/usr/bin/rke2-uninstall.sh
rpm -qa | egrep -i rke\|container
container-selinux-2.232.1-1.el9.noarch
rke2-selinux-0.18-1.el9.noarch
rke2-common-1.31.7~rke2r1-0.el9.x86_64
rke2-server-1.31.7~rke2r1-0.el9.x86_64
I get no output from
/var/lib/rancher/rke2/data/v1.31.7-rke2r1-7f85e977b85d/bin/kubectl logs helm-install-rke2-canal-kkchn -n kube-system
m
hmm, can you grab the events for the helm install pod? sometimes the reason for crashloop gets logged there.
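A minimal sketch of one way to pull only the events for a single pod, instead of paging through the full describe output. The helper name `pod_events` is hypothetical, and it assumes `kubectl` is on PATH and already pointed at the cluster (on an RKE2 server the kubeconfig is normally /etc/rancher/rke2/rke2.yaml).

```shell
# Hypothetical helper: list events for one pod, oldest first.
# Assumes kubectl is on PATH and KUBECONFIG points at the cluster.
pod_events() {
  pod="$1"
  ns="${2:-kube-system}"   # namespace defaults to kube-system
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}

# Example with the pod name from this thread:
# pod_events helm-install-rke2-canal-kkchn kube-system
```

For a crashlooping container, `kubectl logs --previous` on the same pod may also show output from the last failed attempt, if the process got far enough to write any.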
b
a
command
/var/lib/rancher/rke2/data/v1.31.7-rke2r1-7f85e977b85d/bin/kubectl describe pod helm-install-rke2-canal-kkchn -n kube-system | less
https://paste.opensuse.org/pastes/7993f884d1da
Following terraform-null-rke2-install, I am using /etc/NetworkManager/conf.d/rke2-canal.conf as in https://docs.rke2.io/known_issues#networkmanager. I do not have /etc/NetworkManager/conf.d/global-dns.conf, and nm-cloud is not installed. I have IPv6 disabled as in https://access.redhat.com/solutions/8709#rhel789disable
# nmcli connection modify <Connection Name> ipv6.addresses "" ipv6.gateway ""
# nmcli connection modify <Connection Name> ipv6.method "disabled"
sed -i 's/^[[:space:]]*::/#::/' /etc/hosts
cat /etc/sysctl.d/ipv6.conf
# Disable IPv6 for all interfaces
net.ipv6.conf.all.disable_ipv6 = 1

# Disable IPv6 for each network interface
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.<interface>.disable_ipv6 = 1
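A file under /etc/sysctl.d only takes effect after a reboot or `sysctl --system`, so it can help to confirm the live values. This is a rough sketch only; the function name `ipv6_disabled` is made up for illustration, and it checks just the `all` and `lo` keys from the file above, not the per-interface ones.

```shell
# Hypothetical check: succeed only if the running kernel reports
# IPv6 disabled for the "all" and "lo" keys set in ipv6.conf.
# If a key cannot be read at all, this also returns failure, so a
# non-zero result means "not confirmed", not necessarily "enabled".
ipv6_disabled() {
  for key in net.ipv6.conf.all.disable_ipv6 net.ipv6.conf.lo.disable_ipv6; do
    val=$(sysctl -n "$key" 2>/dev/null) || return 1
    [ "$val" = "1" ] || return 1
  done
  return 0
}

# Example:
# ipv6_disabled && echo "IPv6 is disabled" || echo "IPv6 not confirmed disabled"
```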
m
Same error as in your logs.
Last State:     Terminated
      Reason:       StartError
      Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: exec: "entry": executable file not found in $PATH
      Exit Code:    128
Which rhel9 clone are you using? The version would be appreciated. I would like to replicate this in a VM.
a
AlmaLinux 9.5
I think the issue might be with /usr/bin/rke2-uninstall.sh. After a clean install of AlmaLinux and applying the oscap DISA STIG, the first rke2 server --debug runs OK. Then, if I run rke2-uninstall, restart, add the repo again, and run rke2 server --debug, I get the errors from my post. I need to do more testing.
m
If you ran the install script and the node was partially provisioned before, the rke2-uninstall script won't remove the agent tokens from /etc/rancher and some data from /var/lib/rancher. To do a full "clean" of a node, I normally run
rm -rvf /etc/rancher /var/lib/rancher
Just make sure you really want to remove the previous install attempt completely and go back to scratch. Deleting these folders will remove join secrets and other essential data from the node.
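The cleanup described above could be wrapped roughly like this. It is only a sketch: the function names and the `DRY_RUN` variable are made up for illustration, and the default is to print the commands rather than run them, since the `rm` is destructive.

```shell
# Hypothetical wrapper around the "full clean" steps from this thread.
# DESTRUCTIVE when DRY_RUN=0: removes join secrets and node data.
# With DRY_RUN=1 (the default here) it only prints what it would run.
_rke2_clean_run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

rke2_full_clean() {
  # Run the packaged uninstall script first, if it is present.
  if [ -x /usr/bin/rke2-uninstall.sh ]; then
    _rke2_clean_run /usr/bin/rke2-uninstall.sh
  fi
  # Then remove what the uninstall script can leave behind.
  _rke2_clean_run rm -rvf /etc/rancher /var/lib/rancher
}

# Preview:        rke2_full_clean
# Actually clean: DRY_RUN=0 rke2_full_clean   (as root)
```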