Is there a page in the documentation explaining an...
# rke2
c
Is there a page in the documentation explaining any differences between the rpm install and tar ? Because i'm trying to get an install working with RHEL 9 but after installation i'm getting tls errors in pod communication towards external and in cluster. Even if these are set to use plain http internal. ( I'm maybe thinking it's selinux related, although it doesn't show any denied attempts )
a
Maybe this can help https://deepwiki.com/rancher/rke2/3.1-linux-installation Ive installed RKE2 using the RPM method on RHEL9 and got no issues, so maybe it is a configuration error.
c
RPM installation pulls in selinux dependencies via the rke2-selinux package which is required by rke2-common. that’s pretty much it.
c
yes that i noticed. Just weird i'm getting the tls errors with old certificates even if the call was originally http call between containers. Only 2 things changed my end. testing on RHEL and the newer version that came out. So trying to debug what caused it.
The weird thing is, the calls with https to external get the same error as in cluster calls with http that get redirected to https. So it's some sort of policy forcing it. Could possible also calico?
Copy code
E0722 07:10:47.292147       1 sync.go:62] "error setting up issuer" err="Get \"<https://acme-v02.api.letsencrypt.org/directory>\": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-07-22T07:10:47Z is after 2022-11-27T18:11:37Z" logger="cert-manager.controller" resource_name="cloudflare-testcluster" resource_namespace="" resource_kind="ClusterIssuer" resource_version="v1"
E0722 07:10:47.292235       1 controller.go:157] "re-queuing item due to error processing" err="Get \"<https://acme-v02.api.letsencrypt.org/directory>\": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-07-22T07:10:47Z is after 2022-11-27T18:11:37Z" logger="cert-manager.controller"
This is a cert-manager call to external. Doing a curl from the node host doesn't give tls errors.
Copy code
ts=2025-07-22T07:12:40.186672034Z caller=sidecar.go:423 level=warn err="check exists: stat s3 object: 301 Moved Permanently" uploaded=0
Prometheus getting 301's
c
I would bet that you have a wildcard dns entry somewhere in your environment that is pointing at a host with a certificate that expired in 2022
which is why you only get errors in a pod that is hitting that wildcard record
exec into a pod, and curl that address, and see what you’re actually hitting instead of letsencrypt
c
yeah trying to figure it out. I put a debug pod in the cluster and it seems its my firewall cert that's being used shown if i do a curl from a pod. While doing it from the node to the same url it just goes straight to the letencrypt servers
c
why’s your firewall got an expired cert? and why are dns lookups hitting your firewall? that usually indicates an overzealous wildcard dns entry for a domain in your search list.
c
the cert is self-singed so that prob why invalid. But now need to figure out why the cluster is throwing all the traffic at it
thanks already for all the help.
think i figuered it out. Overlap of internal and external dns names