# rke2
h
perhaps simplifying the question works? what is responsible for registering a new cluster with rancher? which logs are the right ones to look at when i see kube-apiserver coming and going in the provisioning log?
@creamy-pencil-82913 may i impose on you: which logs should i look at to find out why the rancher downstream rke2 cluster does not register? when looking at the cluster with the host kubectl it looks sane and normal.
c
… did you check the cattle cluster agent pod logs?
cluster agent pod is what talks to rancher on behalf of the cluster. If you’re using Rancher to provision the cluster, then there is also a rancher-system-agent systemd unit on each node that handles node-specific config.
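Putting the advice above into concrete commands (a sketch; the deployment/unit names assume a standard Rancher-provisioned RKE2 node, and the commands need a live cluster or node, so they are guarded here):

```shell
# cluster-side: the agent that registers the cluster with Rancher
# (needs kubectl pointed at the downstream cluster; harmless no-op otherwise)
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n cattle-system logs deploy/cattle-cluster-agent --tail=100 || true
fi

# node-side: run these on the downstream node itself
if command -v journalctl >/dev/null 2>&1; then
  # per-node provisioning done on behalf of Rancher
  journalctl -u rancher-system-agent --no-pager -n 100 || true
  # the apiserver flapping itself shows up in the rke2 service log
  journalctl -u rke2-server --no-pager -n 100 || true
fi
```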
h
kubectl logs -n cattle-system cattle-cluster-agent-85bc86fd4f-r2l9f
RTNETLINK answers: Network is unreachable
INFO: Environment: CATTLE_ADDRESS= CATTLE_CA_CHECKSUM= CATTLE_CLUSTER=true CATTLE_CLUSTER_AGENT_PORT=tcp://[fd00:43::c6fa]:80 CATTLE_CLUSTER_AGENT_PORT_443_TCP=tcp://[fd00:43::c6fa]:443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_ADDR=fd00:43::c6fa CATTLE_CLUSTER_AGENT_PORT_443_TCP_PORT=443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_PORT_80_TCP=tcp://[fd00:43::c6fa]:80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_ADDR=fd00:43::c6fa CATTLE_CLUSTER_AGENT_PORT_80_TCP_PORT=80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_SERVICE_HOST=fd00:43::c6fa CATTLE_CLUSTER_AGENT_SERVICE_PORT=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTP=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTPS_INTERNAL=443 CATTLE_CLUSTER_REGISTRY= CATTLE_FEATURES=embedded-cluster-api=false,fleet=false,monitoringv1=false,multi-cluster-management=false,multi-cluster-management-agent=true,provisioningv2=false,rke2=false CATTLE_INGRESS_IP_DOMAIN=sslip.io CATTLE_INSTALL_UUID=3be759df-0114-466c-a70d-76a0070868cd CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-85bc86fd4f-r2l9f CATTLE_RANCHER_WEBHOOK_VERSION=103.0.6+up0.4.7 CATTLE_SERVER=https://rke-v2.fjit.no CATTLE_SERVER_VERSION=v2.8.5
INFO: Using resolv.conf: search cattle-system.svc.cluster.local svc.cluster.local cluster.local dst.fjit.no nameserver fd00:43::a options ndots:5
INFO: https://rke-v2.fjit.no/ping is accessible
INFO: rke-v2.fjit.no resolves to
[lots of info messages] the "resolves to" coming back empty seems wrong
but when running host from inside the pod it resolves normally.
now that was odd... running curl -v https://rke-v2.fjit.no/ from the host (curl 7.88.1) works as expected. from the pod (curl 8.0.1) i get: SSL certificate problem: unable to get local issuer certificate
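That curl error generally means the leaf's issuing CA is not in the bundle the verifier is handed. A minimal local reproduction (throwaway CA and leaf with illustrative names; this does not touch the real rke-v2.fjit.no chain):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# throwaway root CA
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-root" -days 1 2>/dev/null

# leaf cert signed by it
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
  -subj "/CN=rke-v2.example" 2>/dev/null
openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out leaf.crt -days 1 2>/dev/null

# issuer present in the bundle: verification succeeds
openssl verify -CAfile ca.crt leaf.crt   # leaf.crt: OK

# issuer absent (default store doesn't know demo-root): the curl-style failure
openssl verify leaf.crt 2>&1 || true     # unable to get local issuer certificate
```

Inside the pod the equivalent check is `curl -v --cacert <bundle> https://rke-v2.fjit.no/` against whatever bundle the agent image actually ships.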
c
What is that certificate issued by? If you’re using private PKI, adding the CA cert to the node won’t have any effect on pods… they each have their own CA bundle.
h
cert is normal letsencrypt
checking with openssl s_client -showcerts -connect rke-v2.fjit.no:443 the intermediate is presented as expected, but the preamble is different. could just be openssl version differences though. host: OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024)
1 s:C = US, O = Let's Encrypt, CN = R10
i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
v:NotBefore: Mar 13 00:00:00 2024 GMT; NotAfter: Mar 12 23:59:59 2027 GMT
-----BEGIN CERTIFICATE-----
MIIFBTCCAu2gAwIBAgIQS6hSk/eaL6JzBkuoBI110DANBgkqhkiG9w0BAQsFADBP
pod: OpenSSL 1.1.1l-fips 24 Aug 2021 SUSE release 150500.17.28.2
1 s:C = US, O = Let's Encrypt, CN = R10
i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
-----BEGIN CERTIFICATE-----
MIIFBTCCAu2gAwIBAgIQS6hSk/eaL6JzBkuoBI110DANBgkqhkiG9w0BAQsFADBP
the whole part about validity and key algorithm (the a:/v: lines) is missing from the pod's output
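For what it's worth, the a:/v: preamble lines appear to be an OpenSSL 3.x addition to s_client output, so their absence under 1.1.1 is cosmetic. A version-neutral way to read a cert's subject and validity is to pipe the PEM through `openssl x509`; self-contained demo with a throwaway cert (illustrative CN, 1-day validity):

```shell
set -e
tmp=$(mktemp -d)
# throwaway self-signed cert standing in for the served leaf
openssl req -x509 -newkey rsa:2048 -nodes -keyout "$tmp/k.pem" \
  -out "$tmp/c.pem" -subj "/CN=rke-v2.example" -days 1 2>/dev/null

# prints subject=..., notBefore=..., notAfter=... on any openssl version
openssl x509 -in "$tmp/c.pem" -noout -subject -dates

# against the live endpoint the same idea is:
#   openssl s_client -connect rke-v2.fjit.no:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -subject -dates
```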
the loadbalancer in front of rancher just deals with tcp ports. the certs are from the rancher installation, which used the helm method with cert-manager. so this is the cert we use in browsers, and the same one the other (ipv4) clusters that already work in rancher use
c
Do you perhaps need to add intermediate certs to your bundle?
h
what is the bundle in this context? i do not import any certificates manually anywhere. Rancher uses a normal public letsencrypt cert obtained via cert-manager for the ingress. and i can see the intermediate is provided by rancher when i connect with openssl s_client -showcerts -connect rke-v2.fjit.no:443
c
But openssl in the pod shows that it’s not trusted as well?
h
I am not an expert in openssl, but it looks trusted. both say: Verification: OK / Verify return code: 0 (ok), and i can write GET / and get web content
i notice with curl i get Unknown CA from inside the pod. perhaps the image has an outdated ca cert bundle. https://paste.debian.net/hidden/c97e386f/ But i also tested with an rke2 cluster installed from get.rke2.io, which i imported into rancher, and that imported successfully with no issues. the cluster-agent pods in that cluster have the exact same issue with curl, yet there the cluster works in rancher. that is a slightly different version, v1.28.12+rke2r1 vs v1.28.11+rke2r1 for the rancher downstream cluster, and i believe it uses a different cni, since there are no calico or cilium pods but an rke2-canal instead.
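One way to test the stale-bundle theory directly is to list every subject in the pod's CA bundle and look for Let's Encrypt's trust anchor, "ISRG Root X1". `check_root` is a hypothetical helper, and the demo builds its own one-cert bundle so it runs anywhere; inside the pod the bundle path is an assumption to verify with `kubectl exec` (on the SLE-based agent image it is typically /etc/ssl/ca-bundle.pem):

```shell
# print every cert's subject/issuer in a PEM bundle and grep for a CN fragment
check_root() {  # $1 = bundle file, $2 = subject fragment to look for
  openssl crl2pkcs7 -nocrl -certfile "$1" \
    | openssl pkcs7 -print_certs -noout \
    | grep -q "$2" && echo "present" || echo "missing"
}

# self-contained demo: build a one-cert bundle and query it
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -keyout "$tmp/k.pem" \
  -out "$tmp/bundle.pem" -subj "/O=Demo/CN=ISRG Root X1" -days 1 2>/dev/null

check_root "$tmp/bundle.pem" "ISRG Root X1"   # present
check_root "$tmp/bundle.pem" "GlobalSign"     # missing
```

If the pod's real bundle reports "missing" for ISRG Root X1, the image's CA package is the problem rather than anything on the Rancher side.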