# rke2
h
perhaps simplifying the question works? what is responsible for registering a new cluster with rancher? which logs are the right ones to look at when i see kube-apiserver coming and going in the provisioning log?
@creamy-pencil-82913 may i impose on you: which logs should i look at to find out why the rancher downstream rke2 cluster does not register? when looking at the cluster with the host kubectl it looks sane and normal.
c
… did you check the cattle cluster agent pod logs?
cluster agent pod is what talks to rancher on behalf of the cluster. If you’re using Rancher to provision the cluster, then there is also a rancher-system-agent systemd unit on each node that handles node-specific config.
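Putting the advice above into concrete commands (a sketch; the deployment/unit names assume a standard Rancher-provisioned RKE2 node, and the commands need a live cluster or node, so they are guarded here):

```shell
# cluster-side: the agent that registers the cluster with Rancher
# (needs kubectl pointed at the downstream cluster; harmless no-op otherwise)
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n cattle-system logs deploy/cattle-cluster-agent --tail=100 || true
fi

# node-side: run these on the downstream node itself
if command -v journalctl >/dev/null 2>&1; then
  # per-node provisioning done on behalf of Rancher
  journalctl -u rancher-system-agent --no-pager -n 100 || true
  # the apiserver flapping itself shows up in the rke2 service log
  journalctl -u rke2-server --no-pager -n 100 || true
fi
```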
h
kubectl logs -n cattle-system cattle-cluster-agent-85bc86fd4f-r2l9f
RTNETLINK answers: Network is unreachable
INFO: Environment: CATTLE_ADDRESS= CATTLE_CA_CHECKSUM= CATTLE_CLUSTER=true CATTLE_CLUSTER_AGENT_PORT=tcp://[fd00:43::c6fa]:80 CATTLE_CLUSTER_AGENT_PORT_443_TCP=tcp://[fd00:43::c6fa]:443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_ADDR=fd00:43::c6fa CATTLE_CLUSTER_AGENT_PORT_443_TCP_PORT=443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_PORT_80_TCP=tcp://[fd00:43::c6fa]:80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_ADDR=fd00:43::c6fa CATTLE_CLUSTER_AGENT_PORT_80_TCP_PORT=80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_SERVICE_HOST=fd00:43::c6fa CATTLE_CLUSTER_AGENT_SERVICE_PORT=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTP=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTPS_INTERNAL=443 CATTLE_CLUSTER_REGISTRY= CATTLE_FEATURES=embedded-cluster-api=false,fleet=false,monitoringv1=false,multi-cluster-management=false,multi-cluster-management-agent=true,provisioningv2=false,rke2=false CATTLE_INGRESS_IP_DOMAIN=sslip.io CATTLE_INSTALL_UUID=3be759df-0114-466c-a70d-76a0070868cd CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-85bc86fd4f-r2l9f CATTLE_RANCHER_WEBHOOK_VERSION=103.0.6+up0.4.7 CATTLE_SERVER=https://rke-v2.fjit.no CATTLE_SERVER_VERSION=v2.8.5
INFO: Using resolv.conf: search cattle-system.svc.cluster.local svc.cluster.local cluster.local dst.fjit.no nameserver fd00:43::a options ndots:5
INFO: https://rke-v2.fjit.no/ping is accessible
INFO: rke-v2.fjit.no resolves to
[lots of info messages] the "resolves to" coming back empty seems wrong
but when running host from inside the pod it resolves normally.
now that was odd... running curl -v https://rke-v2.fjit.no/ from the host (curl 7.88.1) works as expected. from the pod (curl 8.0.1) i get: SSL certificate problem: unable to get local issuer certificate
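That curl error generally means the leaf's issuing CA is not in the bundle the verifier is handed. A minimal local reproduction (throwaway CA and leaf with illustrative names; this does not touch the real rke-v2.fjit.no chain):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# throwaway root CA
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-root" -days 1 2>/dev/null

# leaf cert signed by it
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
  -subj "/CN=rke-v2.example" 2>/dev/null
openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out leaf.crt -days 1 2>/dev/null

# issuer present in the bundle: verification succeeds
openssl verify -CAfile ca.crt leaf.crt   # leaf.crt: OK

# issuer absent (default store doesn't know demo-root): the curl-style failure
openssl verify leaf.crt 2>&1 || true     # unable to get local issuer certificate
```

Inside the pod the equivalent check is `curl -v --cacert <bundle> https://rke-v2.fjit.no/` against whatever bundle the agent image actually ships.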
c
What is that certificate issued by? If you’re using private PKI, adding the CA cert to the node won’t have any effect on pods… they each have their own CA bundle.
h
cert is normal letsencrypt
checking with openssl s_client -showcerts -connect rke-v2.fjit.no:443 the intermediate is presented as expected, but the preamble is different. could just be openssl version differences though. host: OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024)
1 s:C = US, O = Let's Encrypt, CN = R10
i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
v:NotBefore: Mar 13 00:00:00 2024 GMT; NotAfter: Mar 12 23:59:59 2027 GMT
-----BEGIN CERTIFICATE-----
MIIFBTCCAu2gAwIBAgIQS6hSk/eaL6JzBkuoBI110DANBgkqhkiG9w0BAQsFADBP
pod: OpenSSL 1.1.1l-fips 24 Aug 2021 SUSE release 150500.17.28.2
1 s:C = US, O = Let's Encrypt, CN = R10
i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
-----BEGIN CERTIFICATE-----
MIIFBTCCAu2gAwIBAgIQS6hSk/eaL6JzBkuoBI110DANBgkqhkiG9w0BAQsFADBP
the whole part about validity and key algorithm (the a:/v: lines) is missing from the pod's output
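For what it's worth, the a:/v: preamble lines appear to be an OpenSSL 3.x addition to s_client output, so their absence under 1.1.1 is cosmetic. A version-neutral way to read a cert's subject and validity is to pipe the PEM through `openssl x509`; self-contained demo with a throwaway cert (illustrative CN, 1-day validity):

```shell
set -e
tmp=$(mktemp -d)
# throwaway self-signed cert standing in for the served leaf
openssl req -x509 -newkey rsa:2048 -nodes -keyout "$tmp/k.pem" \
  -out "$tmp/c.pem" -subj "/CN=rke-v2.example" -days 1 2>/dev/null

# prints subject=..., notBefore=..., notAfter=... on any openssl version
openssl x509 -in "$tmp/c.pem" -noout -subject -dates

# against the live endpoint the same idea is:
#   openssl s_client -connect rke-v2.fjit.no:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -subject -dates
```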
the loadbalancer in front of rancher just deals with tcp ports. the certs are from the rancher installation, which used the helm method with cert-manager. so this is the cert we use in browsers, and the same one the other (ipv4) clusters that already work in rancher use
c
Do you perhaps need to add intermediate certs to your bundle?
h
what is the bundle in this context? i do not import any certificates manually anywhere. Rancher uses a normal public letsencrypt cert obtained via cert-manager for the ingress. and i can see the intermediate is provided by rancher when i connect with openssl s_client -showcerts -connect rke-v2.fjit.no:443
c
But openssl in the pod shows that it’s not trusted as well?
h
I am not an expert in openssl, but it looks trusted. both say: Verification: OK / Verify return code: 0 (ok), and i can write GET / and get web content
i notice with curl i get Unknown CA from inside the pod. perhaps the image has an outdated ca cert bundle. https://paste.debian.net/hidden/c97e386f/ But i also tested with an rke2 cluster installed from get.rke2.io, which i imported into rancher, and that imported successfully with no issues. the cluster-agent pods in that cluster have the exact same issue with curl, yet there the cluster works in rancher. that is a slightly different version, v1.28.12+rke2r1 vs v1.28.11+rke2r1 for the rancher downstream cluster, and i believe it uses a different cni, since there are no calico or cilium pods but an rke2-canal instead.
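One way to test the stale-bundle theory directly is to list every subject in the pod's CA bundle and look for Let's Encrypt's trust anchor, "ISRG Root X1". `check_root` is a hypothetical helper, and the demo builds its own one-cert bundle so it runs anywhere; inside the pod the bundle path is an assumption to verify with `kubectl exec` (on the SLE-based agent image it is typically /etc/ssl/ca-bundle.pem):

```shell
# print every cert's subject/issuer in a PEM bundle and grep for a CN fragment
check_root() {  # $1 = bundle file, $2 = subject fragment to look for
  openssl crl2pkcs7 -nocrl -certfile "$1" \
    | openssl pkcs7 -print_certs -noout \
    | grep -q "$2" && echo "present" || echo "missing"
}

# self-contained demo: build a one-cert bundle and query it
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -keyout "$tmp/k.pem" \
  -out "$tmp/bundle.pem" -subj "/O=Demo/CN=ISRG Root X1" -days 1 2>/dev/null

check_root "$tmp/bundle.pem" "ISRG Root X1"   # present
check_root "$tmp/bundle.pem" "GlobalSign"     # missing
```

If the pod's real bundle reports "missing" for ISRG Root X1, the image's CA package is the problem rather than anything on the Rancher side.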