
gorgeous-iron-45755

11/21/2022, 7:30 PM
Hello everybody. We have deployed Rancher server on Kubernetes using certificates signed by our private intermediate CA, whose certificate is in turn signed by our private root CA. Before deploying Rancher server, we created the certificate secret for the ingress as well as the CA certificate secret, as instructed by this guide: https://docs.ranchermanager.rancher.io/v2.5/getting-started/installation-and-upgrade/resources/update-rancher-certificate The ingress secret contains the certificate for the ingress FQDN and the intermediate CA certificate, plus the ingress private key. Rancher server was deployed using the following command:
helm install rancher rancher-stable/rancher --namespace cattle-system \
--set hostname=<FQDN> \
--set replicas=3 \
--set ingress.tls.source=secret \
--set privateCA=true
The https://<FQDN>/v3/settings/cacerts endpoint returns the same root CA as the one contained in the corresponding secret. However, Rancher server does not seem to have picked up the ingress secret: the certificate it serves has an invalid CN=dynamic, and its signing CA certificate has CN=dynamiclistener-ca@1669050359 and is self-signed. The CA chain therefore appears to be broken, and as a consequence, importing an existing generic Kubernetes cluster fails with the infamous x509: certificate signed by unknown authority error, as shown below:
time="2022-11-21T18:47:26Z" level=error msg="Issuer of last certificate found in chain (CN=dynamiclistener-ca@1669050359,O=dynamiclistener-org) does not match with CA certificate Issuer (CN=...,OU=...,O=...). Please check if the configured server certificate contains all needed intermediate certificates and make sure they are in the correct order (server certificate first, intermediates after)"
time="2022-11-21T18:47:26Z" level=fatal msg="Certificate chain is not complete, please check if all needed intermediate certificates are included in the server certificate (in the correct order) and if the cacerts setting in Rancher either contains the correct CA certificate (in the case of using self signed certificates) or is empty (in the case of using a certificate signed by a recognized CA). Certificate information is displayed above. error: Get \"https://rancher-demo.mas.local\": x509: certificate signed by unknown authority"
Can anybody help? Have we missed anything before/during the deployment?
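One way to confirm which certificate is actually being served is to print the subject and issuer of the leaf certificate with openssl. The following is only a sketch: the function names and the RANCHER_FQDN variable are hypothetical, and it assumes openssl is installed on a machine that can reach the Rancher hostname on port 443.

```shell
# Sketch only: function names are illustrative and not part of Rancher.

# Print the subject and issuer of the certificate served on port 443 of host $1.
served_cert_info() {
    host="$1"
    openssl s_client -connect "${host}:443" -servername "${host}" </dev/null 2>/dev/null \
        | openssl x509 -noout -subject -issuer
}

# Succeed if the given subject/issuer text mentions the self-signed
# dynamiclistener CA described above, i.e. the ingress secret is not in use.
is_dynamiclistener_cert() {
    case "$1" in
        *dynamiclistener*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (needs network access; RANCHER_FQDN is a placeholder):
#   info=$(served_cert_info "$RANCHER_FQDN")
#   is_dynamiclistener_cert "$info" && echo "ingress secret is not being served"
```

If the issuer printed here is the dynamiclistener CA rather than the private intermediate CA, the request is probably not passing through the ingress that holds the TLS secret.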

microscopic-diamond-94749

11/22/2022, 8:31 AM
Your issue seems to be with the ingress controller, not Rancher. I assume you're using ingress-nginx? Are there any hints in the ingress controller logs? Did the ingress resource get created correctly? Check whether the spec.tls[0] part has the right hosts and secretName entries, and whether the secret was created in the right namespace.
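These checks could be run with kubectl roughly as follows. This is a sketch under assumptions: the namespace cattle-system and ingress name rancher are taken from the controller log quoted later in this thread, and the function names are made up for illustration.

```shell
# Sketch only: function names are illustrative.
NAMESPACE="cattle-system"
INGRESS="rancher"

# Print the ingress class, first TLS host, and TLS secret name of the ingress.
show_ingress_tls() {
    kubectl -n "$NAMESPACE" get ingress "$INGRESS" \
        -o jsonpath='{.spec.ingressClassName}{"\n"}{.spec.tls[0].hosts[0]}{"\n"}{.spec.tls[0].secretName}{"\n"}'
}

# Succeed if the secret named $1 exists in the same namespace as the ingress.
secret_exists() {
    kubectl -n "$NAMESPACE" get secret "$1" >/dev/null 2>&1
}

# Usage (needs access to the cluster running Rancher):
#   show_ingress_tls
#   secret_exists "<secretName printed above>" || echo "secret missing in $NAMESPACE"
```

An empty first line in the output of show_ingress_tls would mean that no ingress class is set on the resource.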

gorgeous-iron-45755

11/22/2022, 6:12 PM
@microscopic-diamond-94749 thanks a lot for your hint; it was right to the point! Your assumption regarding the type of the ingress controller was correct. I found the following error in the ingress controller's log:
"Ignoring ingress because of error while validating ingress class" ingress="cattle-system/rancher" error="ingress does not contain a valid IngressClass"
Fortunately, I immediately realized my error: I had set the Rancher server service type to LoadBalancer and mapped the FQDN to the load balancer IP. After setting the ingress class to nginx, switching the service type back to ClusterIP, mapping the FQDN to the ingress controller's IP address, and restarting Rancher server, the certificate chain was finally fixed. However, the cluster agent now gets stuck at the following point, as shown in its log:
INFO: Using resolv.conf: search cattle-system.svc.mas.local svc.mas.local mas.local nameserver 169.254.25.10 options ndots:5
and is finally terminated, unable to download the CA root certificate from the /v3/settings/cacerts endpoint. In my opinion, this is a network connectivity issue, as the connections between the central station on which Rancher server is running and the peripheral stations whose clusters we want to import are too slow. We may therefore have to abandon the idea of a central Rancher server for centralized monitoring and instead deploy a separate Rancher server on each peripheral cluster.
Also, I found the Helm option (ingress.ingressClassName) to set the ingress class to nginx.
The remote cluster has finally been imported!
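For reference, the ingress.ingressClassName option mentioned above could be combined with the original install settings roughly like this. It is a sketch, not necessarily the exact command that was run: it assumes the change is applied with helm upgrade, and the function name and the HELM override variable are hypothetical.

```shell
# Sketch only: set HELM=echo to print the command instead of running it.
HELM="${HELM:-helm}"

# Re-apply the original chart settings for hostname $1 and add the ingress class.
upgrade_rancher() {
    "$HELM" upgrade rancher rancher-stable/rancher --namespace cattle-system \
        --set hostname="$1" \
        --set replicas=3 \
        --set ingress.tls.source=secret \
        --set privateCA=true \
        --set ingress.ingressClassName=nginx
}

# Usage (replace the argument with the real FQDN):
#   HELM=echo upgrade_rancher rancher.example.com   # dry run, prints the command
#   upgrade_rancher rancher.example.com             # actually runs helm
```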

microscopic-diamond-94749

11/24/2022, 7:51 AM
what was the issue?

gorgeous-iron-45755

11/24/2022, 7:52 AM
I have not yet found out. It may have needed time as the connection is slow.
As it was not a certificate-related issue, Rancher server can hopefully be deployed with Rancher-generated certificates too.