gifted-agent-35161

03/24/2023, 9:47 AM
Hello guys 🙂, I have a problem changing the TLS settings on my downstream cluster. We run single-container Rancher on Docker, with embedded k3s for the local cluster. Information:
Rancher server: 2.7.1
rke2: v1.24.9+rke2r2 (2f4571a879954e1ea8d4560023eaf57c567df737)
go: version go1.18.7b7
k8s: v1.24.9+rke2r2
OS: Ubuntu 22.04 LTS
Description: I changed the certificates on the Rancher Docker container from self-signed to a custom CA, following the documentation at https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/resources/update-rancher-certificate. For step 4, "Reconfigure Rancher agents to trust the private CA", I chose method 2, injecting the custom CA checksum into the Rancher agent deployments/daemonsets. For a reason I don't understand, there is no daemonset/cattle-node-agent in the cattle-system namespace, so I didn't edit the node agent, only the cattle-cluster-agent. Then I restarted my Rancher Docker container with the new certificates mounted into it. What happened? The cluster went into the 'Updating' state, and the master nodes are in the Reconciling state with the message "Waiting for plan to be applied". Here is the YAML status of the Machine resource for my stuck master node:
  conditions:
    - lastTransitionTime: '2022-10-14T09:53:34Z'
      status: 'True'
      type: Ready
    - lastTransitionTime: '2022-10-14T09:53:32Z'
      status: 'True'
      type: BootstrapReady
    - lastTransitionTime: '2022-10-14T09:56:29Z'
      status: 'True'
      type: InfrastructureReady
    - lastTransitionTime: '2023-03-01T14:13:52Z'
      status: 'True'
      type: NodeHealthy
    - lastTransitionTime: '2022-10-14T09:53:38Z'
      status: 'True'
      type: PlanApplied
    - lastTransitionTime: '2023-03-24T09:14:48Z'
      message: waiting for plan to be applied
      reason: Waiting
      status: Unknown
      type: Reconciled
  infrastructureReady: true
  lastUpdated: '2022-10-14T09:57:38Z'
So I dug into the machine logs but found nothing in journalctl: the output was empty for all Rancher / RKE services. So I decided to launch rancher-system-agent sentinel by hand, and here we go:
rancher-system-agent sentinel
INFO[0000] Rancher System Agent version v0.2.13 (4fa9427) is starting 
INFO[0000] Using directory /var/lib/rancher/agent/work for work 
INFO[0000] Starting remote watch of plans               
INFO[0000] Initial connection to Kubernetes cluster failed with error Get "https://192.168.10.203/version": x509: certificate signed by unknown authority, removing CA data and trying again
panic: error while connecting to Kubernetes cluster with nullified CA data: Get "https://192.168.10.203/version": x509: certificate signed by unknown authority

goroutine 10 [running]:
github.com/rancher/system-agent/pkg/k8splan.(*watcher).start(0xc0002be280, {0x18bd5c0?, 0xc0002b8740})
    /go/src/github.com/rancher/system-agent/pkg/k8splan/watcher.go:99 +0x9b4
created by github.com/rancher/system-agent/pkg/k8splan.Watch
    /go/src/github.com/rancher/system-agent/pkg/k8splan/watcher.go:63 +0x155
So this means my custom CA certificate didn't propagate to the host. As a "workaround", I added the certificate to the node's local OpenSSL trust store, and it worked fine afterwards. But every time I try to scale up with a new node, it fails to install rke2. It also triggers a rollout of the cattle-cluster-agent deployment with the old certificate checksum in the CATTLE_CA_CHECKSUM variable. Do you have any advice or tips? Thank you so muuuch 🙂
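For anyone reproducing this, here is a minimal Python sketch of the same trust check the agent fails on: does the Rancher endpoint's certificate chain validate against a given CA file alone, without the system trust store? The function name `trusts_endpoint` and the CA file path in the usage note are hypothetical; the IP comes from the log above, and port 443 is assumed from the `https://` URL.

```python
import socket
import ssl


def trusts_endpoint(host: str, port: int, cafile: str, timeout: float = 5.0):
    """Return (True, "") if the server's chain validates against cafile,
    otherwise (False, reason)."""
    try:
        # Trust only the CAs in cafile; the system trust store is not loaded.
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        context.load_verify_locations(cafile=cafile)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The handshake raises ssl.SSLCertVerificationError on failure.
            with context.wrap_socket(sock, server_hostname=host):
                return True, ""
    except OSError as exc:
        # OSError covers ssl.SSLError, a missing CA file, and refused connections.
        return False, str(exc)
```

Something like `trusts_endpoint("192.168.10.203", 443, "/path/to/cacerts.pem")` should return `True` only once the file holds the CA that signed the new certificate. Note that Python also checks the hostname, so connecting by IP only passes if the certificate carries that IP as a subject alternative name.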

big-hydrogen-97240

03/24/2023, 11:41 AM
Firstly, you don't have the node agent daemonset because you're using RKE2 for your downstream clusters. The node agent is only used with RKE. Second, step 4 looks like it is designed to get the agents to reconnect and get the new deployment from Rancher. That deployment should have the new CA checksum in it, but it sounds like it does not.
I know that doesn't sound very helpful, but inspecting the secrets and Rancher settings will hopefully tell you where the problem is.
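As a rough illustration of that inspection, the sketch below compares a CA file against the value found in CATTLE_CA_CHECKSUM. It assumes the checksum is the SHA-256 hex digest of the CA PEM exactly as Rancher serves it in its cacerts setting, so a stray trailing newline would change the result. The function names are hypothetical.

```python
import hashlib


def ca_checksum(pem_bytes: bytes) -> str:
    """SHA-256 hex digest of the CA PEM bytes (assumed checksum format)."""
    return hashlib.sha256(pem_bytes).hexdigest()


def checksum_matches(pem_bytes: bytes, agent_checksum: str) -> bool:
    """Compare the computed digest with the value taken from the agent's
    CATTLE_CA_CHECKSUM variable, ignoring case and surrounding whitespace."""
    return ca_checksum(pem_bytes) == agent_checksum.strip().lower()
```

If the value in the cattle-cluster-agent deployment matches the old CA file but not the new one, the deployment Rancher hands out is still built from the old certificate.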

gifted-agent-35161

03/24/2023, 12:36 PM
Thanks for your answer 🙂 I'll check my secrets in the cattle-system namespace. Do you know if the secrets tls-rancher and tls-rancher-internal-ca are relevant? Otherwise, I've already set up tls-ca with the new CA certificate and tls-rancher-ingress with the new key/cert.
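One way to tell which of these secrets holds which certificate is to fingerprint what is stored in each. The sketch below is only an illustration: it takes the JSON printed by `kubectl -n cattle-system get secret <name> -o json`, decodes one data key, and returns the SHA-256 fingerprint of every certificate in it. The key names are assumptions: `tls.crt` is the standard key for TLS secrets, and `cacerts.pem` is the key I would expect in tls-ca. The function name is hypothetical.

```python
import base64
import hashlib
import json
import re

CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def secret_cert_fingerprints(secret_json: str, key: str) -> list[str]:
    """SHA-256 fingerprints of every certificate stored under data[key]."""
    secret = json.loads(secret_json)
    # Secret values are base64-encoded; the decoded payload is PEM text.
    pem = base64.b64decode(secret["data"][key]).decode("ascii")
    fingerprints = []
    for body in CERT_RE.findall(pem):
        # Strip line breaks, then decode the PEM body to DER bytes.
        der = base64.b64decode("".join(body.split()))
        fingerprints.append(hashlib.sha256(der).hexdigest())
    return fingerprints
```

Matching fingerprints between two secrets mean they carry the same certificate; a secret that still shows the old fingerprint is a candidate for where the stale CA comes from.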