# harvester
a
Clusters with 3+ nodes are quite normal. During installation, did you use the VIP or a DNS name for this 4th node to join?
b
Yes! I used the DNS name of my cluster.
a
If you use a self-signed CA, there is a known issue.
b
Version 1.1.2
The error is different.
a
On this 4th node, run:
journalctl --unit=rancherd
b
rancherd service log on this 4th node (/usr/bin/rancherd bootstrap -f):
INFO[0000] Loading config file [/usr/share/rancher/rancherd/config.yaml.d/50-defaults.yaml]
INFO[0000] Loading config file [/usr/share/rancher/rancherd/config.yaml.d/91-harvester-bootstrap-repo.yaml]
INFO[0000] Loading config file [/etc/rancher/rancherd/config.yaml]
INFO[0000] Bootstrapping Rancher (v2.6.11/v1.24.11+rke2r1)
INFO[0000] Writing plan file to /var/lib/rancher/rancherd/plan/plan.json
INFO[0000] Applying plan with checksum
INFO[0000] No image provided, creating empty working directory /var/lib/rancher/rancherd/plan/work/20231004-232518-applied.plan/_0
INFO[0000] Running command: /usr/bin/env [sh /var/lib/rancher/rancherd/install.sh]
INFO[0000] [stdout]: [INFO]  Using default agent configuration directory /etc/rancher/agent
INFO[0000] [stdout]: [INFO]  Using default agent var directory /var/lib/rancher/agent
INFO[0000] [stderr]: [WARN]  /usr/local is read-only or a mount point; installing to /opt/rancher-system-agent
INFO[0000] [stdout]: [INFO]  Determined CA is necessary to connect to Rancher
INFO[0000] [stdout]: [INFO]  Successfully downloaded CA certificate
INFO[0000] [stdout]: [INFO]  Value from <https://br1-harvester.infra.azion.net:443/cacerts> is an x509 certificate
INFO[0000] [stdout]: [INFO]  Successfully tested Rancher connection
INFO[0000] [stdout]: [INFO]  Downloading rancher-system-agent binary from <https://br1-harvester.infra.azion.net:443/assets/rancher-system-agent-amd64>
INFO[0000] [stdout]: [INFO]  Successfully downloaded the rancher-system-agent binary.
INFO[0000] [stdout]: [INFO]  Downloading rancher-system-agent-uninstall.sh script from <https://br1-harvester.infra.azion.net:443/assets/system-agent-uninstall.sh>
INFO[0000] [stdout]: [INFO]  Successfully downloaded the rancher-system-agent-uninstall.sh script.
INFO[0000] [stdout]: [INFO]  Generating Cattle ID
INFO[0000] [stdout]: [INFO]  Cattle ID was already detected as 6676d2bf46ee5753a6fbbae5dca53810b89638b832e318a9393c32f8d1b2f44. Not generating a new one.
INFO[0000] [stdout]: [INFO]  Successfully downloaded Rancher connection information
INFO[0000] [stdout]: [INFO]  systemd: Creating service file
INFO[0000] [stdout]: [INFO]  Creating environment file /etc/systemd/system/rancher-system-agent.env
INFO[0000] [stdout]: [INFO]  Enabling rancher-system-agent.service
INFO[0001] [stdout]: [INFO]  Starting/restarting rancher-system-agent.service
INFO[0001] No image provided, creating empty working directory /var/lib/rancher/rancherd/plan/work/20231004-232518-applied.plan/_1
INFO[0001] Running command: /usr/bin/rancherd [probe]
INFO[0001] [stderr]: time="2023-10-04T23:25:19Z" level=info msg="Running probes defined in /var/lib/rancher/rancherd/plan/plan.json"
INFO[0002] [stderr]: time="2023-10-04T23:25:20Z" level=info msg="Probe [kubelet] is unhealthy"
a
journalctl --unit=rancher-system-agent
b
Oct 05 04:19:39 vmh-cgh-act002p systemd[1]: Started Rancher System Agent.
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: time="2023-10-05T04:19:39Z" level=info msg="Rancher System Agent version v0.2.13 (4fa9427) is starting"
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: time="2023-10-05T04:19:39Z" level=info msg="Using directory /var/lib/rancher/agent/work for work"
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: time="2023-10-05T04:19:39Z" level=info msg="Starting remote watch of plans"
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: time="2023-10-05T04:19:39Z" level=info msg="Initial connection to Kubernetes cluster failed with error Get \"<https://br1-harvester.infra.azion.net/version>\": x509: certificate s>
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: panic: error while connecting to Kubernetes cluster with nullified CA data: Get "<https://br1-harvester.infra.azion.net/version>": x509: certificate signed by unknown authority
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: goroutine 99 [running]:
It makes no sense! The other nodes entered the cluster without this problem.
All with the same certificate
a
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: time="2023-10-05T04:19:39Z" level=info msg="Initial connection to Kubernetes cluster failed with error Get \"<https://br1-harvester.infra.azion.net/version>\": x509: certificate s>
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: panic: error while connecting to Kubernetes cluster with nullified CA data: Get "<https://br1-harvester.infra.azion.net/version>": x509: certificate signed by unknown authority
Oct 05 04:19:39 vmh-cgh-act002p rancher-system-agent[4267]: goroutine 99 [running]:
The error here shows that the CA is invalid, so rancher-system-agent can't continue.
try this
head -n10 /var/lib/rancher/rancherd/install.sh
Issue 4511 also seems to apply to your case; the two look the same, even though 4511 was observed in v1.2.0.
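The x509 failure quoted above is easy to miss in a long journal. A minimal sketch of a filter that keeps only the CA-related lines; the function name filter_ca_errors is hypothetical, and the patterns are taken only from the messages seen in this thread:

```shell
# Hypothetical helper: keep only TLS/CA-related lines from a log read on stdin.
# The patterns come from the error messages quoted in this thread.
filter_ca_errors() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E 'x509|certificate signed by unknown authority|nullified CA' || true
}

# Usage on the affected node:
#   journalctl --unit=rancher-system-agent --no-pager | filter_ca_errors
```

If the filter prints nothing, the agent is probably failing for a different reason and the full journal is worth reading.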
g
did you add the custom certs after creating the initial 3 node cluster?
b
Yes. Exactly @great-bear-19718
My CA is valid.
g
Was the CA added to the SSL cert chain?
b
yes
@great-bear-19718 This is what I find strangest, with the same certificate, I installed 3 nodes, why is the 4th one having a problem?
That's why it doesn't make sense for me to have to generate a new CA.
Guys, this error is crazy! How I solved it: for some reason, the value shown by this command on the cluster was empty (kubectl edit setting.management server-url). I changed it to my cluster's URL, restarted the rancherd service, and everything worked! The strangest thing is that at no point did I have to do this for the other 2 nodes that I added to the cluster through join mode.
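The check described above can be done without opening an editor. A rough sketch, assuming kubectl access to the cluster and that the setting keeps its value in the .value field; both function names are hypothetical:

```shell
# Read the current server-url value without opening an editor.
# Assumes kubectl is configured for the cluster and that the setting
# stores its value in the .value field.
get_server_url() {
  kubectl get setting.management server-url -o jsonpath='{.value}'
}

# Hypothetical check: succeed only if the value is a non-empty https URL.
is_usable_server_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a node with cluster access:
#   url=$(get_server_url)
#   is_usable_server_url "$url" || echo "server-url is empty or malformed: '$url'"
```

An empty result would match the symptom reported here. In this thread, setting the value and restarting the rancherd service on the joining node was what let the 4th node join.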
g
this should not be needed since that change was only for 1.2.0
did your cert-chain contain the ca cert as well?
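One quick way to answer the cert-chain question is to count the certificates in the PEM bundle that was uploaded. A minimal sketch; the function name count_pem_certs and the file name chain.pem are hypothetical:

```shell
# Hypothetical helper: count the PEM certificates in a bundle file.
count_pem_certs() {
  # "--" stops option parsing so the leading dashes are read as the pattern.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage (chain.pem is a placeholder for the uploaded certificate bundle):
#   count_pem_certs chain.pem
```

A count of 1 suggests the bundle holds only the server certificate. A count of 2 or more suggests intermediates or the CA are included, though it does not prove the chain is complete or in the right order.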