adamant-kite-43734
09/16/2024, 9:51 PM
bland-article-62755
09/18/2024, 1:41 PM
bland-article-62755
09/20/2024, 3:17 PM
red-king-19196
09/23/2024, 8:33 AM
bland-article-62755
10/16/2024, 3:31 PM
red-king-19196
10/21/2024, 10:47 AM
bland-article-62755
10/21/2024, 3:15 PM
bland-article-62755
10/21/2024, 3:15 PM
bland-article-62755
10/21/2024, 3:16 PM
cf8:~ # ./check.sh
==============================
Starting Host check...
W1021 15:14:30.188424 70976 loader.go:222] Config not found: /etc/rancher/rke2/rke2.yaml
E1021 15:14:30.189691 70976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E1021 15:14:30.190114 70976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E1021 15:14:30.192123 70976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E1021 15:14:30.192835 70976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E1021 15:14:30.194542 70976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This script is intended to be run from one of the Harvester cluster's Control Plane nodes.
It seems like you're running this script from cf8 and not one of the following nodes:
Do you want to contiue running this script even though it seems like you're not on a node? [Y/N] y
Host Test: Skipped
==============================
Starting Certificates check...
find: '/var/lib/rancher/rke2/server/tls/': No such file or directory
bland-article-62755
10/21/2024, 3:16 PM
bland-article-62755
10/21/2024, 3:38 PM
red-king-19196
10/22/2024, 10:03 AM
cp_nodes=$(kubectl get nodes |grep control-plane |awk '{ print $1 }')
does not break because the last command in the pipeline, awk, returns 0 even if nothing comes in through the pipe.
But this:
prom_ip=$(kubectl get services/rancher-monitoring-prometheus -n cattle-monitoring-system -o yaml | yq -e '.spec.clusterIP')
does break because the last command, yq, was given the -e flag, which makes it return a non-zero exit code if the specified key doesn't exist.
red-king-19196
10/22/2024, 10:10 AM
-e at the top of the script as is. Otherwise, we'll have to handle all the possible errors that might be encountered during the execution and show them to the user at the end. This helper script gives users confidence that they can initiate the cluster upgrade. If something needs to be put in a retry loop, that implies cluster instability. The script should point that out immediately.
bland-article-62755
10/22/2024, 2:14 PM
bland-article-62755
10/22/2024, 2:15 PM
bland-article-62755
10/22/2024, 2:16 PM
bland-article-62755
10/22/2024, 2:19 PM
kubectl can talk to an API and bail out early if it can't (i.e., it's running on a worker node or something went super wrong with the kubeapi).
red-king-19196
10/23/2024, 6:34 AM
red-king-19196
10/23/2024, 6:40 AM
bland-article-62755
10/23/2024, 3:41 PM
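The early bail-out suggested earlier in the thread (confirm that kubectl can talk to an API before running any checks) could be sketched roughly as follows. This is only an illustration, not the script's actual code: the function name require_api and the KUBECTL override variable are hypothetical, and probing /readyz with a 5 second timeout is an assumed choice rather than anything stated in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from check.sh): succeed only if kubectl can
# reach a Kubernetes API server.
# KUBECTL is an assumed override so the binary can be swapped out.
require_api() {
    local kubectl_bin="${KUBECTL:-kubectl}"
    # Probe the API server's readiness endpoint; the short request
    # timeout keeps the check from hanging on an unreachable server.
    if ! "$kubectl_bin" get --raw /readyz --request-timeout=5s >/dev/null 2>&1; then
        echo "kubectl cannot reach the Kubernetes API from ${HOSTNAME:-this host}." >&2
        echo "Run this script from a control plane node with a valid kubeconfig." >&2
        return 1
    fi
}
```

Calling require_api || exit 1 before the first check would stop the run with one clear message instead of the repeated "connection refused" errors in the output above, which fits the fail-fast behaviour argued for with -e.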