adamant-kite-43734
11/01/2023, 8:15 AM
gifted-cricket-25537
11/01/2023, 8:50 AM
curl -H "rke2-Node-Name: <hostname>" -H "rke2-Node-Password: <password>" https://192.168.0.48:9345/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key -vvv
* Trying 192.168.0.48:9345...
* TCP_NODELAY set
* Connected to 192.168.0.48 (192.168.0.48) port 9345 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: O=rke2; CN=rke2
* start date: Oct 19 11:35:18 2023 GMT
* expire date: Oct 18 11:37:59 2024 GMT
* issuer: CN=rke2-server-ca@1697715318
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55e9935ee300)
> GET /v1-rke2/serving-kubelet.crt HTTP/2
> Host: 192.168.0.48:9345
> user-agent: curl/7.68.0
> accept: */*
> rke2-node-name: <hostname>
> rke2-node-password: <password>
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
gifted-cricket-25537
11/01/2023, 8:51 AM
/readyz endpoint is ok
gifted-cricket-25537
11/01/2023, 8:53 AM
gifted-cricket-25537
11/01/2023, 8:53 AM
gifted-cricket-25537
11/01/2023, 8:53 AM
gifted-cricket-25537
11/01/2023, 8:57 AM
gifted-cricket-25537
11/01/2023, 8:58 AM
gifted-cricket-25537
11/01/2023, 9:02 AM
gifted-cricket-25537
11/01/2023, 10:49 AM
curl -H "rke2-Node-Name: <hostname>" -H "rke2-Node-Password: <password>" https://apiserver:9345/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key -vvv
* Trying 192.168.0.48:9345...
* TCP_NODELAY set
* Connected to 192.168.0.48 (192.168.0.48) port 9345 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: O=rke2; CN=rke2
* start date: Oct 19 11:35:18 2023 GMT
* expire date: Oct 18 11:37:59 2024 GMT
* issuer: CN=rke2-server-ca@1697715318
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56372e206300)
> GET /v1-rke2/serving-kubelet.crt HTTP/2
> Host: 192.168.0.48:9345
> user-agent: curl/7.68.0
> accept: */*
> rke2-node-name: <hostname>
> rke2-node-password: <password>
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 200
< content-type: text/plain; charset=utf-8
< content-length: 1506
< date: Wed, 01 Nov 2023 10:46:24 GMT
<
-----BEGIN CERTIFICATE-----
...
gifted-cricket-25537
11/01/2023, 10:49 AM
# curl -H "rke2-Node-Name: <hostname>" -H "rke2-Node-Password: <password>" https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key -vvv
* Trying 127.0.0.1:6444...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 6444 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: O=rke2; CN=rke2
* start date: Oct 19 11:35:18 2023 GMT
* expire date: Oct 18 11:37:59 2024 GMT
* issuer: CN=rke2-server-ca@1697715318
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55da0c6b7300)
> GET /v1-rke2/serving-kubelet.crt HTTP/2
> Host: 127.0.0.1:6444
> user-agent: curl/7.68.0
> accept: */*
> rke2-node-name: <hostname>
> rke2-node-password: <password>
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
gifted-cricket-25537
11/01/2023, 10:53 AM
rke2-agent-load-balancer indeed does not reply, and the one that works was removed as it wasn't working during the startup of the rke2-agent
gifted-cricket-25537
11/01/2023, 10:55 AM
msg="Adding server to load balancer rke2-agent-load-balancer: <apiserver 1>:9345"
msg="Adding server to load balancer rke2-agent-load-balancer: <apiserver 3>:9345"
msg="Removing server from load balancer rke2-agent-load-balancer: <apiserver 1>:9345"
msg="Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [<apiserver 3>:9345] [default: <apiserver 1>:9345]"
msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
gifted-cricket-25537
11/01/2023, 11:00 AM
apiserver 3 and removes the working apiserver 1 from the agent LB
creamy-pencil-82913
11/01/2023, 4:24 PM
kubectl get endpoints kubernetes
gifted-cricket-25537
11/01/2023, 4:38 PM
gifted-cricket-25537
11/01/2023, 4:40 PM
creamy-pencil-82913
11/01/2023, 5:21 PM
creamy-pencil-82913
11/01/2023, 5:22 PM
gifted-cricket-25537
11/01/2023, 5:25 PM
gifted-cricket-25537
11/01/2023, 5:25 PM
gifted-cricket-25537
11/01/2023, 5:31 PM
gifted-cricket-25537
11/01/2023, 5:41 PM
Nov 01 17:36:37 production-island10-overcloud-etcd-2 rke2[708]: time="2023-11-01T17:36:37Z" level=info msg="Waiting for cri connection: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/k3s/containerd/containerd.sock: connect: no such file or directory\""
even after restarting the rke2-server on this etcd node, I don't see containerd coming up
gifted-cricket-25537
11/01/2023, 5:44 PM
gifted-cricket-25537
11/01/2023, 5:44 PM
gifted-cricket-25537
11/01/2023, 5:46 PM
gifted-cricket-25537
11/01/2023, 5:47 PM
gifted-cricket-25537
11/01/2023, 6:10 PM
gifted-cricket-25537
11/01/2023, 6:20 PM
sh-4.4# etcdctl --cacert="/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt" --cert="/var/lib/rancher/rke2/server/tls/etcd/server-client.crt" --key="/var/lib/rancher/rke2/server/tls/etcd/server-client.key" --endpoints="https://192.168.0.16:2379,https://192.168.0.17:2379,https://192.168.0.18:2379https://192.168.0.19:2379,https://192.168.0.20:2379" endpoint status --write-out table
{"level":"warn","ts":"2023-11-01T18:20:04.314357Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"<etcd-endpoints://0xc0003c8380/192.168.0.16:2379>","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing dial tcp: address 192.168.0.18:2379https:: too many colons in address\""}
Failed to get the status of endpoint https://192.168.0.18:2379https://192.168.0.19:2379 (context deadline exceeded)
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.0.16:2379 | 90854c1b43df1af2 |   3.5.9 |   82 MB |     false |      false |         6 |   54149274 |           54149274 |        |
| https://192.168.0.17:2379 | 6336293dad683176 |   3.5.9 |   82 MB |     false |      false |         6 |   54149274 |           54149274 |        |
| https://192.168.0.20:2379 |  6df555c9f60573b |   3.5.9 |  394 MB |      true |      false |         6 |   54149345 |           54149345 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
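The `too many colons in address` error above comes from the missing comma between the `192.168.0.18` and `192.168.0.19` URLs in `--endpoints`, so the two are parsed as a single address and only three members are reported. A minimal sketch that builds the list from the member IPs so a separator cannot be dropped; the `join_endpoints` helper name is hypothetical, and port 2379 with `https` is taken from the command above:

```shell
# Hypothetical helper: turn a list of member IPs into a comma-separated
# list of https://IP:2379 URLs (plain POSIX sh, no arrays needed).
join_endpoints() {
  out=""
  for ip in "$@"; do
    # Prepend "previous," only when something is already in the list.
    out="${out:+$out,}https://${ip}:2379"
  done
  printf '%s\n' "$out"
}

# The five etcd members from the command above.
ENDPOINTS="$(join_endpoints 192.168.0.16 192.168.0.17 192.168.0.18 192.168.0.19 192.168.0.20)"
echo "$ENDPOINTS"
```

The resulting value can then be passed as `--endpoints="$ENDPOINTS"` to the same `etcdctl ... endpoint status --write-out table` call.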
gifted-cricket-25537
11/01/2023, 6:21 PM
kubectl get endpoints kubernetes
NAME         ENDPOINTS                                                AGE
kubernetes   192.168.0.48:6443,192.168.0.49:6443,192.168.0.50:6443   13d
creamy-pencil-82913
11/01/2023, 6:22 PM
gifted-cricket-25537
11/01/2023, 6:27 PM
gifted-cricket-25537
11/01/2023, 6:31 PM
Nov 01 18:30:23 production-island10-overcloud-controlplane-3 rke2[5989]: time="2023-11-01T18:30:23Z" level=info msg="certificate CN=production-island10-overcloud-worker-95 signed by CN=rke2-server-ca@1697715318: notBefore=2023-10-19 11:35:18 +0000 UTC notAfter=2024-10-31 18:30:23 +0000 UTC"
Nov 01 18:30:24 production-island10-overcloud-controlplane-3 rke2[5989]: time="2023-11-01T18:30:24Z" level=info msg="certificate CN=production-island10-overcloud-worker-190 signed by CN=rke2-server-ca@1697715318: notBefore=2023-10-19 11:35:18 +0000 UTC notAfter=2024-10-31 18:30:24 +0000 UTC"
Nov 01 18:30:24 production-island10-overcloud-controlplane-3 rke2[5989]: time="2023-11-01T18:30:24Z" level=info msg="certificate CN=production-island10-overcloud-worker-76 signed by CN=rke2-server-ca@1697715318: notBefore=2023-10-19 11:35:18 +0000 UTC notAfter=2024-10-31 18:30:24 +0000 UTC"
this goes constantly for all 281 NotReady nodes I have now
gifted-cricket-25537
11/01/2023, 6:34 PM
NotReady node:
Nov 01 17:30:14 production-island10-overcloud-worker-99 rke2[927]: time="2023-11-01T17:30:14Z" level=info msg="Adding server to load balancer rke2-agent-load-balancer: 192.168.0.48:9345"
Nov 01 17:30:14 production-island10-overcloud-worker-99 rke2[927]: time="2023-11-01T17:30:14Z" level=info msg="Adding server to load balancer rke2-agent-load-balancer: 192.168.0.50:9345"
Nov 01 17:30:14 production-island10-overcloud-worker-99 rke2[927]: time="2023-11-01T17:30:14Z" level=info msg="Removing server from load balancer rke2-agent-load-balancer: 192.168.0.48:9345"
Nov 01 17:30:14 production-island10-overcloud-worker-99 rke2[927]: time="2023-11-01T17:30:14Z" level=info msg="Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [192.168.0.50:9345] [default: 192.168.0.48:9345]"
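The `Running load balancer` line is the one that shows which servers the agent is actually proxying to and which one is only the default. A rough sketch for pulling that state out of a log stream; the `lb_state` function name and its output format are made up for illustration, and the pattern only assumes the message layout seen in the lines above:

```shell
# Hypothetical filter: read log lines on stdin and print, for each
# "Running load balancer" message, the balancer name, listen address,
# active server list and default server.
lb_state() {
  sed -n 's/.*Running load balancer \([^ ]*\) \([^ ]*\) -> \[\([^]]*\)] \[default: \([^]]*\)].*/\1 listen=\2 active=\3 default=\4/p'
}
```

On a node this could be fed from the journal, for example `journalctl -u rke2-agent | lb_state`, assuming the agent runs as the `rke2-agent` unit.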
gifted-cricket-25537
11/01/2023, 6:35 PM
192.168.0.50 is production-island10-overcloud-controlplane-3
gifted-cricket-25537
11/01/2023, 6:35 PM
gifted-cricket-25537
11/01/2023, 6:36 PM
production-island10-overcloud-worker-99
gifted-cricket-25537
11/01/2023, 6:38 PM
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=info msg="Starting rke2 agent v1.25.14+rke2r1 (36d7417e024e8dad34ebbf94b210ab3dd0f52cd7)"
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=info msg="Adding server to load balancer rke2-agent-load-balancer: 192.168.0.48:9345"
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=info msg="Adding server to load balancer rke2-agent-load-balancer: 192.168.0.50:9345"
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=info msg="Removing server from load balancer rke2-agent-load-balancer: 192.168.0.48:9345"
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=info msg="Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [192.168.0.50:9345] [default: 192.168.0.48:9345]"
Nov 01 18:37:59 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:37:59Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Nov 01 18:38:00 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:38:00Z" level=info msg="Adding server to load balancer rke2-api-server-agent-load-balancer: 192.168.0.48:6443"
Nov 01 18:38:00 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:38:00Z" level=info msg="Adding server to load balancer rke2-api-server-agent-load-balancer: 192.168.0.50:6443"
Nov 01 18:38:00 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:38:00Z" level=info msg="Removing server from load balancer rke2-api-server-agent-load-balancer: 192.168.0.48:6443"
Nov 01 18:38:00 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:38:00Z" level=info msg="Running load balancer rke2-api-server-agent-load-balancer 127.0.0.1:6443 -> [192.168.0.50:6443] [default: 192.168.0.48:6443]"
Nov 01 18:38:11 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:38:11Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: Get \"https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
gifted-cricket-25537
11/01/2023, 6:39 PM
gifted-cricket-25537
11/01/2023, 6:39 PM
kube-apiserver-production-island10-overcloud-controlplane-1 1/1 Running 24 (39m ago) 8d
kube-apiserver-production-island10-overcloud-controlplane-2 1/1 Running 24 (40m ago) 8d
kube-apiserver-production-island10-overcloud-controlplane-3 1/1 Running 17 (39m ago) 8d
apiservers are up for ~40m
gifted-cricket-25537
11/01/2023, 6:40 PM
rke2-server on production-island10-overcloud-controlplane-3 the other two apiservers will start to receive requests
gifted-cricket-25537
11/01/2023, 6:43 PM
192.168.0.48 is production-island10-overcloud-controlplane-1
gifted-cricket-25537
11/01/2023, 6:43 PM
production-island10-overcloud-worker-99 is still timing out nevertheless
gifted-cricket-25537
11/01/2023, 6:43 PM
Nov 01 18:43:33 production-island10-overcloud-worker-99 rke2[1163]: time="2023-11-01T18:43:33Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: Get \"https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
gifted-cricket-25537
11/01/2023, 6:48 PM
root@production-island10-overcloud-worker-99:~# curl -H "rke2-Node-Name: production-island10-overcloud-worker-99" -H "rke2-Node-Password: <password>" https://127.0.0.1:6444/v1-rke2/readyz -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key
ok
gifted-cricket-25537
11/01/2023, 6:48 PM
root@production-island10-overcloud-worker-99:~# cat /var/lib/rancher/rke2/agent/etc/rke2-agent-load-balancer.json
{
  "ServerURL": "https://192.168.0.48:9345",
  "ServerAddresses": [
    "192.168.0.50:9345"
  ],
  "Listener": null
}
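The cache file holds the default `ServerURL` plus the `ServerAddresses` the agent last knew about, so listing every address in it is a quick way to see which control-plane nodes are missing. A minimal sketch, assuming IPv4 `host:port` entries as in the file above; the `lb_cache_addrs` name is hypothetical and `grep -o` is assumed to be available (GNU and BSD grep both have it):

```shell
# Hypothetical helper: print each distinct IPv4 host:port found in a
# load balancer cache file, one per line, sorted.
lb_cache_addrs() {
  grep -oE '[0-9]+(\.[0-9]+){3}:[0-9]+' "$1" | sort -u
}
```

For the file shown above, `lb_cache_addrs /var/lib/rancher/rke2/agent/etc/rke2-agent-load-balancer.json` would list only `192.168.0.48:9345` and `192.168.0.50:9345`, with no entry for `192.168.0.49`.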
creamy-pencil-82913
11/01/2023, 6:48 PM
192.168.0.50 the only one that it has in the json cache file? 192.168.0.48 is the default (from the --server uri) but I am curious why it only has that one address in the list.
creamy-pencil-82913
11/01/2023, 6:49 PM
gifted-cricket-25537
11/01/2023, 6:50 PM
gifted-cricket-25537
11/01/2023, 6:50 PM
creamy-pencil-82913
11/01/2023, 6:51 PM
gifted-cricket-25537
11/01/2023, 6:51 PM
root@production-island10-overcloud-worker-99:~# curl -H "rke2-Node-Name: production-island10-overcloud-worker-99" -H "rke2-Node-Password: <password>" https://192.168.0.48:9345/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key
^C
root@production-island10-overcloud-worker-99:~# curl -H "rke2-Node-Name: production-island10-overcloud-worker-99" -H "rke2-Node-Password: <password>" https://192.168.0.49:9345/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key
-----BEGIN CERTIFICATE-----
...
root@production-island10-overcloud-worker-99:~# curl -H "rke2-Node-Name: production-island10-overcloud-worker-99" -H "rke2-Node-Password: <password>" https://192.168.0.50:9345/v1-rke2/serving-kubelet.crt -k --cert /var/lib/rancher/rke2/agent/client-kubelet.crt --key /var/lib/rancher/rke2/agent/client-kubelet.key
-----BEGIN CERTIFICATE-----
...
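The three manual requests above can be run as one loop with a timeout, so a hanging server (here `192.168.0.48`) does not have to be interrupted with `^C`. A sketch only: `probe_servers`, `NODE_NAME` and `NODE_PASSWORD` are hypothetical names, the 5-second limit is arbitrary, and the URL, headers and certificate paths are copied from the commands above:

```shell
# Hypothetical helper: request serving-kubelet.crt from each server on
# port 9345 and report the HTTP status, or "no response" when curl fails
# or times out. NODE_NAME and NODE_PASSWORD are placeholder variables.
probe_servers() {
  for host in "$@"; do
    if code=$(curl -sk --max-time 5 -o /dev/null -w '%{http_code}' \
        -H "rke2-Node-Name: ${NODE_NAME}" \
        -H "rke2-Node-Password: ${NODE_PASSWORD}" \
        --cert /var/lib/rancher/rke2/agent/client-kubelet.crt \
        --key /var/lib/rancher/rke2/agent/client-kubelet.key \
        "https://${host}:9345/v1-rke2/serving-kubelet.crt"); then
      echo "${host} HTTP ${code}"
    else
      echo "${host} no response"
    fi
  done
}
```

Usage on the worker would look like `NODE_NAME=$(hostname) NODE_PASSWORD=... probe_servers 192.168.0.48 192.168.0.49 192.168.0.50`, with the password supplied however it is stored on that node.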
gifted-cricket-25537
11/01/2023, 6:52 PM
gifted-cricket-25537
11/01/2023, 6:53 PM
creamy-pencil-82913
11/01/2023, 6:53 PM
creamy-pencil-82913
11/01/2023, 6:53 PM
creamy-pencil-82913
11/01/2023, 6:53 PM
creamy-pencil-82913
11/01/2023, 6:54 PM
gifted-cricket-25537
11/01/2023, 6:55 PM
https://192.168.0.48:9345/v1-rke2/readyz says ok
gifted-cricket-25537
11/01/2023, 6:55 PM
gifted-cricket-25537
11/01/2023, 6:56 PM
creamy-pencil-82913
11/01/2023, 6:56 PM
gifted-cricket-25537
11/01/2023, 6:56 PM
Nov 01 18:55:23 production-island10-overcloud-controlplane-1 rke2[6447]: time="2023-11-01T18:55:23Z" level=info msg="certificate CN=production-island10-overcloud-worker-99 signed by CN=rke2-server-ca@1697715318: notBefore=2023-10-19 11:35:18 +0000 UTC notAfter=2024-10-31 18:55:23 +0000 UTC"
Nov 01 18:56:14 production-island10-overcloud-controlplane-1 rke2[6447]: time="2023-11-01T18:56:14Z" level=info msg="certificate CN=production-island10-overcloud-worker-99 signed by CN=rke2-server-ca@1697715318: notBefore=2023-10-19 11:35:18 +0000 UTC notAfter=2024-10-31 18:56:14 +0000 UTC"
gifted-cricket-25537
11/01/2023, 6:58 PM
creamy-pencil-82913
11/01/2023, 6:59 PM
gifted-cricket-25537
11/01/2023, 6:59 PM
creamy-pencil-82913
11/01/2023, 7:00 PM
gifted-cricket-25537
11/01/2023, 7:00 PM
gifted-cricket-25537
11/01/2023, 7:01 PM
creamy-pencil-82913
11/01/2023, 7:01 PM
gifted-cricket-25537
11/01/2023, 7:01 PM
root@production-island10-overcloud-controlplane-1:~# systemctl restart rke2-server
gifted-cricket-25537
11/01/2023, 7:02 PM
creamy-pencil-82913
11/01/2023, 7:04 PM
creamy-pencil-82913
11/01/2023, 7:04 PM
creamy-pencil-82913
11/01/2023, 7:05 PM
creamy-pencil-82913
11/01/2023, 7:06 PM
creamy-pencil-82913
11/01/2023, 7:07 PM
creamy-pencil-82913
11/01/2023, 7:08 PM
gifted-cricket-25537
11/01/2023, 7:08 PM
/var/lib/rancher/rke2/agent/etc/rke2-agent-load-balancer.json and update once the connection comes back?
creamy-pencil-82913
11/01/2023, 7:10 PM
gifted-cricket-25537
11/01/2023, 7:10 PM
ServerURL and whatever is in ServerAddresses if there's a non nil err
creamy-pencil-82913
11/01/2023, 7:11 PM
gifted-cricket-25537
11/01/2023, 7:12 PM
gifted-cricket-25537
11/01/2023, 7:12 PM
creamy-pencil-82913
11/01/2023, 7:13 PM
creamy-pencil-82913
11/01/2023, 7:13 PM
gifted-cricket-25537
11/01/2023, 7:14 PM
gifted-cricket-25537
11/01/2023, 7:14 PM
gifted-cricket-25537
11/01/2023, 7:15 PM
# /etc/systemd/system/rke2-agent.service.d/10-override.conf
[Service]
ExecStartPre=/bin/bash -xc '/bin/hostname | grep -q worker && /bin/sleep $((RANDOM % 60)) || /bin/true'
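The drop-in above delays the agent start by a random 0 to 59 seconds on any host whose name contains `worker`, and does nothing elsewhere, which spreads the reconnects of several hundred nodes over a minute. The same decision can be sketched as a standalone function for trying it outside systemd; the `start_delay` name is hypothetical, and `RANDOM` is a bash feature (the sketch falls back to 0 in shells without it):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the start delay in seconds for a hostname.
# Hosts with "worker" in the name get a random value in 0..59,
# every other host gets 0. Mirrors the logic of the ExecStartPre line.
start_delay() {
  case "$1" in
    *worker*) echo $(( ${RANDOM:-0} % 60 )) ;;
    *)        echo 0 ;;
  esac
}
```

For example, `sleep "$(start_delay "$(hostname)")"` would wait only on worker nodes.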
gifted-cricket-25537
11/01/2023, 7:15 PM
creamy-pencil-82913
11/01/2023, 7:17 PM
creamy-pencil-82913
11/01/2023, 7:18 PM
gifted-cricket-25537
11/01/2023, 7:23 PM
gifted-cricket-25537
11/01/2023, 7:23 PM
creamy-pencil-82913
11/01/2023, 7:27 PM
gifted-cricket-25537
11/01/2023, 7:27 PM
creamy-pencil-82913
11/01/2023, 7:27 PM
creamy-pencil-82913
11/01/2023, 7:27 PM
kill -ABRT and then grab the dump from the logs
creamy-pencil-82913
11/01/2023, 7:28 PM
creamy-pencil-82913
11/01/2023, 7:29 PM
creamy-pencil-82913
11/01/2023, 7:31 PM
gifted-cricket-25537
11/01/2023, 7:32 PM
creamy-pencil-82913
11/01/2023, 7:33 PM
creamy-pencil-82913
11/01/2023, 7:34 PM
gifted-cricket-25537
11/01/2023, 7:34 PM
${rancher2_cluster_v2.default.cluster_registration_token.0.node_command}
gifted-cricket-25537
11/01/2023, 7:34 PM
creamy-pencil-82913
11/01/2023, 7:34 PM
creamy-pencil-82913
11/01/2023, 7:34 PM
gifted-cricket-25537
11/01/2023, 7:34 PM
creamy-pencil-82913
11/01/2023, 7:35 PM
creamy-pencil-82913
11/01/2023, 7:35 PM
gifted-cricket-25537
11/01/2023, 7:35 PM
gifted-cricket-25537
11/01/2023, 7:37 PM
ServerURL is not doing anything, but still tried the swarmed server and didn't update the endpoints:
Nov 01 19:34:27 production-island10-overcloud-worker-96 rke2[1186]: time="2023-11-01T19:34:27Z" level=info msg="Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [192.168.0.50:9345] [default: 192.168.0.48:9345]"
creamy-pencil-82913
11/01/2023, 7:44 PM
gifted-cricket-25537
11/01/2023, 7:45 PM
gifted-cricket-25537
11/01/2023, 7:45 PM
ServerAddresses and never tries ServerURL instead
creamy-pencil-82913
11/01/2023, 7:46 PM
creamy-pencil-82913
11/01/2023, 7:46 PM
gifted-cricket-25537
11/01/2023, 7:47 PM
creamy-pencil-82913
11/01/2023, 7:47 PM
gifted-cricket-25537
11/01/2023, 7:48 PM
# cat /var/lib/rancher/rke2/agent/etc/rke2-agent-load-balancer.json
{
  "ServerURL": "https://192.168.0.48:9345",
  "ServerAddresses": [
    "192.168.0.50:9345"
  ],
  "Listener": null
}
gifted-cricket-25537
11/01/2023, 7:49 PM
# cat /var/lib/rancher/rke2/agent/etc/rke2-api-server-agent-load-balancer.json
{
  "ServerURL": "https://192.168.0.48:6443",
  "ServerAddresses": [
    "192.168.0.50:6443"
  ],
  "Listener": null
}
gifted-cricket-25537
11/01/2023, 7:49 PM
gifted-cricket-25537
11/01/2023, 7:51 PM
creamy-pencil-82913
11/01/2023, 7:52 PM
rke2-killall.sh is probably the easiest way to do that.
gifted-cricket-25537
11/01/2023, 7:56 PM
creamy-pencil-82913
11/01/2023, 7:58 PM
gifted-cricket-25537
11/01/2023, 7:58 PM
gifted-cricket-25537
11/01/2023, 8:03 PM