big-dawn-71012
07/15/2022, 5:02 PM
rke up fails at this point:
WARN[0304] [etcd] host [10.10.1.86] failed to check etcd health: failed to get /health for host [10.10.154.86]: Get "https://10.10.1.86:2379/health": net/http: TLS handshake timeout
FATA[0304] [etcd] Failed to bring up Etcd Plane: etcd cluster is unhealthy: hosts [10.10.1.86] failed to report healthy. Check etcd container logs on each host for more information
Checking the etcd logs on 10.10.1.86, I see lots of "connection refused" failures from rafthttp/probing, as follows:
{"level":"warn","ts":"2022-07-15T17:00:22.813Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"fa5a1d6f91d877ef","rtt":"0s","error":"dial tcp 10.10.1.88:2380: connect: connection refused"}
{"level":"warn","ts":"2022-07-15T17:00:22.813Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"b8ffb4898a3dabab","rtt":"0s","error":"dial tcp 10.10.1.87:2380: connect: connection refused"}
where 10.10.1.87 and 10.10.1.88 are the two worker nodes (non-etcd, non-controlplane). Why in the world is it trying to probe those nodes?
tall-school-18125
07/18/2022, 5:44 AM
big-dawn-71012
07/18/2022, 11:25 AM
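For anyone hitting the same thing, here is a minimal Python sketch for tallying which peers the etcd prober is complaining about. It assumes the etcd container logs are available as one JSON object per line, in the same shape as the warnings quoted above; the function name `summarize_probe_failures` is hypothetical and not part of rke or etcd.

```python
import json
from collections import Counter


def summarize_probe_failures(lines):
    """Count 'prober detected unhealthy status' warnings per (peer id, address).

    `lines` is any iterable of log lines (assumed: one JSON object per line).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        if entry.get("msg") != "prober detected unhealthy status":
            continue
        error = entry.get("error", "")
        # The error looks like:
        #   "dial tcp 10.10.1.88:2380: connect: connection refused"
        address = "unknown"
        if error.startswith("dial tcp "):
            address = error[len("dial tcp "):].split(": ", 1)[0]
        peer_id = entry.get("remote-peer-id", "unknown")
        counts[(peer_id, address)] += 1
    return counts
```

Feeding it the saved log file (for example `summarize_probe_failures(open("etcd.log"))`, where the file name is an assumption) gives a count per peer ID and address. Those peer IDs can then be compared against the output of `etcdctl member list` on the etcd host to see whether etcd still has the worker IPs registered as peers.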