# general
s
Here are the provisioning logs for one of the rke2 clusters. If everything else succeeds, what would stop the cluster agent from connecting?
[INFO ] waiting for infrastructure ready
[INFO ] waiting for at least one control plane, etcd, and worker node to be registered
[INFO ] waiting for viable init node
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for agent to check in and apply initial plan
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for probes: calico, etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for probes: calico, etcd, kube-apiserver, kube-controller-manager, kube-scheduler
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for probes: calico, kube-apiserver, kube-controller-manager, kube-scheduler
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for probes: calico, kube-controller-manager, kube-scheduler
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for probes: calico
[INFO ] configuring bootstrap node(s) test-rke-pool1-7f995b8df9-xvm7d: waiting for cluster agent to connect
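One way to narrow that down is to look at the cattle-cluster-agent pod from the downstream cluster itself, since Rancher's own kubeconfig won't work until the agent connects. A minimal sketch, assuming RKE2's default kubeconfig and binary paths and the usual app=cattle-cluster-agent label:
# Run on an rke2 server node; use the local admin kubeconfig written by rke2
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
export PATH=$PATH:/var/lib/rancher/rke2/bin

# Is the cluster agent scheduled and running?
kubectl -n cattle-system get pods

# If not, its events and logs usually say why
kubectl -n cattle-system describe pod -l app=cattle-cluster-agent
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=50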
I seem to be having issues with the tunnel and/or websocket connections between downstream nodes and the Rancher Server.
Running journalctl -u rke2-server on one of the nodes shows the following logs. Seems the tunnel between the node and Rancher isn’t holding?
level=info msg="Stopped tunnel to 127.0.0.1:9345"
level=info msg="Proxy done" err="context canceled" url="<wss://127.0.0.1:9345/v1-rke2/connect>"
level=info msg="Connecting to proxy" url="<wss://10.3.3.126:9345/v1-rke2/connect>"
level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
I see a similar error in the Rancher Server docker container logs.
[INFO] error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF
along with lots of errors that end with
"unable to decode an event from the watch stream: tunnel disconnect"
c
those messages on the rke2 side are normal.
it has nothing to do with a tunnel to rancher, those are about internal RKE2 tunnels between cluster nodes. The rancher stuff all runs in pods, you won’t find anything about rancher in the rke2 service logs.
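If you want to sanity-check those internal tunnels anyway, a simple reachability test of the rke2 supervisor port (9345) between nodes is usually enough; a quick sketch using the server IP from the log above:
# From another node in the downstream cluster: is the supervisor port reachable?
nc -zv 10.3.3.126 9345

# Watch the server side for repeated tunnel reconnects
journalctl -u rke2-server -f | grep -i remotedialer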
s
Gotcha. Thank you, I’ll keep looking.
I’m seeing the cattle-cluster-agent pod is in a Pending state on the node.
Warning  FailedScheduling  18m (x10 over 68m)  default-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
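That node.cloudprovider.kubernetes.io/uninitialized taint is placed on nodes that start with an external cloud provider and is only removed once a cloud controller manager initializes the node, which is why nothing can schedule. A short sketch to confirm (node name taken from the provisioning log above, label selector is an assumption):
# The taint should still be listed here until the cloud provider initializes the node
kubectl describe node test-rke-pool1-7f995b8df9-xvm7d | grep -A3 Taints

# The cluster agent does not tolerate that taint, hence Pending
kubectl -n cattle-system describe pod -l app=cattle-cluster-agent | grep -A5 Tolerations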
Could this be an issue with Harvester? It is my cloud provider in this case.
c
well it’s certainly an issue with the cloud provider. Have you looked at the cloud provider pod logs to see why it’s not working?
the pod on the downstream cluster, not on the harvester side
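For the Harvester cloud provider that usually means something like the following on the downstream cluster (the namespace and deployment name here are assumptions; adjust to whatever is actually deployed):
# Find the cloud provider pod running in the downstream cluster
kubectl -n kube-system get pods | grep -i cloud

# Its logs should show why node initialization fails, e.g. errors reaching the Harvester API
kubectl -n kube-system logs deploy/harvester-cloud-provider --tail=100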
s
Ahhh, ok. The issue is clear. I didn’t realize the downstream nodes needed to be able to reach the Harvester network. I have the Harvester cluster and downstream nodes deployed in separate subnets with no routes between, so I see I need to rethink how I have things set up. Thank you for pointing me in the right direction!
c
yes, the cloud controller manager needs to actually be able to talk to the cloud provider
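A quick way to confirm that from a downstream node is to check whether the endpoint in the cloud provider's cloud-config/kubeconfig is reachable at all; a sketch with a placeholder address:
# <harvester-vip> is a placeholder for the API endpoint in the cloud provider's
# kubeconfig (typically the Harvester VIP); 6443 is the usual Kubernetes API port
nc -zv <harvester-vip> 6443

# Any HTTP(S) response at all means routing between the subnets works
curl -kv https://<harvester-vip>:6443/version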