08/24/2022, 11:34 AM
I figured out that the real issue from my previous post is that not all of my worker nodes are added to the load balancer backend pool, which prevents them from having connectivity and reaching the Rancher server. Once I manually added the stuck worker nodes to the load balancer backend pool, all of them got registered. I don't use labels yet, so it cannot be a label issue. I am wondering which component is responsible for adding nodes to the backend pool. Is it Rancher itself or Kubernetes?


08/24/2022, 12:26 PM
I think the issue may be that the Azure cloud provider isn't set up properly - I have commented on the GitHub issue.


08/24/2022, 12:28 PM
Yes, I saw it and have answered in the issue.
Is there no way to get more help on this? Only 1 node out of 3 (from 3 different worker node pools) gets registered and added to the load balancer automatically after cluster creation. When I manually added 1 of the stuck nodes to the load balancer, it got registered right away. After that, the last stuck node also got registered in the kube-controller-manager, but it is still not added to the load balancer, so it has no connectivity to the Rancher server (because it is not in the load balancer backend pool) and cannot reach a Ready state. This seems like a bug to me. I am not doing anything particularly complex. It only fails when using more than 1 worker node pool.
Also, even if I use a single worker node pool (with 1 node) and my cluster registers both the control plane and the worker, the node scaling feature advertised in the docs for RKE does not work. The new node stays in "IP resolved" forever, and at some point it deletes the node and starts creating it again, but it never reaches the "Registering" state. It just creates the VM in Azure. All of this happens only when using an external load balancer.