This message was deleted Rancher Users #rke2

# rke2

adamant-kite-43734

07/28/2023, 2:52 AM

This message was deleted.

creamy-pencil-82913

07/28/2023, 3:02 AM

Check the logs on the nodes. Specifically the cloud provider pod logs. Vsphere config is probably missing something.

swift-sunset-4572

07/28/2023, 3:09 AM

Okay, understood. Let me have a look and see if I can find anything; I'll share it here.

swift-sunset-4572

07/28/2023, 3:33 AM

Everything seems to be in a pending state

swift-sunset-4572

07/28/2023, 3:36 AM

when i try to describe it i get this

swift-sunset-4572

07/28/2023, 3:36 AM

Do I have to have worker nodes separately?

swift-sunset-4572

07/28/2023, 3:42 AM

Exploring further, when I check the logs with

kubectl logs rancher-vsphere-cpi-cloud-controller-manager-89wzf -n kube-system

it shows this:

E0728 03:32:36.545852       1 nodemanager.go:185] shakeOutNodeIDLookup failed. Err=nodeID is empty
I0728 03:32:36.545922       1 search.go:76] WhichVCandDCByNodeID nodeID: externalc1-pool1-08946451-8nzdb
E0728 03:34:08.754796       1 connection.go:177] Failed to create new client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:34:08.754901       1 connection.go:63] Failed to create govmomi client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:34:08.754941       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826776       1 connection.go:177] Failed to create new client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826934       1 connection.go:63] Failed to create govmomi client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826976       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.898832       1 connection.go:177] Failed to create new client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.899014       1 connection.go:63] Failed to create govmomi client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.899058       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899770       1 search.go:118] WhichVCandDCByNodeID error vc:Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899916       1 nodemanager.go:185] shakeOutNodeIDLookup failed. Err=Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899989       1 node_controller.go:229] error syncing 'externalc1-pool1-08946451-8nzdb': failed to get provider ID for node externalc1-pool1-08946451-8nzdb at cloudprovider: failed to get instance ID from cloud provider: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out, requeuing
I0728 03:38:31.900141       1 node_controller.go:415] Initializing node externalc1-pool1-08946451-8nzdb with cloud provider
I0728 03:38:31.900287       1 search.go:76] WhichVCandDCByNodeID nodeID: externalc1-pool1-08946451-8nzdb
E0728 03:40:41.970929       1 connection.go:177] Failed to create new client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:40:41.970966       1 connection.go:63] Failed to create govmomi client. err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:40:41.970973       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "https://vc.prgx.com:443/sdk": dial tcp 10.0.77.254:443: conne

swift-sunset-4572

07/28/2023, 3:45 AM

I think this might be the reason it's not able to remove the taint from the node, which is what is causing the pods to go into a pending state.

creamy-pencil-82913

07/28/2023, 4:34 AM

Yes that would do it. Why can't it connect to your vcenter server?

swift-sunset-4572

07/28/2023, 4:37 AM

Well, that's because the network admins have not opened this URL for the IPs used by the downstream clusters, so I am not even able to curl this URL from the downstream nodes. So it is a prerequisite for the CPI and CSI that this URL is accessible from the whole IP pool used to create the downstream nodes, if they are to use vSphere as storage. Right?

creamy-pencil-82913

07/28/2023, 4:54 AM

Yes, the cloud and storage controllers need to talk to vSphere. All of the clusters that you want to integrate with vSphere, both the Rancher cluster and the clusters it provisions, need to be able to talk to vSphere.

💯 1
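As a rough way to confirm that prerequisite from a node, here is a minimal sketch of a plain TCP reachability check. It is not part of the CPI or CSI; `can_reach` is a hypothetical helper name, and the only thing it tests is whether a TCP connection to the given host and port can be opened, which is the step that times out in the log above.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Run on a downstream node, `can_reach("vc.prgx.com", 443)` would be expected to return False while the firewall still blocks the vCenter endpoint. A True result only shows that the port is open, not that the credentials or the vSphere config are correct.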

swift-sunset-4572

07/28/2023, 5:57 AM

Point noted. Will test and update once I get the desired access. 😇

swift-sunset-4572

07/29/2023, 2:53 AM

Hi @creamy-pencil-82913, I was not given access to the vCenter URL. My network team said it's not possible to provide access from all nodes (master/worker) of all the clusters. Is there a way I can deploy the CPI and CSI only on master nodes?

creamy-pencil-82913

07/29/2023, 3:24 AM

You can configure node selectors in the chart values

swift-sunset-4572

07/29/2023, 3:26 AM

How many nodes do you suggest it should be deployed on? If I deploy it on, let's say, 3 nodes, would that suffice for the whole cluster? Also, when I am provisioning a cluster and choose vSphere as the cloud provider, I cannot set a node selector for the CPI and CSI there.

creamy-pencil-82913

07/29/2023, 5:40 PM

You would need to deploy HelmChartConfig resources to customize the cpi and csi chart values
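A minimal sketch of what such a resource could look like, built in Python and printed as JSON (which `kubectl apply -f -` accepts). The assumptions here: the chart is named `rancher-vsphere-cpi` in `kube-system` (inferred from the pod name in the log above), the chart exposes a `cloudControllerManager.nodeSelector` value, and the control-plane nodes carry the label `node-role.kubernetes.io/control-plane: "true"`. Check all three against the chart's values file and your nodes before using it. `helm_chart_config` is a hypothetical helper name.

```python
import json


def helm_chart_config(chart: str, values: dict, namespace: str = "kube-system") -> dict:
    """Build a HelmChartConfig manifest that overrides values for a packaged chart."""
    return {
        "apiVersion": "helm.cattle.io/v1",
        "kind": "HelmChartConfig",
        # The resource name must match the name of the chart being customized.
        "metadata": {"name": chart, "namespace": namespace},
        # valuesContent is a string; JSON is used here since it is also valid YAML.
        "spec": {"valuesContent": json.dumps(values, indent=2)},
    }


# Assumed label for control-plane nodes; verify with `kubectl get nodes --show-labels`.
control_plane_only = {"node-role.kubernetes.io/control-plane": "true"}

# Assumed values key for the CPI chart; confirm it in the chart's values.yaml.
cpi_config = helm_chart_config(
    "rancher-vsphere-cpi",
    {"cloudControllerManager": {"nodeSelector": control_plane_only}},
)

print(json.dumps(cpi_config, indent=2))
```

The CSI chart would need its own HelmChartConfig with whatever selector keys its values file defines; those are not shown here because they are not confirmed in this thread.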

swift-sunset-4572

07/29/2023, 6:07 PM

Something similar to the steps in this link will work, right? https://github.com/rancher/rke2/issues/663

swift-sunset-4572

08/01/2023, 3:50 PM

@rich-garage-1614
