# rke2
c
Check the logs on the nodes, specifically the cloud provider pod logs. The vSphere config is probably missing something.
s
Okay, understood. Let me have a look and see if I can find anything; I will share it here.
Everything seems to be in a Pending state.
When I try to describe it, I get this:
Do I have to have separate worker nodes?
Exploring further, when I check the logs with
kubectl logs rancher-vsphere-cpi-cloud-controller-manager-89wzf -n kube-system
it shows this:
E0728 03:32:36.545852       1 nodemanager.go:185] shakeOutNodeIDLookup failed. Err=nodeID is empty
I0728 03:32:36.545922       1 search.go:76] WhichVCandDCByNodeID nodeID: externalc1-pool1-08946451-8nzdb
E0728 03:34:08.754796       1 connection.go:177] Failed to create new client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:34:08.754901       1 connection.go:63] Failed to create govmomi client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:34:08.754941       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826776       1 connection.go:177] Failed to create new client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826934       1 connection.go:63] Failed to create govmomi client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:36:19.826976       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.898832       1 connection.go:177] Failed to create new client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.899014       1 connection.go:63] Failed to create govmomi client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:30.899058       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899770       1 search.go:118] WhichVCandDCByNodeID error vc:Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899916       1 nodemanager.go:185] shakeOutNodeIDLookup failed. Err=Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:38:31.899989       1 node_controller.go:229] error syncing 'externalc1-pool1-08946451-8nzdb': failed to get provider ID for node externalc1-pool1-08946451-8nzdb at cloudprovider: failed to get instance ID from cloud provider: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out, requeuing
I0728 03:38:31.900141       1 node_controller.go:415] Initializing node externalc1-pool1-08946451-8nzdb with cloud provider
I0728 03:38:31.900287       1 search.go:76] WhichVCandDCByNodeID nodeID: externalc1-pool1-08946451-8nzdb
E0728 03:40:41.970929       1 connection.go:177] Failed to create new client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:40:41.970966       1 connection.go:63] Failed to create govmomi client. err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: connect: connection timed out
E0728 03:40:41.970973       1 connectionmanager.go:147] Cannot connect to vCenter with err: Post "<https://vc.prgx.com:443/sdk>": dial tcp 10.0.77.254:443: conne
I think this might be why it is not able to remove the taint that is causing the pods to go into the Pending state.
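One way to confirm that suspicion is to list the nodes that still carry the cloud-provider initialization taint, which the cloud controller manager normally removes once it has looked the node up in vCenter. The sketch below is only an illustration: the function name `uninitialized_nodes` is hypothetical, and it assumes the input is the JSON printed by `kubectl get nodes -o json`.

```python
import json
from typing import List

# Taint that the kubelet sets when it starts with an external cloud provider;
# the cloud controller manager removes it after initializing the node.
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"


def uninitialized_nodes(node_list_json: str) -> List[str]:
    """Return the names of nodes that still carry the uninitialized taint.

    node_list_json is assumed to be the output of `kubectl get nodes -o json`.
    """
    doc = json.loads(node_list_json)
    names = []
    for node in doc.get("items", []):
        # spec.taints may be missing or null on nodes without taints.
        taints = node.get("spec", {}).get("taints") or []
        if any(t.get("key") == UNINITIALIZED_TAINT for t in taints):
            names.append(node["metadata"]["name"])
    return names
```

If the node from the log (`externalc1-pool1-08946451-8nzdb`) shows up in the returned list, the taint is still in place and ordinary pods without a matching toleration will stay Pending on it.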
c
Yes, that would do it. Why can't it connect to your vCenter server?
s
Well, that's because the network admins have not opened this URL for the IPs used by the downstream clusters, so I am not even able to curl this URL from the downstream nodes. So it is a prerequisite for the CPI and CSI that this URL is reachable from the whole IP pool used to create the downstream nodes that use vSphere for storage, right?
c
Yes, the cloud and storage controllers need to talk to vSphere. All of the clusters that you want to integrate with vSphere, both the Rancher cluster and the clusters it provisions, need to be able to talk to vSphere.
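A quick way to verify that requirement from a given node is a plain TCP connection test against the vCenter endpoint seen in the log above. This is a minimal sketch, assuming Python is available on the node; the helper name `can_reach` is hypothetical, and it only checks that the port accepts connections, not that the vCenter credentials or TLS setup are valid.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

For example, `can_reach("vc.prgx.com", 443)` run on a downstream node would be expected to return False while the firewall rule is missing, matching the "connection timed out" errors in the log.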
s
Point noted. I will test and update once I get the required access. 😇
Hi @creamy-pencil-82913, I was not given access to the vCenter URL. My network team said it is not possible to provide access from all nodes (master/worker) of all the clusters. Is there a way I can deploy the CPI and CSI only on master nodes?
c
You can configure node selectors in the chart values
s
How many nodes do you suggest it should be deployed on? If I deploy it on, let's say, 3 nodes, would that suffice for the whole cluster? Also, if I am provisioning a cluster and choosing vSphere as the cloud provider, I cannot use a node selector for the CPI and CSI there.
c
You would need to deploy HelmChartConfig resources to customize the cpi and csi chart values
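As a rough illustration of that suggestion, the sketch below builds HelmChartConfig manifests that set a node selector for each chart. Several things here are assumptions rather than confirmed facts from this thread: the values keys `cloudControllerManager` and `csiController`, the label `node-role.kubernetes.io/control-plane: "true"`, and the helper names `helm_chart_config` and `render_list`. Check the values of the chart versions you actually run before using anything like this.

```python
import json
from typing import Dict, List


def helm_chart_config(chart: str, values_key: str,
                      node_selector: Dict[str, str]) -> dict:
    """Build a HelmChartConfig that sets <values_key>.nodeSelector for a chart.

    values_key is an ASSUMED top-level key in the chart's values.
    """
    values = {values_key: {"nodeSelector": node_selector}}
    return {
        "apiVersion": "helm.cattle.io/v1",
        "kind": "HelmChartConfig",
        # The name must match the packaged chart being customized.
        "metadata": {"name": chart, "namespace": "kube-system"},
        # valuesContent is a YAML string; JSON is valid YAML, so dumps works.
        "spec": {"valuesContent": json.dumps(values, indent=2)},
    }


def render_list(manifests: List[dict]) -> str:
    """Wrap the manifests in a v1 List so they can be applied as one file."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2
    )


# ASSUMED label for control-plane (master) nodes.
SELECTOR = {"node-role.kubernetes.io/control-plane": "true"}

MANIFESTS = [
    helm_chart_config("rancher-vsphere-cpi", "cloudControllerManager", SELECTOR),
    helm_chart_config("rancher-vsphere-csi", "csiController", SELECTOR),
]
```

Printing `render_list(MANIFESTS)` gives a JSON document that `kubectl apply -f` should accept on the downstream cluster. Only the nodes matched by the selector would then need network access to vCenter for these controller components.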
s
Steps similar to the process in this link will work, right? https://github.com/rancher/rke2/issues/663
@rich-garage-1614