# general
g
Versions: rancher 2.6.8, vsphere 7.0.3
s
Is the VM multi-homed, and perhaps using the wrong interface to connect to Rancher? Do you see any relevant logs from the Rancher pods?
g
I have only one network interface on the virtual machine. The Rancher instance is reachable (https) from the VM. I can't find any useful logs on the Rancher side, just:
2022/09/13 08:49:48 [INFO] [planner] rkecluster fleet-default/rke2: waiting: waiting for viable init node
2022/09/13 08:50:51 [INFO] [planner] rkecluster fleet-default/rke2: waiting: configuring bootstrap node(s) rke2-pool1-5fb5f65fbf-mtwbm: waiting for bootstrap etcd to be available
2022/09/13 08:50:51 [ERROR] [planner] rkecluster fleet-default/rke2: error encountered during plan processing was Operation cannot be fulfilled on machines.cluster.x-k8s.io "rke2-pool1-5fb5f65fbf-mtwbm": the object has been modified; please apply your changes to the latest version and try again
2022/09/13 08:50:51 [INFO] [planner] rkecluster fleet-default/rke2: waiting: configuring bootstrap node(s) rke2-pool1-5fb5f65fbf-mtwbm: waiting for agent to check in and apply initial plan
And from the node side:
error 401 received while downloading Rancher connection information. Sleeping for 5 seconds and trying again
The command used at this step is:
curl --connect-timeout 60 --max-time 60 --write-out "%{http_code}\n" -sS -H "Authorization: Bearer 47DEQp{REDACTED}uFU=" -H "X-Cattle-Id: 2c47306bda0e6f8{REDACTED}d4eaf217eb" -H "X-Cattle-Role-Etcd: false" -H "X-Cattle-Role-Control-Plane: false" -H "X-Cattle-Role-Worker: false" -H "X-Cattle-Node-Name: " -H "X-Cattle-Address: " -H "X-Cattle-Internal-Address: " -H "X-Cattle-Labels: " -H "X-Cattle-Taints: " https://rancher.REDACTED/v3/connect/agent -o /var/lib/rancher/agent/rancher2_connection_info.json
If I use a k3s cluster deployment (tech preview), this step succeeds and the node is initialized. After that, I have trouble configuring a CPI/CSI. I'd like to stick with an RKE1 or RKE2 cluster, though.
a
Can you grab the logs from the rancher-system-agent service, please?
g
Do you mean on the deployed node? If so, I don't have a rancher-system-agent service configured yet.
The binary was downloaded, but I didn't reach the service installation step.
a
And you're using the vSphere node driver?
👍 1
g
I don't understand, because the step it fails on works when I select the k3s Kubernetes flavor instead of RKE2. No 401 Unauthorized…
a
Can you enable debug logging in Rancher and try again, please?
g
OK. I don't know what to obfuscate before publishing the whole result, though. Right now, the node loops on the 401 error, and the Rancher logs keep repeating:
2022/09/13 12:29:36 [DEBUG] [CAPI] Reconcile MachineSet
2022/09/13 12:29:36 [DEBUG] [CAPI] Cannot retrieve CRD with metadata only client, falling back to slower listing
2022/09/13 12:29:36 [DEBUG] [CAPI] Cannot retrieve CRD with metadata only client, falling back to slower listing
2022/09/13 12:29:36 [DEBUG] [CAPI] Unable to retrieve Node status, missing NodeRef
2022/09/13 12:29:36 [DEBUG] [CAPI] Some nodes are not ready yet, requeuing until they are ready
2022/09/13 12:29:38 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:29:38 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:29:38 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:29:40 [DEBUG] ObjectsAreEqualResults for machine-xwqg8: statusEqual: true conditionsEqual: false specEqual: true nodeNameEqual: true labelsEqual: true annotationsEqual: true requestsEqual: true limitsEqual: true rolesEqual: true
2022/09/13 12:29:40 [DEBUG] ObjectsAreEqualResults for machine-xwqg8: statusEqual: true conditionsEqual: false specEqual: true nodeNameEqual: true labelsEqual: true annotationsEqual: true requestsEqual: true limitsEqual: true rolesEqual: true
2022/09/13 12:29:40 [DEBUG] Updating machine for node [local-node]
2022/09/13 12:29:40 [DEBUG] Updated machine for node [local-node]
2022/09/13 12:29:40 [DEBUG] DesiredSet - No change(2) provisioning.cattle.io/v1, Kind=Cluster fleet-local/local for provisioning-cluster-create local
2022/09/13 12:29:40 [DEBUG] DesiredSet - No change(2) /v1, Kind=Secret fleet-local/local-kubeconfig for cluster-create fleet-local/local
2022/09/13 12:29:40 [DEBUG] DesiredSet - No change(2) management.cattle.io/v3, Kind=ClusterRoleTemplateBinding local/local-fleet-local-owner for cluster-create fleet-local/local
2022/09/13 12:29:40 [DEBUG] DesiredSet - No change(2) fleet.cattle.io/v1alpha1, Kind=Cluster fleet-local/local for fleet-cluster fleet-local/local
2022/09/13 12:29:43 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:29:43 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:29:43 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) /v1, Kind=ServiceAccount fleet-default/cl1-bootstrap-template-x52q7-machine-bootstrap for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) /v1, Kind=ServiceAccount fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) /v1, Kind=Secret fleet-default/cl1-bootstrap-template-x52q7-machine-bootstrap for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) /v1, Kind=Secret fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) rbac.authorization.k8s.io/v1, Kind=Role fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:45 [DEBUG] DesiredSet - No change(2) rbac.authorization.k8s.io/v1, Kind=RoleBinding fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:29:48 [DEBUG] [CAPI] Cannot retrieve CRD with metadata only client, falling back to slower listing
2022/09/13 12:29:48 [DEBUG] [CAPI] Infrastructure provider is not ready, requeuing
2022/09/13 12:29:48 [DEBUG] [CAPI] Cannot reconcile Machine's Node, no valid ProviderID yet
2022/09/13 12:29:48 [DEBUG] Searching for providerID for selector rke.cattle.io/machine=c3492044-4855-44ed-8d9b-c8d4c5185a8f in cluster fleet-default/cl1, machine cl1-pool1-769dbbd958-w5tb2: Get "https://10.43.10.155/k8s/clusters/c-m-stx6kqzc/api/v1/nodes?labelSelector=rke.cattle.io%!F(MISSING)machine%!D(MISSING)c3492044-4855-44ed-8d9b-c8d4c5185a8f": dial tcp 10.43.10.155:443: connect: no route to host
2022/09/13 12:29:48 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:29:48 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:29:48 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:29:51 [DEBUG] Extras returned map[principalid:[local://user-lrxjb] username:[admin]]
2022/09/13 12:29:51 [DEBUG] Triggering auth refresh on user-lrxjb
2022/09/13 12:29:51 [DEBUG] Skipping refresh for user-lrxjb due to max-age
2022/09/13 12:29:51 [DEBUG] [CAPI] Reconcile MachineSet
2022/09/13 12:29:51 [DEBUG] [CAPI] Cannot retrieve CRD with metadata only client, falling back to slower listing
2022/09/13 12:29:51 [DEBUG] [CAPI] Cannot retrieve CRD with metadata only client, falling back to slower listing
2022/09/13 12:29:51 [DEBUG] [CAPI] Unable to retrieve Node status, missing NodeRef
2022/09/13 12:29:51 [DEBUG] [CAPI] Some nodes are not ready yet, requeuing until they are ready
2022/09/13 12:29:53 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:29:53 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:29:53 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:29:59 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:29:59 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:29:59 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) /v1, Kind=ServiceAccount fleet-default/cl1-bootstrap-template-x52q7-machine-bootstrap for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) /v1, Kind=ServiceAccount fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) /v1, Kind=Secret fleet-default/cl1-bootstrap-template-x52q7-machine-bootstrap for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) /v1, Kind=Secret fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) rbac.authorization.k8s.io/v1, Kind=Role fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:03 [DEBUG] DesiredSet - No change(2) rbac.authorization.k8s.io/v1, Kind=RoleBinding fleet-default/cl1-bootstrap-template-x52q7-machine-plan for rke-machine fleet-default/cl1-bootstrap-template-x52q7
2022/09/13 12:30:04 [DEBUG] [rke2configserver] parsed 8698dcff893f96d39ab97dde871968ae56b13a687e1e2680cfd16c5184fd5c6 as machineID
2022/09/13 12:30:04 [DEBUG] [rke2configserver] Got / machine from provisioning SA
2022/09/13 12:30:04 [DEBUG] [rke2configserver] Got / machine from cluster token
2022/09/13 12:30:06 [DEBUG] [CAPI] Reconcile MachineSet
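Most of a debug dump like this is routine reconcile noise, so it can help to filter it down to the lines that hint at a failure. A minimal sketch, assuming the log has been saved as a list of lines; the helper name `interesting_lines` and the marker list are illustrative choices drawn from the messages in this thread, not anything Rancher provides:

```python
# Hypothetical helper: keep only the log lines that hint at a failure.
# The marker strings are taken from the excerpts quoted in this thread.
MARKERS = ("[ERROR]", "no route to host", "error 401")


def interesting_lines(lines):
    """Return the lines that contain any marker, in their original order."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in MARKERS)
    ]


# Small self-contained demonstration with two lines from the dump.
sample = [
    "2022/09/13 12:29:36 [DEBUG] [CAPI] Reconcile MachineSet\n",
    "2022/09/13 12:29:48 [DEBUG] Searching for providerID: dial tcp "
    "10.43.10.155:443: connect: no route to host\n",
]
for line in interesting_lines(sample):
    print(line)
```

Run against the dump above, a filter like this would keep only the "Searching for providerID" entry, which is the one line reporting a connection failure.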
I'm back on the topic. On a new Rancher instance, the RKE2 deployment initializes correctly this time. The cluster (one node) is up but still marked as "Updating" on the cluster management page. I had the same issue with a "custom" cluster. The node stays in a "Waiting for Node Ref" state.
I have this error in the Rancher logs:
2022/09/16 13:24:11 [ERROR] [CAPI] Reconciler error: error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "fleet-default/cdr4": Get "https://10.43.150.240/k8s/clusters/c-m-shq6vqt7/api?timeout=10s": dial tcp 10.43.150.240:443: connect: no route to host
I don't understand it. Does Rancher try to reach the cluster on an internal private IP address?
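One way to sanity-check that question: 10.43.0.0/16 is the default service CIDR for RKE2 and k3s, so an address such as 10.43.150.240 is likely a ClusterIP inside the cluster that hosts Rancher rather than an address on the node network. A minimal sketch, assuming the Rancher cluster kept that default; the constant and function names are illustrative:

```python
import ipaddress

# Assumption: the cluster hosting Rancher uses the default service CIDR.
# Adjust this if the cluster was installed with a custom service CIDR.
DEFAULT_SERVICE_CIDR = ipaddress.ip_network("10.43.0.0/16")


def is_service_ip(addr: str) -> bool:
    """Return True if addr falls inside the assumed service CIDR."""
    return ipaddress.ip_address(addr) in DEFAULT_SERVICE_CIDR


# The two addresses that appear in the "no route to host" errors.
for ip in ("10.43.150.240", "10.43.10.155"):
    print(ip, is_service_ip(ip))
```

If the check returns True, the address is only routable from inside that cluster, which would suggest looking at pod-to-service networking in the cluster running Rancher rather than at the downstream node.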
Hello @agreeable-oil-87482. I'd much appreciate your insight on this if you have some time.