# general
k
Since we updated Rancher to 2.11, newly created clusters using our Terraform modules can't bootstrap any more. The capi-controller-manager is reporting the following error:
"Connect failed" err="error creating HTTP client and mapper: cluster is not reachable: {\"Code\":{\"Code\":\"Forbidden\",\"Status\":403},\"Message\":\"clusters.management.cattle.io \\\"c-m-xxxxxxxx\\\" is forbidden: User \\\"u-xxxxxxxx\\\" cannot get resource \\\"clusters\\\" in API group \\\"management.cattle.io\\\" at the cluster scope\",\"Cause\":null,\"FieldName\":\"\"}" controller="clustercache" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="fleet-default/cluster-az1" namespace="fleet-default" name="cluster-az1" reconcileID="dea8fe2f-1f4d-4f1b-a1e4-b7e21e1e1a6e"
Does anyone have an idea what is causing this and how to fix it?
The Rancher pod is also reporting errors like:
error syncing '_all_': handler user-controllers-controller: userControllersController: failed to set peers for key _all_: failed to start user controllers for cluster c-m-xxxxxxxx: ClusterUnavailable 503: cluster not found, requeuing
b
Hmm, this seems more like a problem with the upgrade than with Terraform. What version of the provider are you using?
If you are wondering whether bootstrap works on Rancher v2.11: it works in all of my testing, both when my workstation is controlling Terraform and when a GH Actions runner is running it.
k
I also don’t think it is related to how we deploy it using Terraform. I’ve tried creating a second cluster using just the Rancher UI, and the same problem appears.
So I guess something changed during or because of the upgrade. But I still can’t figure out what is missing, or why the permission for the cluster user is not set or created.
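One way to narrow this down is to confirm whether the user named in the error really lacks the permission. Below is a minimal sketch, assuming Python 3.8+ and that the JSON payload has been copied out of the capi-controller-manager log line and unescaped once. The names `can_i_command` and `SAMPLE_PAYLOAD` are hypothetical and not part of Rancher or CAPI. The sketch only builds a `kubectl auth can-i` command string from the "forbidden" message; it does not run anything.

```python
import json
import re
import shlex

# Matches the standard Kubernetes "forbidden" message seen in the error above.
FORBIDDEN_RE = re.compile(
    r'\S+ "(?P<name>[^"]+)" is forbidden: '
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)

# Sample payload copied from the log line (placeholders kept as posted).
SAMPLE_PAYLOAD = (
    '{"Code":{"Code":"Forbidden","Status":403},'
    '"Message":"clusters.management.cattle.io \\"c-m-xxxxxxxx\\" is forbidden: '
    'User \\"u-xxxxxxxx\\" cannot get resource \\"clusters\\" in API group '
    '\\"management.cattle.io\\" at the cluster scope",'
    '"Cause":null,"FieldName":""}'
)


def can_i_command(payload: str) -> str:
    """Build a `kubectl auth can-i` command for the denied request."""
    message = json.loads(payload)["Message"]
    match = FORBIDDEN_RE.search(message)
    if match is None:
        raise ValueError("payload does not contain a recognised 'forbidden' message")
    # Resources in the core API group have an empty group name.
    resource = match["resource"]
    if match["group"]:
        resource = f'{resource}.{match["group"]}'
    target = f'{resource}/{match["name"]}'
    return shlex.join(
        ["kubectl", "auth", "can-i", match["verb"], target, "--as", match["user"]]
    )


if __name__ == "__main__":
    print(can_i_command(SAMPLE_PAYLOAD))
```

The printed command would have to be run against the Rancher management (local) cluster with an account that is allowed to impersonate users. A `no` answer there would point at a missing role binding for that user rather than at Terraform.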