Couldn't get security to sign off on exporting my logs, unfortunately, but I think I'm getting somewhere. I think I'm ballsing this up conceptually... I've got a management cluster running Rancher, which manages a number of downstream clusters. Everything downstream is deployed with Terraform and the Rancher provider. We do the whole cattle-not-pets thing: when a cluster goes funky, we just destroy and rebuild either a node or sometimes the whole cluster from exactly the same code.
The issue we're having is that rancher-system-agent isn't pulling certs properly, or something along those lines. The TLS directory is nonexistent, and the machine plan for rebuilding the same cluster doesn't have any of the TLS folders, rke2 binaries, etc.
However, if I create a brand-new, never-before-seen cluster, it works fine.
So, I get that no one can give me specific advice without logs. But conceptually, am I shooting myself in the foot by destroying and recreating clusters with the same name/details/config? Should we be protecting the cluster resource once it's created?