# general
a
This message was deleted.
🙏 1
b
OMG, is it? Are you sure it’s because of the modifications?
e
I am pretty sure that any modification to the cluster configuration applied via terraform occasionally causes cluster deletion…
I’ve already lost prod clusters a couple of times just by applying a manifest change…
I was doing some more testing, and this doesn’t happen if I modify the Cluster k8s CR that rancher creates directly, that was very stable for ~2k iterations
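to be clear, I mean editing the provisioning object directly, roughly like this, a minimal sketch where the name, namespace, version, and label values are all made up:
```yaml
# Hypothetical provisioning Cluster CR of the kind rancher creates;
# name, namespace, version, and labels are illustrative only.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: prod-cluster
  namespace: fleet-default
  labels:
    environment: production   # the kind of label edit I looped ~2k times without issue
spec:
  kubernetesVersion: v1.26.8+rke2r1
```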
also I’ve tried to reproduce this by applying cluster modifications via the rancher API directly, and that was also very stable for 2k+ iterations
so far this happens for me only if I change my cluster configuration with the terraform provider
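for reference, the kind of terraform change that has blown up for me looks roughly like this, a minimal sketch assuming the rancher2 provider’s cluster_v2 resource, with cluster name, version, and labels made up:
```hcl
terraform {
  required_providers {
    rancher2 = {
      source = "rancher/rancher2"
    }
  }
}

# Hypothetical cluster definition; all names and values are illustrative.
resource "rancher2_cluster_v2" "prod" {
  name               = "prod-cluster"
  kubernetes_version = "v1.26.8+rke2r1"

  # An in-place label change like this is the kind of edit that
  # preceded the unexpected cluster deletion.
  labels = {
    environment = "production"
  }
}
```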
b
That's really scary. I will use the UI then for provisioning and managing state
e
well, this doesn’t scale if you have to manage many clusters in a GitOps manner
👍 1
but yeh, this is a disaster, the first time it happened to me I was applying a label change to a production cluster and it just destroyed the whole cluster in seconds
b
Yeah, correct. Please share your analysis after resolution
b
Have you considered using a k8s GitOps CD tool to apply cluster custom resource updates from git, as opposed to using terraform? Unfortunately, the rancher experience for GitOps provisioning and managing of custom clusters is not great. We try to use our IaC tool as little as possible for this and use our k8s GitOps tool as much as possible for managing the clusters and their addons.
e
@best-microphone-20624 yeh, this is something I am currently considering using instead, as using terraform is just dangerous. the thing is, this might actually not be a terraform provider issue, all the provider does is call norman (rancher) APIs, so this might be an API bug
b
I would expect fleet-based cluster mgmt at scale to be better tested than terraform-based cluster mgmt. It may be that terraform uses the same underlying APIs but does so in a less reliable manner.
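the fleet flow would be roughly a GitRepo pointed at your cluster manifests, something like this sketch (repo URL, branch, and paths are made up):
```yaml
# Hypothetical Fleet GitRepo; repo, branch, and paths are illustrative.
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: cluster-definitions
  namespace: fleet-local   # targets the local/management cluster
spec:
  repo: https://git.example.com/platform/clusters
  branch: main
  paths:
    - clusters/   # directory holding the provisioning.cattle.io Cluster CRs
```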