# general
k
Hello, we use Rancher to provision RKE2 clusters. These clusters are configured with Cilium. We would like to create the ServiceMonitors provided by the rke2-cilium Helm chart at cluster creation. But because the ServiceMonitor CRDs are not installed yet, the HelmController used by RKE2 fails to install Cilium and the cluster gets stuck while creating. We thought about adding an additional manifest with a HelmChart that installs the ServiceMonitor CRDs on bootstrap. But Rancher only provides /var/lib/rancher/rke2/server/manifests/rancher/addons.yaml once (I think) the Rancher cluster-agent is running, and the cluster-agent in turn cannot start until the rke2-cilium Helm chart is installed, so we are stuck again. Any ideas on how to install the Prometheus CRDs before any pods are able to start on an RKE2 cluster?
Currently we use a workaround: we create the HelmChart using cloud-init and also provision it via the Rancher UI as an additional manifest.
b
It's not EXACTLY on init, but it's pretty close if you use #C013SSBKB6U to deploy the Helm chart. There are all sorts of rules you can enable so it's automagic as well.
We do that to install cert-manager and all our cluster issuers downstream.
k
I don't think that works with Fleet. Fleet itself is installed after bootstrap, whereas in our case we need the chart installed during bootstrap.
b
That's certainly true.
It's just that the additional manifests aren't working?
I might see a potential reason.
k
It's not the additional manifests themselves. Those are working fine. The problem is that Rancher provides the additional manifests only once the rancher-cluster-agent is running (and that only runs after bootstrap). Once we apply the manifests on the control plane manually, everything boots as expected.
That's also how the workaround works: cloud-init places the HelmChart in the manifests location, and bootstrap then succeeds.
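For reference, a rough sketch of that workaround in Python: it renders a HelmChart manifest and wraps it in a cloud-init write_files entry that drops it into the RKE2 manifests directory. The chart name, repo URL, file name, and permissions below are assumptions for illustration and were not given in this thread; only the manifests path and the bootstrap flag come from the discussion.

```python
import textwrap

# Directory RKE2 scans for manifests to auto-deploy (path taken from this thread).
MANIFEST_DIR = "/var/lib/rancher/rke2/server/manifests"


def helm_chart_manifest(name: str, repo: str, chart: str, target_namespace: str) -> str:
    """Render a HelmChart resource for the RKE2 Helm controller."""
    return textwrap.dedent(f"""\
        apiVersion: helm.cattle.io/v1
        kind: HelmChart
        metadata:
          name: {name}
          namespace: kube-system
        spec:
          repo: {repo}
          chart: {chart}
          targetNamespace: {target_namespace}
          bootstrap: true
        """)


def cloud_config(manifest: str, filename: str) -> str:
    """Wrap the manifest in a cloud-init write_files entry."""
    # Indent the manifest so it sits inside the YAML block scalar below.
    body = textwrap.indent(manifest, " " * 6)
    return (
        "#cloud-config\n"
        "write_files:\n"
        f"  - path: {MANIFEST_DIR}/{filename}\n"
        '    permissions: "0600"\n'  # assumed permissions
        "    content: |\n"
        f"{body}"
    )


if __name__ == "__main__":
    # Assumed chart and repo; substitute whatever provides the ServiceMonitor CRDs.
    manifest = helm_chart_manifest(
        name="prometheus-operator-crds",
        repo="https://prometheus-community.github.io/helm-charts",
        chart="prometheus-operator-crds",
        target_namespace="kube-system",
    )
    print(cloud_config(manifest, "prometheus-operator-crds.yaml"))
```

The printed user data would only need to go to the control plane nodes, since that is where the manifests directory is read.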
b
RKE2?
yes
sorry you answered that.
k
Yes, we use Rancher to provision an RKE2 cluster. So the additional manifests are working as intended. But Rancher is provisioning the manifests too late, in my opinion.
b
I think that would be a new feature request then. It doesn't seem like they ever intended for it to work that way.
k
From Rancher's point of view, I think not. From RKE2's point of view, though, it is designed to be used to install resources during bootstrap.
b
Then it would be a bug, right? The docs specifically mention that "Editing clusters in YAML allows you to set the options available in an RKE2 installation". But the additional manifests do not honor the spec.bootstrap=true setting when provisioned through Rancher.
k
Could be. I'll create a GitHub issue in the next few days to see what the maintainers think of this.
b
I don't see an existing bug report:
k
I already searched, indeed, but couldn't find anything. I was more curious about how people handle this problem. It could well be that I was just using the additional manifests incorrectly.
b
I'd suggest opening a ticket for a couple of reasons:
• You may have found a bug that went unnoticed because no one was using it that way.
• It's totally possible that other people have had this question, but Slack history only lasts for 3 months. Putting it on GitHub will allow it to live on much longer.
• Google indexes GitHub, so the help you get there might end up helping countless other people with the same issue.
So even if it's not a bug, the docs could be clearer and there'll be something to help/fix.