# rke2
c
if you break the cluster with bad CNI config, you will probably have to delete the webhook to get it back up and running.
make a copy of it first, so that you can restore it afterwards
d
how ?
you mean make yaml config copy ?
I am pretty newb with this 🙂
c
how do you what?
d
delete the webhook ?
c
kubectl get -o yaml
first, and save that. then
kubectl delete
d
i just added an interface to the canal config and tried to force correct MTU
c
you can
kubectl get validatingwebhookconfigurations
to find the name of the rancher webhook
d
kubectl get validatingwebhookconfigurations
NAME                               WEBHOOKS   AGE
rke2-ingress-nginx-admission       1          3d16h
rke2-snapshot-validation-webhook   1          3d19h
validating-webhook-configuration   12         3d18h
c
it's probably that last one, you should be able to tell from the yaml if it references rancher
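One way to check what a webhook configuration references without reading the whole yaml is to print each webhook entry next to the service it calls. This is only a sketch: `list_webhook_backends` is a made-up helper name, and the assumption is that rancher's webhooks point at a service in the `cattle-system` namespace.

```shell
# Sketch: list every webhook in a configuration with its backing service.
# list_webhook_backends is a hypothetical helper, not part of kubectl or rke2.
list_webhook_backends() {
    kind="$1"   # validatingwebhookconfigurations or mutatingwebhookconfigurations
    name="$2"   # e.g. validating-webhook-configuration

    # Prints one line per webhook: <webhook name><TAB><namespace>/<service>
    kubectl get "$kind" "$name" \
        -o jsonpath='{range .webhooks[*]}{.name}{"\t"}{.clientConfig.service.namespace}{"/"}{.clientConfig.service.name}{"\n"}{end}'
}

# Usage (against a running cluster):
#   list_webhook_backends validatingwebhookconfigurations validating-webhook-configuration
```

Entries whose service lives in `cattle-system` are most likely the rancher ones.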
d
hmm, how do I get it ?
BTW thanks a lot for help 🙂
c
I showed an example up above
kubectl get -o yaml validatingwebhookconfigurations validating-webhook-configuration
I would read up on basic interaction with the cluster via kubectl before getting too far into tinkering with advanced cni config
d
saved the file
so now I should kubectl delete validatingwebhookconfigurations validating-webhook-configuration ?
c
that's the idea
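Put together, the save-then-delete step could look roughly like this. It is a sketch, not an official procedure: `backup_and_delete_webhook` and the backup file name are made up for illustration, and the delete only runs if the backup file was actually written.

```shell
# Sketch: save a webhook configuration to a file, then delete it.
# backup_and_delete_webhook is a hypothetical helper name.
backup_and_delete_webhook() {
    kind="$1"   # validatingwebhookconfigurations or mutatingwebhookconfigurations
    name="$2"   # e.g. validating-webhook-configuration
    out="${3:-$name.$kind.backup.yaml}"   # assumed backup file name

    # Refuse to delete unless the backup exists and is non-empty.
    kubectl get "$kind" "$name" -o yaml > "$out" || return 1
    [ -s "$out" ] || return 1
    kubectl delete "$kind" "$name"
}

# Usage (against a running cluster):
#   backup_and_delete_webhook validatingwebhookconfigurations validating-webhook-configuration
```

The same helper works for `mutatingwebhookconfigurations` by changing the first argument.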
d
done
and now just kubectl create -f savedfile ?
c
no, wait for your cluster to get unbroken first!
d
ah ok
yet I have the same error from helm-install-rke2-canal
c
yes, you’ll need to wait for it to retry
it might take a while if it’s gone into backoff
d
it just retried.
will restarting rke2-server on the affected node speed this up ?
c
not really no
you can try deleting the pod
d
it retried, same result
which pod? helm-install-rke2-canal ?
rancher-webhook ?
c
the helm-install pod that is failing, yes
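For reference, a sketch of looking at and then deleting that pod by label instead of by its generated name. It assumes the pod belongs to a Job named `helm-install-<chart>` in `kube-system` and carries the standard `job-name` label that Kubernetes puts on Job pods; the function names are made up.

```shell
# Sketch: inspect and delete the failing helm-install pod for a chart.
# Assumes a Job named helm-install-<chart> in kube-system (hypothetical helpers).
helm_install_logs() {
    chart="$1"   # e.g. rke2-canal
    kubectl -n kube-system logs -l "job-name=helm-install-$chart" --tail=50
}

retry_helm_install() {
    chart="$1"
    # Deleting the pod lets the Job create a fresh one.
    kubectl -n kube-system delete pod -l "job-name=helm-install-$chart"
}

# Usage (against a running cluster):
#   helm_install_logs rke2-canal
#   retry_helm_install rke2-canal
```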
d
ok done
recreated, still error
hmm, can my config yaml be the reason?
cat rke2-canal-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-canal
  namespace: kube-system
spec:
  valuesContent: |-
    flannel:
      iface: "enX1"
any more pointers?
c
what error are you getting now?
d
same
c
check for
mutatingwebhookconfigurations
as well
there’s probably some there as well
d
btw, is there a way to select the proper interface on rke2 install ?
c
do the same process
d
kubectl get mutatingwebhookconfigurations
NAME                             WEBHOOKS   AGE
mutating-webhook-configuration   9          3d19h
rancher.cattle.io                7          3d19h
get rancher.cattle.io and delete?
c
you might have to save and delete both
d
ok saved and deleted both
ok canal started now
flannel took the correct MTU
from the interface I have configured
but calico still sits on 1450
btw, when I add the -config.yaml manifest, is it enough to put it on one node, or do I need to put it on multiple nodes and/or run kubectl apply -f xyz-config.yaml ?
c
the chart is deployed to the cluster, so you only really need it on one
are you trying to hardcode the mtu? note that there are two MTU fields in the chart values: https://github.com/rancher/rke2-charts/blob/main-source/packages/rke2-canal/charts/values.yaml
d
so actually better only on one so i have no conflicting ones?
for now i just put the interface on the config
I have 2 interfaces a 1gig one, and a 112 gig bond
the 1 gig is enX0, 112 gig is enX1, I put the address that is bound to enX1 in the initial RKE2 config on install
I definitely want the inter pod communication to use enX1, and also to use Jumbo Frames.
flannel picked up the MTU from the parent interface now
but calico still sits on 1450, should I force it to use 8950 actually via config?
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-canal
  namespace: kube-system
spec:
  valuesContent: |-
    flannel:
      iface: "enX1"
    calico:
      vethuMTU: "8950"
like this?
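The 1450 and 8950 figures are consistent with subtracting 50 bytes of VXLAN encapsulation overhead from the parent interface MTU (1500 and 9000 respectively). A small sketch of that arithmetic, assuming the flannel VXLAN backend and a parent MTU of 9000 on enX1 (the parent MTU is not stated in this thread); `veth_mtu_for` is a made-up name.

```shell
# Sketch: derive the pod veth MTU from the parent interface MTU.
# Assumes the flannel VXLAN backend, which adds 50 bytes of overhead per packet.
VXLAN_OVERHEAD=50

veth_mtu_for() {
    parent_mtu="$1"
    echo $((parent_mtu - VXLAN_OVERHEAD))
}

# Usage on a node, reading the parent MTU from sysfs:
#   veth_mtu_for "$(cat /sys/class/net/enX1/mtu)"
```

With a parent MTU of 9000 this gives 8950, and with the default 1500 it gives the 1450 that calico was sitting on.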
c
I don’t know, I’ve never tried to play with the MTU myself, but that seems likely?
I would defer to the canal docs if that doesn’t do it.
d
ok, will it be picked up automatically after I update the rke2-canal-config.yaml in manifests
or I need to do kubectl apply ?
c
just edit it
d
ok and kubectl rollout restart ds after this? or just wait?
ok, this did the trick, i have a proper MTU now
for reference though, I needed to restart the node: the calico interfaces that already existed were not recreated with the new MTU
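A quick way to see which interfaces still carry the old MTU is to list every interface with its MTU on the node. This is a sketch that parses `ip -o link show`; `list_mtus` is a made-up helper name.

```shell
# Sketch: print "<interface> <mtu>" for every link on the node.
# list_mtus is a hypothetical helper that parses `ip -o link show` output.
list_mtus() {
    ip -o link show | awk '{
        name = $2
        sub(/:$/, "", name)            # drop the trailing colon after the name
        for (i = 1; i <= NF; i++)
            if ($i == "mtu") print name, $(i + 1)
    }'
}

# Usage on a node:
#   list_mtus | grep cali
```

Any cali interface still showing 1450 after the config change was created before the change took effect.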
now, to repair webhook pod, I should recreate the webhooks from saves I guess?
kubectl get validatingwebhookconfigurations
NAME                               WEBHOOKS   AGE
rancher.cattle.io                  16         31m
rke2-ingress-nginx-admission       1          3d17h
rke2-snapshot-validation-webhook   1          3d21h
validating-webhook-configuration   12         31m
this came back by itself actually
kubectl get mutatingwebhookconfigurations
NAME                             WEBHOOKS   AGE
mutating-webhook-configuration   9          32m
but not here
this is what i get from webhook pod:
2024-02-29T00:35:33.251853198+01:00 stderr F time="2024-02-28T23:35:33Z" level=error msg="Failed to ensure configuration: failed to create mutating configuration: MutatingWebhookConfiguration.admissionregistration.k8s.io \"\" is invalid: metadata.name: Required value: name or generateName is required"
2024-02-29T00:35:33.2528193+01:00 stderr F time="2024-02-28T23:35:33Z" level=error msg="error syncing 'cattle-system/cattle-webhook-ca': handler secrets: failed to create mutating configuration: MutatingWebhookConfiguration.admissionregistration.k8s.io \"\" is invalid: metadata.name: Required value: name or generateName is required, requeuing"
c
if you restart the rancher pod, it might recreate those for you
d
you mean delete the pod?
restarting whole node is not enough?
c
if you rebooted the whole node, that would also restart the pod.
d
i had to reboot to update MTU
it created mutating webhook config, however there is no CA bundle now in it
solved
🙂 upgrading rancher-webhook recreates the TLS secrets.
all running now, thank you for assistance, I have actually learned a lot 🙂
just one more question, is there a way to tell RKE2 to use the correct interface from the start? It seems to pick the first one.
and I did not see this in docs.
c
just put that manifest there before you start it the first time
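A sketch of what that could look like before the first start of rke2-server. The default directory below is the usual RKE2 server manifests location; `write_canal_config` is a made-up helper, and the manifest body is the same HelmChartConfig shown earlier in this thread.

```shell
# Sketch: drop the canal HelmChartConfig in place before rke2-server first starts.
# write_canal_config is a hypothetical helper; adjust the directory if your
# install uses a different data dir.
write_canal_config() {
    iface="$1"                                          # e.g. enX1
    dir="${2:-/var/lib/rancher/rke2/server/manifests}"  # assumed default path

    mkdir -p "$dir" || return 1
    cat > "$dir/rke2-canal-config.yaml" <<EOF
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-canal
  namespace: kube-system
spec:
  valuesContent: |-
    flannel:
      iface: "$iface"
EOF
}

# Usage (as root, before starting rke2-server for the first time):
#   write_canal_config enX1
```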
q
Hi, how can I resolve this?
[root@B3VM-UCTNS-MS01 ~]# helm install rancher rancher-stable/rancher --namespace cattle-system --set hostname=172.21.90.66.sslip.io --set bootstrapPassword=admin
Error: INSTALLATION FAILED: 1 error occurred:
* Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://rke2-ingress-nginx-controller-admission.kube-system.svc:443/networking/v1/ingresses?timeout=10s": context deadline exceeded
I tried the steps above but no luck.