# rke2
c
How did you get it working the first time?
As you noted, the CNI installation needs to tolerate any NotReady taints on the node, since the node will stay NotReady until the CNI is up
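(For context, a minimal sketch of the specific taints a CNI DaemonSet has to tolerate to land on a not-yet-ready node; the blanket `operator: Exists` toleration discussed below covers these and more:)
```yaml
# Illustrative tolerations for a CNI DaemonSet pod spec -- not the exact
# Cilium chart values, just the taints that matter on a NotReady node.
tolerations:
  # Node is tainted NotReady because no CNI is running yet
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoSchedule
  # Node reports NetworkUnavailable until the CNI configures it
  - key: node.kubernetes.io/network-unavailable
    operator: Exists
    effect: NoSchedule
```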
b
it just worked honestly, not sure how I didn't hit this tbh
lemme check the daemonset and see what it has on there
c
just out of curiosity, why are you deploying your own Cilium instead of using `cni: cilium`?
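For reference, a minimal sketch of what that built-in option looks like in RKE2's config file (path is RKE2's default location):
```yaml
# /etc/rancher/rke2/config.yaml
# Let RKE2 deploy and manage Cilium itself via its bundled rke2-cilium chart
cni: cilium
```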
b
we have always managed it manually going back to rke1 days (which is how this particular cluster started life)
c
You did an in-place conversion from RKE1 to RKE2?
b
yes, not recently
this cluster was converted probably 6 or 7 months ago...just now needed to reinstall a specific node because reasons
c
ahh yikes. that’s still got a lot of rough edges to it, I wouldn’t personally have recommended that
b
I'm well aware of the rough edges yes (I helped document/smooth a bunch of them) 😄
This appears to me to tolerate everything? (this is what the cilium chart tolerates)
```yaml
tolerations:
  - operator: Exists
```
c
That should do it. Is there some other reason why the CNI pod isn’t coming up on that node?
b
doh, thanks for the tip...tunnel vision at the moment
lemme take a look
```
Warning  FailedCreatePodSandBox  8m12s                   kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: expected cgroupsPath to be of format "slice:prefix:name" for systemd cgroups, got "/kubepods/burstable/pod7389c130-8e97-470f-81ca-aed24cccceab/8ab52a02eb8c112a60158ba4c5fd5c639798f57541d5abb2669ff48cce4451db" instead: unknown
Warning  FailedCreatePodSandBox  4m50s (x16 over 7m57s)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: expected cgroupsPath to be of format "slice:prefix:name" for systemd cgroups, got "/kubepods/burstable/pod7389c130-8e97-470f-81ca-aed24cccceab/e1f032e99b21cdef3f64d4cd304150ac78836900f9f3be84bb139c54c94bcdab" instead: unknown
```
hmm
c
That usually indicates that the kubelet and container runtime aren’t using the same cgroup driver
are you using the embedded containerd, or docker?
b
embedded containerd
c
are you overriding the kubelet's --cgroup-driver option for any reason, or providing your own containerd configuration with a config.toml.tmpl?
the kubelet and containerd config need to agree on which cgroup driver to use. If you leave it at the defaults then RKE2 keeps them in sync, but if you are overriding the kubelet config or containerd config it's possible that you've got one set to something different
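To make that concrete, a sketch of the kind of override that causes the mismatch, assuming the kubelet flag is passed through RKE2's config file (it could just as well come from an Ansible-templated unit or env file):
```yaml
# /etc/rancher/rke2/config.yaml
# A leftover override like this puts the kubelet on the cgroupfs driver
# while RKE2's generated containerd config keeps SystemdCgroup = true,
# which is exactly the mismatch behind the "slice:prefix:name" runc error.
kubelet-arg:
  - cgroup-driver=cgroupfs
```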
b
perhaps, I do have some config in the node ansible scripts to set some kernel args (but the other nodes match)
`--cgroup-driver=cgroupfs`
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```
I'm assuming that's a mismatch?
I must be missing some config the other nodes have (or added something the others don't) 😞
c
yep that’s a mismatch
I would remove that kubelet arg
you want to be using the systemd cgroup driver, not cgroupfs
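A sketch of the explicit form, if you'd rather pin it than rely on RKE2's default (which, per the generated containerd config above, already keeps both sides on systemd):
```yaml
# /etc/rancher/rke2/config.yaml
# Explicitly pin the kubelet to the systemd cgroup driver, matching
# SystemdCgroup = true in the embedded containerd's generated config.
kubelet-arg:
  - cgroup-driver=systemd
```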
b
yup, I see where I f'd that up
@creamy-pencil-82913 thanks for the tips! I forgot I had mucked with that from the originals because of reasons 😞