# rke2
a
This message was deleted.
c
It was chosen by Product Management for alignment with RKE.
t
That makes sense; it sort of falls under "historical circumstances", which is what I suspected, but thanks for confirming.
c
We do not support changing CNIs on an existing cluster. It just doesn’t work well and is a supportability nightmare. If you had a non-prod cluster to test it on you might be able to come up with some steps that work for you, but we don’t test it and don’t have any documentation that covers that.
t
Sure, I definitely wouldn't expect explicit "support", and I would be doing this at my own risk. It's more that I wanted to confirm I understood what rke2 is doing under the umbrella of "installing a CNI", and what I would need to undo.
c
yeah, rke2 just installs different charts for each CNI. You might have to go as far as deleting and re-registering the nodes to get all the various bits tracked properly.
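For concreteness, a rough sketch of what "different charts for each CNI" looks like in practice; the chart names below are illustrative assumptions and should be checked against the manifests your rke2 version actually drops into /var/lib/rancher/rke2/server/manifests:

```go
package main

import "fmt"

// Illustrative only: a rough mapping of CNI choice to the bundled Helm charts
// rke2 deploys for it. Exact chart names (and any CRD companion charts) vary
// by rke2 version, so verify against the server manifests directory.
var cniCharts = map[string][]string{
	"canal":  {"rke2-canal"},
	"calico": {"rke2-calico-crd", "rke2-calico"},
	"cilium": {"rke2-cilium"},
}

func main() {
	for cni, charts := range cniCharts {
		fmt.Printf("cni=%s -> charts=%v\n", cni, charts)
	}
}
```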
t
I'll keep that in mind. Thanks again.
Hey, sorry to bother you again, but in doing some more homework on this, it seems like the Calico project's position is that new clusters should use Calico alone (e.g. the note at the top of https://docs.tigera.io/calico/latest/getting-started/kubernetes/flannel/install-for-flannel). Is Canal being the default purely for continuity/alignment, or is it Rancher's position that, unless there is a specific requirement otherwise, new clusters should still use Canal for the foreseeable future? The reason I ask is that Calico themselves have an automated migration tool (https://docs.tigera.io/calico/latest/getting-started/kubernetes/flannel/migration-from-flannel); is Rancher aware of this? While I totally sympathize with not wanting to support arbitrary CNI switches, this specific migration path seems like something more and more clusters will want/need as time goes on, especially if Calico's position ever shifts from "you should use Calico instead" to "do not use Canal". Or would Rancher provide a stop-gap, a la dockershim, in that case?
c
There’s no position on one CNI over another. We support all of them equally. The default is just the default because that’s what folks were used to, and we wanted to make it easy for folks to make the switch from RKE to RKE2.
historically Calico was required for Windows support, but the network policy stuff on Windows is super broken because of some issues that Microsoft is apparently unable to fix, so we are now recommending use of Flannel, with the caveat that network policies are not supported.
Windows basically drops all existing connections every time the network policy changes, and they can’t or won’t fix that.
t
Interesting, that's good to know, thank you. Unfortunately, I believe network policies are required for this cluster, Windows or not, so we'll just have to put up with that; if nothing else it's more justification we can give to this one customer to not use Windows anymore. Unless, of course, it's possible to use flannel for Windows and can{al,ico} for Linux and bridge them, but I've never heard of anything like that being possible. I'm going to try out Calico's migration tool with some devtools I have for creating temporary rke2 clusters. If I get it working consistently and wrapped as a HelmRelease object, do you think Rancher would be interested in accepting it as a contribution?
c
from what I’ve heard the netpol stuff makes windows basically unusable because clients are constantly disconnected whenever pods change. @bland-account-99790 might be able to link to the issue in question.
t
Is it unusable just w/ NetworkPolicies, or unusable in general?
c
it is specifically network policies that do it.
t
Hmm, okay. I think that'd be fine for development, but production is another story, we may have to convince them to re-vector. Thank you, this has been very informative.
c
any time the policies that handle traffic to or from a pod change, Windows reloads the whole HNS ACL policy engine, which causes all the existing connections to be reset. It is super dumb and it is apparently just how Windows works.
the issue has been open, without MS being able to deliver a fix, for long enough that we have added standalone flannel support to RKE2 and are now recommending use of that for Windows instead of Calico
b
MS is supposed to deliver a fix soon. According to the information I have, the problem is addressed in OS builds with build number >=25922.1000, and the plan was to deliver that in the March/April release of Windows Server 2022, but I would not trust that 100%; it might take a bit longer
t
Thanks again for all the info. If you don't mind me bothering you again: we took your advice and wrangled enough hardware for a separate cluster w/ flannel, but we're seeing some very bizarre behavior. Running the agent quickstart on a fresh copy of Server 2019 Standard has the node show up as Ready, but running a test pod results in flannel complaining about "duplicate allocations" (sorry, it's in a place that'd be hard to copy/paste from), followed quickly by the node losing network connectivity. Once we wrestled control back, the logs show that something called "host-local.exe" is upset about the X.Y.Z.2 address of our service CIDR already being allocated to "dummy" (no idea what this is, we didn't do anything with that name), and this results in an empty string being passed as the "source-vip" to kube-proxy. Have you seen anything like this before? This is the edge of my expertise, and neither Google nor GitHub yields any useful results.
b
When rke2 starts on Windows, before handing out any IPs, it reserves an IP for the source-vip of kube-proxy. It does this using host-local, and it reserves the IP to a pod called dummy (which does not exist; it's just there because we need to reserve one IP for kube-proxy).
We are slightly changing this in the next release (end of April), but I am surprised by the error.
host-local saves the IPs it hands out under C:\var\lib\cni\networks\, could you check what you have there?
This was our first release with rke2+flannel on Windows and we are hitting some bugs. That's why we consider it "experimental" for the time being. Thanks for reporting it!
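In case it helps with the "check what you have there" step: a minimal diagnostic sketch (not an official tool) that walks host-local's state directory and dumps each allocation file. The path is the one mentioned above; the assumption that each allocation file is named after the allocated IP and holds the owning reservation is my reading of host-local's disk backend, so verify against your plugin version.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// Directory mentioned above; host-local keeps one file per allocated IP
	// under a subdirectory named after the CNI network (e.g. vxlan0).
	root := `C:\var\lib\cni\networks`

	err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		// Skip lock/bookkeeping files: allocation files are named after the IP.
		if net.ParseIP(info.Name()) == nil {
			return nil
		}
		data, readErr := os.ReadFile(path)
		if readErr != nil {
			return readErr
		}
		fmt.Printf("%s (allocated IP %s):\n", path, info.Name())
		for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
			// Contents identify the reservation that owns this IP,
			// e.g. the "dummy" placeholder described above.
			fmt.Printf("  %s\n", strings.TrimSpace(line))
		}
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
	}
}
```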
c
@bland-account-99790 maybe the dummy pod reservation could be named something that better ties it to kube-proxy somehow? just for better troubleshooting.
b
yes, that's one of the changes coming in the April release. Already merged: https://github.com/rancher/rke2/blob/master/pkg/windows/flannel.go#L349-L355
I reproduced your issue. I am working on a fix. The problem I see is that if I run rke2 on Windows, it reserves an IP for kube-proxy. If I stop rke2 and restart it (without deleting anything), host-local tries reserving another IP instead of reusing the one already reserved. I thought host-local was intelligent enough to pick the already reserved IP address.
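Not rke2's actual fix (that is in the repo linked above), just a sketch of the idea being described: before asking host-local for a fresh IP, look for an existing allocation file that already belongs to the kube-proxy reservation and reuse that address. The directory, the "dummy" reservation ID, and the one-file-per-IP layout are taken from this thread and are assumptions otherwise.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// findExistingReservation scans a host-local state directory (one file per
// allocated IP, containing the reservation ID) and returns the IP already
// reserved under the given ID, if any. This mirrors the "reuse instead of
// re-reserve" idea discussed above; it is a sketch, not rke2's implementation.
func findExistingReservation(dir, id string) (net.IP, bool, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, false, err
	}
	for _, e := range entries {
		ip := net.ParseIP(e.Name())
		if e.IsDir() || ip == nil {
			continue // skip lock/bookkeeping files that are not IP allocations
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, false, err
		}
		if strings.HasPrefix(strings.TrimSpace(string(data)), id) {
			return ip, true, nil
		}
	}
	return nil, false, nil
}

func main() {
	ip, found, err := findExistingReservation(`C:\var\lib\cni\networks\vxlan0`, "dummy")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	if found {
		fmt.Println("reusing existing source-vip reservation:", ip)
		return
	}
	fmt.Println("no existing reservation; would ask host-local for a new IP here")
}
```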
t
Wow, you guys are fast; how nice to wake up to problems being solved. Do you still need the contents of our cni directory, or do you have it from here? You mentioned restarting without deleting things; is there a workaround we can try, or things we need to do between restarts, or do we just have to sit tight and wait for your fix?
b
The workaround for your current problem is easy: go to /var/lib/cni/networks/vxlan0 and remove the file there. Before removing it, check its content and verify it has the dummy string, then restart the RKE2 service.
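If it is easier to script than to do by hand, a minimal sketch of that workaround, assuming the /var/lib/cni/networks/vxlan0 path and the "dummy" marker exactly as described above (stopping the RKE2 service first and restarting it afterwards still needs to be done separately):

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// Path from the workaround above, rooted at C:\ on this Windows node.
	dir := `C:\var\lib\cni\networks\vxlan0`

	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, e := range entries {
		// Only consider per-IP allocation files, not lock/bookkeeping files.
		if e.IsDir() || net.ParseIP(e.Name()) == nil {
			continue
		}
		path := filepath.Join(dir, e.Name())
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		// Per the advice above: confirm the file really holds the "dummy"
		// reservation before deleting it.
		if strings.Contains(string(data), "dummy") {
			fmt.Println("removing stale reservation:", path)
			if err := os.Remove(path); err != nil {
				fmt.Fprintln(os.Stderr, err)
			}
		}
	}
	// After this, restart the RKE2 service, as suggested above.
}
```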
t
Indeed it does; however, it has it twice, with a newline in between. Is that expected?
In any case, the event viewer shows kube-proxy as having an actual VIP this time, so I think we're in business, thanks again!
b
yes, that's expected 🙂
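For what it's worth, my reading of host-local's on-disk format (an assumption based on recent containernetworking/plugins releases, not something stated above) is that each allocation file stores the reservation ID on the first line and the interface name on the second, which would account for the placeholder string appearing twice:

```go
package main

import (
	"fmt"
	"strings"
)

// parseAllocation splits a host-local allocation file into its two fields:
// reservation ID (first line) and interface name (second line, if present).
// Assumption: this matches the format of recent host-local releases.
func parseAllocation(contents string) (id, ifname string) {
	lines := strings.SplitN(strings.TrimSpace(contents), "\n", 2)
	id = strings.TrimSpace(lines[0])
	if len(lines) > 1 {
		ifname = strings.TrimSpace(lines[1])
	}
	return id, ifname
}

func main() {
	// The file observed above: "dummy", a newline, then "dummy" again.
	id, ifname := parseAllocation("dummy\ndummy")
	fmt.Printf("reservation id=%q ifname=%q\n", id, ifname)
}
```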
t
Looks like the actual host network is still cycling on and off once it allocates an IP for the pod. Is that part of the expected fix, or is that an indication something is wrong on our end?
b
when flannel creates the network you might see a brief connectivity blip, but after that it should work well
t
Hmm, this doesn't look like a short bump or hiccup. I created a pod ~20m ago, the node stopped heartbeating to the control plane within ~1m, I deleted the pod shortly after, and I still can't get a reliable RDP/SSH connection to it (it will start a connection, then die within a few seconds). Unfortunately, I'm not a network engineer, much less a Windows one, but I'll see if we have some boots on the ground that can make sense of it.
b
can you check the rke2 logs in the node and see what you get there?
t
Nothing out of the ordinary in the event viewer or the log files under /var/lib/rancher/rke2/agent/logs, outside of connection errors back to the control plane and image registry, consistent with it losing network access while a container is running.
I did notice that the etc/resolv.conf it generated under /var/lib/rancher was set to just use 8.8.8.8 instead of the internal DNS we have set in the node's Windows networking config
This might be relevant: This machine has two ethernet NICs (though only one is enabled and plugged in)
Both of my colleagues that work in the datacenter were out Friday, so this had to be put on hold, but now that we have eyes on the machine in this bad state, it looks like as soon as the first pod IP is allocated by flannel, the physical ethernet device disappears, leaving only the vEthernet device. Is this intended behavior? We also removed the second NIC, with identical results.
b
Interesting, I have never seen that. What I have seen is that when the hns network gets created/removed, the physical interface bound to that hns network disappears for some seconds (~20s). But if I understand correctly, in your case, everything is fine: hns is created, physical NICs are available... then you create a pod and the physical interface bound to the hns network disappears for some seconds?
t
It disappears and never comes back
b
I have never seen that before, sorry