# general
g
There's an order of precedence. If you want to use the value in FelixConfig, the associated env var must NOT be set on the calico-node pod.
Alternatively, just set the value you want on the calico-node daemonset?
d
Thanks Lance, but setting it in the daemonset doesn't work either, because the tigera operator overwrites the changes
What I'm actually trying to do is set
FELIX_FEATUREDETECTOVERRIDE="ChecksumOffloadBroken=true,SNATFullyRandom=false,MASQFullyRandom=false"
- but this parameter isn't showing up at all
g
Definitely ask this in the calico slack - if it's a bug that you can't set it in an operator-installed system, hopefully someone will see it and fix it.
d
Default value is
FELIX_FEATUREDETECTOVERRIDE="ChecksumOffloadBroken=true"
- shouldn't this come up in the calico-node either way?
g
"shouldn't this come up in the calico-node either way?"
Depends what you mean by this. Where are you setting it, and what do you expect to see happen? If you mean "you set this in felixconfig and it should appear in the calico-node env vars", then I disagree.
d
Or another question: where do the env vars come from? How/where are they set?
When I open the FelixConfiguration in a new rancher/rke2 installation, it has this setting by default
g
Env vars are set by tigera-operator. They're a pretty blunt instrument, because you can't override env vars (as you've discovered). Felixconfig is read directly by Felix in the calico-node pods. It'll output a log at startup to say what config it's discovered, and as I said, there's a precedence ordering - env vars win, then per-node felixconfig settings, then global felixconfig settings, then defaults.
d
And I would expect it to show up in the calico-node config parameters (spec)
If env vars exist, the felixconfiguration isn't taken into account at all?
g
Correct.
d
Or param by param?
g
param by param
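For anyone reading along, the per-parameter precedence described above (env vars, then per-node felixconfig, then global felixconfig, then defaults) can be sketched roughly like this. It is only a toy model, not Felix's actual code: the function name `resolve_param`, the lowercase keys and all the example values are made up for illustration.

```python
from typing import Mapping, Optional, Tuple

# Toy model of the precedence described in this thread; NOT Felix's real implementation.
def resolve_param(
    name: str,
    env: Mapping[str, str],
    per_node: Mapping[str, str],
    global_cfg: Mapping[str, str],
    defaults: Mapping[str, str],
) -> Tuple[Optional[str], str]:
    """Return (value, source) for ONE parameter, checking sources in precedence order."""
    sources = [
        ("env var", env),
        ("per-node felixconfig", per_node),
        ("global felixconfig", global_cfg),
        ("default", defaults),
    ]
    for label, values in sources:
        if name in values:
            # The first source that defines this parameter wins,
            # independently of what happens for other parameters.
            return values[name], label
    return None, "unset"


# Illustrative inputs only (hypothetical values, not real defaults).
env = {"logseverityscreen": "Info"}
per_node = {}
global_cfg = {
    "logseverityscreen": "Debug",
    "featuredetectoverride": "ChecksumOffloadBroken=true,SNATFullyRandom=false,MASQFullyRandom=false",
}
defaults = {"featuredetectoverride": "", "logseverityscreen": "Warning"}

for param in ("logseverityscreen", "featuredetectoverride"):
    print(param, "->", resolve_param(param, env, per_node, global_cfg, defaults))
```

In this sketch the env var only shadows `logseverityscreen`; `featuredetectoverride` has no env var, so it still comes from the global felixconfig. That is the "param by param" behaviour.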
d
Alright, thanks for the clarification. I'll try some more debugging
g
For Felix, you want to be checking the info log emitted at startup that lists the discovered config. Note that updating felixconfig causes felix to restart.
d
Got it, thanks
Hey Lance, I got it to work. It took me some time to figure out that the config updates in place without a pod restart (I had expected a restart). If anyone else finds this thread - set
featureDetectOverride: ChecksumOffloadBroken=true,SNATFullyRandom=false,MASQFullyRandom=false
in your FelixConfiguration to get direct connections with tailscale 😉
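Since the value is a comma-separated list of `Name=true/false` pairs, a small check before applying it can catch typos. The helper below is a hypothetical sketch (`parse_feature_detect_override` is not part of Calico), and its strictness about `true`/`false` is an assumption, not necessarily how Felix parses the value.

```python
from typing import Dict

# Hypothetical helper for sanity-checking the string; not part of Calico or Felix.
def parse_feature_detect_override(value: str) -> Dict[str, bool]:
    """Split 'Name=true,Other=false' into {'Name': True, 'Other': False}."""
    result: Dict[str, bool] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        name, sep, flag = item.partition("=")
        flag = flag.strip().lower()
        if not sep or flag not in ("true", "false"):
            # Assumption: only explicit true/false values are accepted here.
            raise ValueError(f"malformed entry: {item!r}")
        result[name.strip()] = flag == "true"
    return result


override = "ChecksumOffloadBroken=true,SNATFullyRandom=false,MASQFullyRandom=false"
print(parse_feature_detect_override(override))
```

A missing `=` or a value other than `true`/`false` raises `ValueError` in this sketch, which is a cheap way to spot a mistyped entry before it goes into the FelixConfiguration.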
Hm, strange behavior though. For some pods it works, for some it doesn't. Even for the ones that can establish a direct connection, it doesn't work all the time. If I ping a pod and it manages to build a direct connection, and then I ping it again 10 minutes later, it has stopped working. A little later, it works again.
g
That's weird. There must be a reason. Can you use tcpdump to determine where the packets are going missing? That would narrow down the things to look at, at least
Going to recommend the Calico slack (https://slack.projectcalico.org/) again for this. While I'm OK with Calico configuration, for weird dataplane issues you're going to want the folks there, I think.
d
Yeah, totally makes sense to switch to the calico slack, thx 🙂 👍
fyi: the issue was actually a traffic-loadbalancing configuration in the router and enabled p2p traffic in the UDM. It is now working with all pods/nodes. Thanks again for your help, Lance
g
Wow! Good sleuthing, and thanks for the feedback