# k3s
c
ClusterIP addresses are handled by kube-proxy, not the CNI
r
hmm
c
Unless the CNI includes a kube-proxy replacement, which flannel does not.
You won't see routes, just iptables entries
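A quick way to see that for yourself, assuming kube-proxy is in its default iptables mode (the chain name below is kube-proxy's standard one; for an IPv6 ClusterIP the rules live in ip6tables, and <service-cidr> is just a placeholder):
ip6tables -t nat -S KUBE-SERVICES          # ClusterIPs exist only as NAT rules
ip -6 route show | grep '<service-cidr>'   # so the routing table shows nothing for them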
r
ok, thanks for the hint, I'll see if I can figure out what's amiss in iptables.
Here's the error I'm getting. The IP it's trying to talk to is the first address of the service CIDR range (the kubernetes API).
$   kubectl logs kustomize-controller-666f8f4b5f-ppwhk --previous
{"level":"info","ts":"2023-06-06T18:40:59.283Z","logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"error","ts":"2023-06-06T18:41:29.308Z","logger":"setup","msg":"unable to create controller","controller":"kustomize-controller","error":"failed setting index fields: failed to get API group resources: unable to retrieve the complete list of server APIs: <http://kustomize.toolkit.fluxcd.io/v1|kustomize.toolkit.fluxcd.io/v1>: Get \"https://[fdde:91f3:4d41:3100::1]:443/apis/kustomize.toolkit.fluxcd.io/v1\": dial tcp [fdde:91f3:4d41:3100::1]:443: i/o timeout"}
When I check iptables, I see a rule:
-A KUBE-SERVICES -d fdde:91f3:4d41:3100::1/128 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
The chain that rule jumps to says:
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s fdde:91f4:4d41:3000::/56 -d fdde:91f3:4d41:3100::1/128 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> [2602:814:4000:3::15]:6443" -j KUBE-SEP-CTIXJTB53XIITEK7
and KUBE-MARK-MASQ sets 0x4000:
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
Since this pod (kustomize controller) has a Pod IP inside the PodCIDR:
IPs:
  IP:           10.42.3.8
  IP:           fdde:91f4:4d41:3003::8
It is getting ignored by NPX46M because the negated source match would fail. Either I'm having a brainfart and it's obvious what's wrong, or there's something amiss here.
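One note on reading those chains, assuming kube-proxy's usual layout: the ! -s rule only decides whether a packet gets marked for masquerading; nothing is dropped when it doesn't match. The DNAT toward the real apiserver address lives in the endpoint chain, and the 0x4000 mark is only consumed in KUBE-POSTROUTING, so both are worth a look:
ip6tables -t nat -S KUBE-SEP-CTIXJTB53XIITEK7   # should contain the DNAT to [2602:814:4000:3::15]:6443
ip6tables -t nat -S KUBE-POSTROUTING            # traffic marked 0x4000 gets MASQUERADEd here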
c
what are your cluster and service cidrs set to?
r
one sec and I'll grab the master provisioning command from history
c
any other components you’ve replaced, or args you’ve added, would be good to call out as well
r
curl -sfL https://get.k3s.io | sh -s - --flannel-backend=wireguard-native --node-ip=:: --cluster-cidr=10.42.0.0/16,fdde:91f4:4d41:3000::/56 --service-cidr=10.43.0.0/16,fdde:91f3:4d41:3100::/112 --disable traefik --disable servicelb --disable local-storage --datastore-endpoint=postgres://k3s-prod:redacted@redacted:15432/redacted?sslmode=disable
I adjusted the line above to fix a typo in the datastore-endpoint call.
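As a quick sketch for double-checking what actually got applied, plain kubectl shows both sides (nothing k3s-specific here):
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDRS:.spec.podCIDRs   # per-node pod CIDR allocations
kubectl get svc kubernetes -o wide                                                # always the first IP of the service CIDR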
c
that all looks normal. Is the pod in question running on a server, or on an agent node?
r
on an agent node.
I have a hunch as to what's wrong -- that ULA prefix is also partially routable inside the network. I'm replacing both the pod CIDR and the service CIDR with completely isolated ULA prefixes
going to rebuild and see if behavior continues
c
wg uses a separate port for ipv4 and ipv6 tunnels, you might check that both are open on the off chance that’s causing problems
the routing overlap seems more likely though
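To rule the ports out, something like this on each node should do it, assuming flannel's wireguard-native defaults of UDP 51820 for the IPv4 tunnel and 51821 for the IPv6 tunnel, and the flannel-wg / flannel-wg-v6 interface names:
wg show flannel-wg listen-port
wg show flannel-wg-v6 listen-port
ss -ulnp | grep -E ':5182[01]'           # confirm the node is listening on both UDP ports
wg show flannel-wg latest-handshakes     # recent handshakes on both tunnels means the peers can reach each other
wg show flannel-wg-v6 latest-handshakes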
r
Crap, that wasn't it. Same behavior. Created cluster with:
curl -sfL https://get.k3s.io | sh -s - --flannel-backend=wireguard-native --cluster-cidr=fdc0:9e85:e99e::/56,10.42.0.0/16 --service-cidr=fd62:a374:67c2::/112,10.43.0.0/16 --node-ip=2602:814:4000:3::15,10.174.3.15 --node-external-ip=2602:814:4000:3::15,23.154.40.15 --disable traefik --disable servicelb --disable local-storage --datastore-endpoint=postgres://k3s-prod:redacted@redacted:15432/redacted?sslmode=disable
error from kustomize controller:
$   kubectl logs kustomize-controller-666f8f4b5f-6n986 --previous
{"level":"info","ts":"2023-06-06T20:58:30.895Z","logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"error","ts":"2023-06-06T20:59:00.900Z","logger":"setup","msg":"unable to create controller","controller":"kustomize-controller","error":"failed setting index fields: failed to get API group resources: unable to retrieve the complete list of server APIs: <http://kustomize.toolkit.fluxcd.io/v1|kustomize.toolkit.fluxcd.io/v1>: Get \"https://[fd62:a374:67c2::1]:443/apis/kustomize.toolkit.fluxcd.io/v1\": dial tcp [fd62:a374:67c2::1]:443: i/o timeout"}
rules:
-A KUBE-SERVICES -d fd62:a374:67c2::1/128 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s fdc0:9e85:e99e::/56 -d fd62:a374:67c2::1/128 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
Using tcpdump I do see traffic from fdc0:: (the pod CIDR) attempting to reach the service CIDR IP (it's a TCP SYN to 443).
regarding wireguard, all four nodes show connectivity to all the other nodes, so I think wireguard line proto is up at least
Is there a way I can see the running flannel config file?
oh I see it, /var/lib/rancher/k3s/agent/etc/flannel/net-conf.json
You know what's weird? The only packets I see on flannel-wg-v6 are going to a prefix that isn't in the wg allowed IPs.
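One way to line those up, using the path and interface name from above:
cat /var/lib/rancher/k3s/agent/etc/flannel/net-conf.json   # what flannel thinks the networks are
wg show flannel-wg-v6 allowed-ips                          # what each peer may carry over the v6 tunnel
ip -6 route show dev flannel-wg-v6                         # what the kernel will actually route into it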
c
is wireguard using the wrong IPs for the tunnels?
r
22:22:31.993897 IP6 2602:814:4000:3::f203.10250 > fdc0:9e85:e99e::2.41458: Flags [S.], seq 3442809166, ack 2077726981, win 64704, options [mss 1360,sackOK,TS val 2234894331 ecr 914303222,nop,wscale 7], length 0
root@tpi2n4:/var/lib/rancher/k3s/agent/etc/flannel# wg show all dump | grep e99e
flannel-wg-v6   qs8Zk8eg1/RVM0O/TNKcMaPkvEsv7HVtvpDae7pSpm0=    (none)  [2602:814:4000:3:7c18:f6ff:feff:f872]:51821     fdc0:9e85:e99e:1::/64   1686086490      6000    1744    25
flannel-wg-v6   jUhUEL2XdIujWzegVZ3wMasDCUo7HLPYmuZT7lUmsxc=    (none)  [2602:814:4000:3:cc73:97ff:feed:86ee]:51821     fdc0:9e85:e99e:3::/64   1686086549      2900    5212    25
flannel-wg-v6   Ny1UczukarE44YGyuLlymHVRW6HVCqewEXK0GcVULHg=    (none)  [2602:814:4000:3:a81d:89ff:fe59:f2]:51821       fdc0:9e85:e99e:6::/64   1686086453      2944    4384    25
flannel-wg-v6   /s8psXkYy60uqydwl8J1HWszgr0O20NDsst7VcVzy14=    (none)  [2602:814:4000:3:f00f::de75]:51821      fdc0:9e85:e99e::/64     1686086476      7216    164432  25
and on the master:
root@k3smstr-202306:~# ip -6 a l |grep e99e
    inet6 fdc0:9e85:e99e::/128 scope global
    inet6 fdc0:9e85:e99e::1/64 scope global
... what's ::2? It seems like it'd be on the master node.
and to be clear, this is a brand-new ULA generator prefix that is nowhere else on my network -- I just plopped it into k3s to provision without any network setup
c
you can do
kubectl get service -A -o wide
to see what those service ClusterIPs are for
kubernetes is always +1 in the range, and dns is +10
for pods, the node local IPAM just allocates them sequentially out of the range given to the host. You should be able to find those in the pod list.
it is odd that there appear to be missing peers though?
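A quick sketch for mapping any of these addresses back to an owner, using the values from this thread (standard kubectl only):
kubectl get svc -A -o wide | grep 'fd62:a374:67c2::'   # service ClusterIPs
# pod IPs; .status.podIPs carries both families for dual-stack pods
kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.status.podIPs[*].ip}{"\n"}{end}' | grep 'fdc0:9e85:e99e::2'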
r
that's the thing though...
$   kubectl get services -o wide -A
NAMESPACE     NAME                      TYPE        CLUSTER-IP             EXTERNAL-IP   PORT(S)                  AGE   SELECTOR
default       kubernetes                ClusterIP   fd62:a374:67c2::1      <none>        443/TCP                  52m   <none>
kube-system   kube-dns                  ClusterIP   10.43.0.10             <none>        53/UDP,53/TCP,9153/TCP   51m   k8s-app=kube-dns
kube-system   metrics-server            ClusterIP   fd62:a374:67c2::bf29   <none>        443/TCP                  51m   k8s-app=metrics-server
flux-system   notification-controller   ClusterIP   fd62:a374:67c2::5202   <none>        80/TCP                   31m   app=notification-controller
flux-system   source-controller         ClusterIP   fd62:a374:67c2::e97a   <none>        80/TCP                   31m   app=source-controller
flux-system   webhook-receiver          ClusterIP   fd62:a374:67c2::1f8    <none>        80/TCP                   31m   app=notification-controller
⎈ june-2023/flux-system  ~/
  $   kubectl get nodes -o wide -A
NAME             STATUS   ROLES                  AGE   VERSION        INTERNAL-IP             EXTERNAL-IP           OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
tpi2n2           Ready    <none>                 48m   v1.26.5+k3s1   2602:814:4000:3::f202   <none>                Debian GNU/Linux 11 (bullseye)   6.1.14           containerd://1.7.1-k3s1
tpi2n4           Ready    <none>                 48m   v1.26.5+k3s1   2602:814:4000:3::f204   <none>                Debian GNU/Linux 11 (bullseye)   6.2.14           containerd://1.7.1-k3s1
tpi2n1           Ready    <none>                 48m   v1.26.5+k3s1   2602:814:4000:3::f201   <none>                Debian GNU/Linux 11 (bullseye)   6.1.14           containerd://1.7.1-k3s1
tpi2n3           Ready    <none>                 48m   v1.26.5+k3s1   2602:814:4000:3::f203   <none>                Debian GNU/Linux 11 (bullseye)   6.1.14           containerd://1.7.1-k3s1
k3smstr-202306   Ready    control-plane,master   52m   v1.26.5+k3s1   2602:814:4000:3::15     2602:814:4000:3::15   Debian GNU/Linux 11 (bullseye)   6.1.21-v8+       containerd://1.7.1-k3s1
c
Why do you have peers for 0, 1, 3, and 6?
there would not normally be gaps in the ranges like that
r
I cannot answer that, sadly, I'm not sure.
c
are you wiping the datastore between tests?
r
deleting the pgsql database, yes. If you try to re-bootstrap against an already-provisioned database, it will fail to start.
c
… unless you pass a token, which is what you’re expected to do if you want to reuse the DB
but in this case wiping it is good
r
Right, I mean, when I completely delete the cluster and reprovision, I run
k3s-agent-uninstall.sh
on all agents, run
k3s-uninstall.sh
on the master, delete the pgsql datastore, create a new database, run the k3s server install, then use
k3s token create
to make a join token
all four nodes are using the same join token -- that's ok, right?
c
yeah thats fine
if you start your server with the same --token you can still use
k3s token create
to make join tokens for the agents. It’ll just keep the DB from erroring out when you uninstall/reinstall but don’t wipe the DB
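For what it's worth, a sketch of that flow, with the k3s flags as documented and placeholder values:
# start the server with a fixed --token so the datastore can be reused across reinstalls
curl -sfL https://get.k3s.io | sh -s - server \
  --token 'some-long-random-secret' \
  --datastore-endpoint 'postgres://user:pass@host:5432/db'
# then mint agent join tokens as usual
k3s token create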
hmm, the server has an external IP but the agents do not? And the external IP is the same as the internal IP?
You might see if dropping that makes things less confused
r
hmm, ok, I'll try that. I have to leave for an appointment but I'll try it tomorrow and letcha know how it works out. Thanks for the help!
caught this in the latest iteration, ha
Jun 07 03:08:42 k3smstr-202306 k3s[50185]: E0607 03:08:42.288556   50185 fieldmanager.go:210] "[SHOULD NOT HAPPEN] failed to update managedFields" err="failed to convert new object (/k3smstr-202306; /v1, Kind=Node) to smd typed: .status.addresses: duplicate entries for key [type=\"InternalIP\"]" VersionKind="/, Kind=" namespace="" name="k3smstr-202306"
every time we say something "should not happen" -- surprise!
I rebuilt using vxlan this morning instead of wireguard-native and everything's working just fine now. There may be some issue with wireguard. If there's someone on the dev team who wants to dig deeper into this, let me know; I'm willing to delete/recreate a few more times before I start putting workloads on it.