# k3s
f
Just as a follow-up. These are the labels and annotations on the node:
```
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=phchbs-st32018
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/master=true
                    node.kubernetes.io/instance-type=k3s
Annotations:        alpha.kubernetes.io/provided-node-ip: <MY-IPV4>
                    csi.volume.kubernetes.io/nodeid: {"driver.longhorn.io":"phchbs-st32018"}
                    flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"7e:65:80:f9:9f:cb"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: <MY-IPV4>
                    k3s.io/hostname: phchbs-st32018
                    k3s.io/internal-ip: <MY-IPV4>
                    k3s.io/node-args:
                      ["server","--disable","traefik","--flannel-iface","ens160","--data-dir","/opt/k3s","--prefer-bundled-bin","--resolv-conf","/etc...
                    k3s.io/node-config-hash: VWJ7GMRNIEVMZ5NDM7NJVQP7REBOJBVASIV4F273RFU4UY3GQXIA====
                    k3s.io/node-env:
                      {"K3S_DATA_DIR":"/opt/k3s/data/3fcd4fcf3ae2ba4d577d4ee08ad7092538cd7a7f0da701efa2a8807d44a25f66","K3S_KUBECONFIG_MODE":"644"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
```
c
Did you upgrade directly from 1.26 to 1.29, or did you step through 1.27 and 1.28 first?
Have you configured dual-stack cluster-cidr and service-cidr ranges, or is the cluster ipv4 only?
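(For reference: a dual-stack k3s server passes comma-separated IPv4,IPv6 ranges to both flags, while an ipv4-only cluster passes just the IPv4 ranges. A minimal sketch, with placeholder IPv6 prefixes:)
```
# Sketch of a dual-stack k3s server invocation; the IPv6 prefixes below are
# placeholders, not recommendations for this particular cluster.
k3s server \
  --cluster-cidr=10.42.0.0/16,2001:cafe:42::/56 \
  --service-cidr=10.43.0.0/16,2001:cafe:43::/112
```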
f
I upgraded directly from 1.26 to 1.29. To upgrade I usually run `k3s-killall.sh` first, and then re-run the installer with the desired new k3s version. Should I go through each version instead? The cluster is ipv4 only, and I checked from the k3s service logs (systemd) that only ipv4 CIDR ranges are passed as `--service-cluster-ip-range` or `--cluster-cidr`.
When attempting to upgrade to a new version of K3s, the Kubernetes version skew policy applies. Ensure that your plan does not skip intermediate minor versions when upgrading.
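In practice, stepping through each minor with the install script looks roughly like the sketch below; the version strings are placeholders, so substitute real patch releases and wait for the node to come back Ready between steps.
```
# Placeholder versions: upgrade one minor at a time, checking node health in between.
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.27.x+k3s1 sh -
kubectl get nodes        # wait for Ready on 1.27
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.28.x+k3s1 sh -
kubectl get nodes        # wait for Ready on 1.28
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.29.x+k3s1 sh -
```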
f
To give you a better overview, the node is behind a company proxy configured using the default `http_proxy`/`https_proxy` variables. Since the upgrade, it seems that all the networking operations are trying to resolve the proxy domain using ipv6 and that is failing. The workaround I found so far is to use the proxy IP directly instead of the domain, but I'm not sure it won't change in the future, and I didn't have this issue before the upgrade although I was using the same proxy.
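For context on how k3s consumes those variables: the install script copies the proxy settings from the install environment into an env file next to the systemd unit, and k3s/containerd pick them up from there. A rough sketch of such a file, with placeholder values:
```
# Illustrative /etc/systemd/system/k3s.service.env; the proxy host and
# NO_PROXY entries are placeholders, adjust to the real environment.
HTTP_PROXY=http://myproxy.com:8080
HTTPS_PROXY=http://myproxy.com:8080
NO_PROXY=127.0.0.0/8,10.42.0.0/16,10.43.0.0/16,.cluster.local
```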
c
I can’t say I’ve ever seen the outcome you’re describing due to skipping minors though.
What is the output of `ip addr` in the pods? Things running in pods should be smart enough to not try to connect to ipv6 addresses if they don’t have that address family available. However, you said you’re getting “could not resolve host” errors, not “could not connect to host”… which suggests that DNS lookups are failing?
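One way to capture both data points without a shell on the pod, assuming kubectl access and an image that ships the usual tools (pod name and namespace are placeholders):
```
# Placeholders: substitute a real pod name and namespace.
kubectl exec -n default mypod -- ip addr
kubectl exec -n default mypod -- nslookup myproxy.com
```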
f
This is `ip addr` from a random nginx pod in the cluster (sorry for the screenshot but I can't copy/paste on this node)
> However, you said you’re getting “could not resolve host” errors, not “could not connect to host”… which suggests that DNS lookups are failing?
Yes, it's failing to resolve the domain of the HTTP proxy, but that doesn't happen if I force ipv4, for example by running `curl -4 http://myproxy.com`. It seems that even for DNS lookups it's trying to resolve `AAAA` records instead of `A`, and I don't have any `AAAA` entry for the proxy.
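A quick way to separate what the DNS server returns from what the client asks for, assuming dig and getent are available in the pod image (hostname is a placeholder):
```
# Record types as the DNS server sees them (placeholder hostname).
dig A myproxy.com +short
dig AAAA myproxy.com +short
# What the libc resolver (getaddrinfo) hands back to clients like curl.
getent ahosts myproxy.com
```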
Btw, from the results of `ip addr` it seems like the pod has an ipv6 address as well, but why is that the case? I never configured dual-stack here
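Worth checking whether that address is only link-local: the kernel assigns an fe80::/64 address to every interface whenever IPv6 isn't disabled, independent of any Kubernetes dual-stack configuration, so its presence alone doesn't imply a routable pod IPv6 address. For example (pod name is a placeholder):
```
# A "scope link" fe80:: address is kernel-assigned; only a global-scope
# address would indicate dual-stack pod networking.
kubectl exec mypod -- ip -6 addr show dev eth0
```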
c
This is all client behavior and not controlled by Kubernetes, but normally it will resolve both A and AAAA and then use whichever ones are appropriate for the available address families.
You might try adding this to /etc/sysctl.conf and rebooting?
```
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```
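If a reboot isn't convenient, the same settings can usually be applied at runtime and then persisted:
```
# Apply immediately ...
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
# ... then reload persisted settings from /etc/sysctl.conf and /etc/sysctl.d
sysctl --system
```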
f
> This is all client behavior and not controlled by Kubernetes
I see the same behaviour also for Kubernetes operations like image pulling though.
c
You said it wasn’t affecting things that run on the node itself?
Image pulls happen outside Kubernetes, in containerd, which runs on the node, not in a container.
If it is happening on the node itself, then you have an OS-level problem.
Try those sysctls and see what happens
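One way to confirm that is to exercise k3s's bundled containerd directly on the node, outside of any pod; if the proxy's hostname fails to resolve there too, the problem is at the node/containerd level. The image below is just an example:
```
# Pull via the embedded CRI tooling ...
k3s crictl pull docker.io/library/nginx:latest
# ... or via the embedded containerd client.
k3s ctr images pull docker.io/library/nginx:latest
```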
f
You're right, I was considering containerd as part of Kubernetes itself, but I get your point here. I can confirm though that it doesn't happen on the OS, outside of containerd. The same commands I tested, such as `curl` or `ping`, would fail in the pods but not on the node itself, and I get the same DNS lookup errors in containerd when pulling images.
> try those sysctls and see what happens
I'll give it a try tomorrow and let you know. Thanks a lot for your help here, really appreciated!
@creamy-pencil-82913 just wanted to give you an update about this. In the end I restarted the server and I wasn't able to reproduce the issue anymore, regardless of the `sysctl` ipv6 configs. Still not sure what caused it in the first place, but it wasn't related to K3s in the end. Thanks again for your help 👍