# k3s
a
@gray-lawyer-73831 this is a follow-up to my earlier message. I have just gone back and read your earlier message. I have tried removing the --flannel-conf param and instead specifying --advertise-address. The nodes now start without errors, but I am still unable to resolve the names of the pods, either inside or outside the cluster. I can if I stop UFW. But I have these rules configured in UFW for DNS, so it may be other VXLAN traffic:
```
53/tcp       ALLOW OUT   Anywhere
53/udp       ALLOW OUT   Anywhere
53/tcp (v6)  ALLOW OUT   Anywhere (v6)
53/udp (v6)  ALLOW OUT   Anywhere (v6)
```
I also have UFW rules on each node to allow traffic on port 8472 from the other nodes, for example:
```
8472/udp     ALLOW       192.168.100.2    # Allow Flannel VXLAN
8472/udp     ALLOW       192.168.100.3    # Allow Flannel VXLAN
```
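For reference, a default k3s install needs more than DNS and VXLAN open between nodes; a minimal UFW rule set is sketched below. The 192.168.100.0/24 node subnet is from this setup, and 10.42.0.0/16 and 10.43.0.0/16 are the k3s default pod and service CIDRs.
```bash
# Sketch: inbound rules for a default k3s cluster; adjust subnets to your setup.
sudo ufw allow proto tcp from 192.168.100.0/24 to any port 6443   # k3s API server
sudo ufw allow proto udp from 192.168.100.0/24 to any port 8472   # flannel VXLAN
sudo ufw allow proto tcp from 192.168.100.0/24 to any port 10250  # kubelet metrics
sudo ufw allow from 10.42.0.0/16 to any                           # default pod CIDR
sudo ufw allow from 10.43.0.0/16 to any                           # default service CIDR
sudo ufw reload
```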
p
you could use the `--flannel-iface` flag to specify which interface you want to use for the VXLAN
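A minimal sketch of what that looks like in the server invocation; the flags are real k3s flags, but the interface name `enp0s31f6.4000` and the address are illustrative:
```bash
# Pin the node IP and flannel's VXLAN traffic to the private VLAN interface.
k3s server \
  --advertise-address 192.168.100.2 \
  --node-ip 192.168.100.2 \
  --flannel-iface enp0s31f6.4000
```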
a
I'll try that, thanks. Does it just take the interface name? I can't understand why cluster DNS is still failing, though, now that it isn't impacted by the firewall blocking port 53
p
it only needs the name of the interface.
a
ok. Trying that now, thanks
That works (almost). I can now resolve external DNS names, and I can resolve the kubernetes.default.svc domain, but I still can't resolve the pods.
I just changed the Ansible playbook to add that parameter. Will the cluster need to be fully torn down and stood up again for it to work correctly?
p
I am not sure, but restarting the node should be enough
you can try restarting the CoreDNS pod
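For example (a sketch; in k3s the CoreDNS deployment lives in kube-system, and `rollout restart` re-creates its pods without touching anything else):
```bash
# Restart k3s itself on the node, then bounce CoreDNS.
sudo systemctl restart k3s
kubectl -n kube-system rollout restart deployment coredns
```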
a
Actually, I think the DNS might be working despite my dnsutils pod not working as expected. The issue now is the pods trying to connect to HTTPS endpoints. I'm trying to set up Argo CD to sync a private repo, and it looks to be trying now, but auth fails. It looks to be for this reason:
This seems to be something to do with IPv6 outbound traffic being blocked from the pods:
```
git  https://github.com/steddyman/  false  false  false  true  Failed  Unable to connect to repository: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:64344: connect: connection refused"
```
p
Is IPv6 configured on K3s?
a
I've added no specific configuration to enable IPv6, but I believe it is disabled by default?
I've tried disabling UFW on all the hosts, and this still fails with the above transport error
g
`openssl s_client -connect` is one way you'd test OpenSSL, but I'd be keener on validating the host in question here, as there seems to be either some issue looking up the CA, or something similar, in that a known Google IP with HTTPS is unreachable for you but the same host is reachable on :80
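For instance (a sketch; github.com stands in for whichever endpoint is failing from inside the pod):
```bash
# Inspect the TLS handshake and certificate chain to the failing endpoint.
openssl s_client -connect github.com:443 -servername github.com </dev/null
```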
you're also prepending kubernetes.default.svc; don't do that, it's `service-name.namespace.svc.cluster.local`, and `.cluster.local` is just the domain field you fed to whatever kubeadm-init process you're using
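So for the Argo CD repo server that would be something like the below (assuming it is installed in the usual `argocd` namespace and the default cluster domain):
```bash
# Full service FQDN: <service>.<namespace>.svc.<cluster-domain>
nslookup argocd-repo-server.argocd.svc.cluster.local
```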
a
no, I am not appending the full domain name normally. I was only doing that to rule out an issue with the configured DNS suffix. Normally I just issue `nslookup argocd-repo-server`
I am using Hetzner hosts, which on their public IP address require a /32 subnet mask to force all traffic to the gateway. But you can also configure a VLAN interface on the hosts for private network traffic. This is what I have set up, so the node IPs are on the 192.168.100.x range over VLAN 4000, and the external IP is the host's original IP. More information is here: https://docs.hetzner.com/robot/dedicated-server/network/net-config-debian-ubuntu
I'm creating a netplan file for the secondary IP as follows:
```yaml
- hosts: hetzner_hosts
  become: yes
  tasks:
    - name: Update netplan configuration for vlan interface
      copy:
        dest: /etc/netplan/02-netcfg.yaml
        content: |
          network:
            version: 2
            ethernets:
              {{ nic_name }}:
                dhcp4: no
                dhcp6: no
            vlans:
              {{ nic_name }}.{{ private_vlan_id }}:
                id: {{ private_vlan_id }}
                link: {{ nic_name }}
                mtu: 1400
                addresses: [ "{{ internal_ip }}/24", "{{ internal_ipv6 }}/64" ]
      become: yes
    - name: Apply netplan configuration
      command: netplan apply
      become: yes
```
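To verify the VLAN interface came up with the right addresses and MTU, something like this works (the interface name is illustrative, matching the templated `{{ nic_name }}.{{ private_vlan_id }}` above):
```bash
# Confirm the VLAN interface exists with the expected MTU and addresses.
ip -d link show enp0s31f6.4000
ip addr show enp0s31f6.4000
```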
I'm wondering if the issue is perhaps that traffic on the 192.168.100.x range is trying to route to internet addresses, and failing because it is going out the wrong interface?
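One way to check which interface and source address the kernel picks for an internet destination (140.82.121.4 is just an illustrative github.com address):
```bash
# Show the route, egress interface, and source address for a given destination.
ip route get 140.82.121.4
```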
This actually isn't looking like a network issue any longer to me. Even though I am having TLS issues with a dnsutils image, with a curl image I can download from an HTTPS site without issues.
Looking at the results of an argocd app sync command, it always seems to fail on the IPv6 loopback address. The error I see is below. Can I disable IPv6 in K3s to rule this out?
```
FATA[0001] rpc error: code = FailedPrecondition desc = error resolving repo revision: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:58731: connect: connection refused"
```
p
you can disable IPv6 on the node with sysctl: `net.ipv6.conf.all.disable_ipv6 = 1`
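To make that persist across reboots, drop it into a sysctl config file (a sketch; the file name is arbitrary):
```bash
# Disable IPv6 on all interfaces and keep it disabled after reboot.
cat <<'EOF' | sudo tee /etc/sysctl.d/99-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
sudo sysctl --system
```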
n
is there a way to have k3s running on an IPv6-only Hetzner VPS? I've just installed it and I'm getting
```
failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp 34.205.13.154:443: connect: network is unreachable
```
p
How did you configure the DNS? You should use DNS64 and NAT64, because some public services don't have an IPv6 address.
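A quick way to see what DNS64 does, without standing up your own resolver, is to query a public DNS64 service (a sketch using Google's public DNS64 resolver, which synthesizes AAAA records for IPv4-only hosts; it still needs a NAT64 gateway on the well-known 64:ff9b::/96 prefix to actually carry the traffic):
```bash
# Ask a public DNS64 resolver for a synthesized AAAA record for an
# IPv4-only host such as registry-1.docker.io.
dig AAAA registry-1.docker.io @2001:4860:4860::6464
```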
n
no thanks, I googled how to set those things up multiple times, but I didn't understand the topic. Right now it's working with registry.ipv6.docker.com. If something fundamental breaks, I'll pay for the IPv4
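For reference, pointing k3s at that mirror for docker.io pulls is done with a registries.yaml; a minimal sketch:
```bash
# Mirror docker.io pulls through Docker's IPv6-reachable registry
# endpoint, then restart k3s so containerd picks it up.
sudo mkdir -p /etc/rancher/k3s
cat <<'EOF' | sudo tee /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://registry.ipv6.docker.com"
EOF
sudo systemctl restart k3s
```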