
ambitious-computer-67727

03/29/2023, 2:50 PM
Hey guys, I can't get my pods in K3D to communicate with services or other pods. I can access external services like google.com fine. All my services are running as ClusterIP. I can access one of my services (the dashboard below) on port 80 from a pod inside the cluster, but nothing else, no matter whether I try the cluster IP of the service, the pod IP, or the node IP. nslookup resolves to the ClusterIP and seems fine. Traceroute ends at host.k3d.internal. I can't netcat or curl anything other than port 80. Do I have to change some network-specific settings to get ClusterIP communication to work in K3D/K3s? Can anybody give me any pointers? How I created the cluster:
k3d cluster create --config config.yaml
config.yaml:
apiVersion: k3d.io/v1alpha4 # this will change in the future as we make everything more stable
kind: Simple
servers: 1
agents: 1
ports:
  - port: 8099:80
    nodeFilters:
      - loadbalancer
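For reference, this mapping forwards port 8099 on my host to port 80 on k3d's load balancer, so a host-side smoke test would look something like the line below (whether it actually reaches a workload depends on the ingress setup, which I haven't included here):
curl -v http://localhost:8099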
I can access the dashboard service on port 80 from a container in the cluster:
bash-5.1# nc -zv  dashboard-localdev 80
dashboard-localdev (10.43.84.72:80) open

bash-5.1# nslookup dashboard-localdev
Server:		10.43.0.10
Address:	10.43.0.10#53

Name:	dashboard-localdev.link.svc.cluster.local
Address: 10.43.84.72
I can't access the postgres service (or any other service I've tried) on another port:
bash-5.1# nc -zv postgresdb 5432
bash-5.1# 

bash-5.1# psql -h postgresdb -U postgres localdev_db
psql: error: connection to server at "postgresdb" (10.43.164.110), port 5432 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
CoreDNS is throwing some warnings but seems healthy:
❯ kubectl -n kube-system get po | grep coredns
coredns-597584b69b-wwkr6                  1/1     Running     0          52m

❯ kubectl -n kube-system logs coredns-597584b69b-wwkr6
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[INFO] plugin/ready: Still waiting on: "kubernetes"
.:53
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[INFO] plugin/reload: Running configuration SHA512 = b941b080e5322f6519009bb49349462c7ddb6317425b0f6a83e5451175b720703949e3f3b454a24e77f3ffe57fd5e9c6130e528a5a1dd00d9000e4afd6c1108d
CoreDNS-1.9.4
linux/amd64, go1.19.1, 1f0a41a
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
....
Some more info from the container I'm trying to access the postgres service or pod from:
bash-5.1# nslookup postgresdb
Server:		10.43.0.10
Address:	10.43.0.10#53

Name:	postgresdb.link.svc.cluster.local
Address: 10.43.164.110

bash-5.1# ip route
default via 10.42.1.1 dev eth0 
10.42.0.0/16 via 10.42.1.1 dev eth0 
10.42.1.0/24 dev eth0 scope link  src 10.42.1.15

bash-5.1# cat /etc/resolv.conf 
search link.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5

bash-5.1# traceroute postgresdb
traceroute to postgresdb (10.43.164.110), 30 hops max, 46 byte packets
 1  10.42.1.1 (10.42.1.1)  0.014 ms  0.017 ms  0.016 ms
 2  host.k3d.internal (172.29.0.1)  0.014 ms  0.017 ms  0.014 ms
 3  *  *  *
Services:
❯ kubectl get svc
NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
postgresdb                     ClusterIP   10.43.164.110   <none>        5432/TCP                       13m
dashboard-localdev             ClusterIP   10.43.84.72     <none>        80/TCP                         13m
Pods:
❯ kubectl get po -o wide
NAME                                            READY   STATUS    RESTARTS   AGE     IP           NODE                            NOMINATED NODE   READINESS GATES
dashboard-localdev-6cf4fcf66b-ctdg9             1/1     Running   0          16m     10.42.1.9    k3d-linklocalcluster-agent-0    <none>           <none>
postgresdb-d78d6cbf6-cmdmm                      1/1     Running   0          16m     10.42.0.11   k3d-linklocalcluster-server-0   <none>           <none>
multitool                                       1/1     Running   0          6m59s   10.42.1.15   k3d-linklocalcluster-agent-0    <none>           <none>
Nodes:
❯ kubectl get nodes -o wide
NAME                            STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE   KERNEL-VERSION      CONTAINER-RUNTIME
k3d-linklocalcluster-agent-0    Ready    <none>                 16m   v1.25.7+k3s1   172.29.0.3    <none>        K3s dev    5.19.0-35-generic   containerd://1.6.15-k3s1
k3d-linklocalcluster-server-0   Ready    control-plane,master   16m   v1.25.7+k3s1   172.29.0.2    <none>        K3s dev    5.19.0-35-generic   containerd://1.6.15-k3s1
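For reference, the multitool pod I'm testing from is just a throwaway debug pod, created roughly like this (the exact image is my choice of network toolbox; wbitt/network-multitool is one example):
kubectl run multitool --image=wbitt/network-multitool
kubectl exec -it multitool -- bash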

creamy-pencil-82913

03/29/2023, 4:13 PM

ambitious-computer-67727

03/30/2023, 6:10 AM
Thanks @creamy-pencil-82913. Would the solution be to pass the --prefer-bundled-bin flag to K3s like below? I gave it a try but it didn't make any difference. I also tried slightly different syntax like --prefer-bundled-bin=true etc. config.yaml:
options:
  k3s:
    extraArgs:
      - arg: --prefer-bundled-bin
        nodeFilters:
          - server:*
          - agent:*
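One way to sanity-check that the flag actually reached the k3s process (my own guess at a verification step, since k3d nodes are just docker containers):
docker top k3d-linklocalcluster-server-0
# the k3s server command line shown should include --prefer-bundled-bin if the extraArgs took effect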
Versions I'm running:
❯ k3d --version
k3d version v5.4.9
k3s version v1.25.7-k3s1 (default)

❯ iptables --version
iptables v1.8.7 (nf_tables)

creamy-pencil-82913

03/30/2023, 6:19 AM
no, just use 1.25.8
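e.g. recreate the cluster pinned to the newer image, something like:
k3d cluster create --config config.yaml --image rancher/k3s:v1.25.8-k3s1
(or set image: rancher/k3s:v1.25.8-k3s1 in the config file)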

ambitious-computer-67727

03/30/2023, 7:13 AM
I'm running 1.25.8 on the nodes now (set image: rancher/k3s:v1.25.8-k3s1 in my config.yaml) but still can't communicate between services in the cluster, except with the one that has port 80 exposed.
bash-5.1# nc -zv postgresdb 5432
bash-5.1# nc -zv dashboard-localdev 80
dashboard-localdev (10.43.151.120:80) open
I can connect fine to external services:
bash-5.1# nc -zv google.com 443
google.com (142.250.74.142:443) open
Do you have any other ideas for what I could look into @creamy-pencil-82913?
❯ kubectl  get nodes
NAME                            STATUS   ROLES                  AGE     VERSION
k3d-linklocalcluster-server-0   Ready    control-plane,master   3m49s   v1.25.8+k3s1
k3d-linklocalcluster-agent-0    Ready    <none>                 3m43s   v1.25.8+k3s1

creamy-pencil-82913

03/30/2023, 7:52 AM
a couple of nc commands with no output don't really provide much info to work from

ambitious-computer-67727

03/30/2023, 8:04 AM
I was wondering whether there was some strange setting on my computer, but I have replicated this on both my home and work machines, both running Ubuntu 22.04.2. Since k3s runs on my docker network, could some setting in my host machine's iptables, /etc/resolv.conf, /etc/hosts or similar be messing up the inter-pod communication in k3s? DNS seems to resolve fine in the cluster, but what should the routes look like from inside the pods, for example? This is how it looks for me:
bash-5.1# ip route
default via 10.42.1.1 dev eth0 
10.42.0.0/16 via 10.42.1.1 dev eth0 
10.42.1.0/24 dev eth0 proto kernel scope link src 10.42.1.15

creamy-pencil-82913

03/30/2023, 8:08 AM
I would probably start a little less low-level. Does the service have pods running? Are they on the same nodes? Can you reach the ports directly on any of the pods if you bypass the service?
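e.g. something along these lines (names and the pod IP taken from your earlier output):
kubectl get endpoints postgresdb   # does the service have ready endpoints at all?
nc -zv 10.42.0.11 5432             # postgres pod IP from kubectl get po -o wide, bypassing the service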

ambitious-computer-67727

03/30/2023, 8:09 AM
I've tried it all:
• All services have pods running successfully
• I've tried running the services both on different nodes and on the same node; it doesn't make any difference
• I can't reach the ports on the pods directly
• I can't reach the ports on the host (but they are not exposed as NodePorts either)

creamy-pencil-82913

03/30/2023, 8:10 AM
did you customize your docker networking or anything?
I’m not sure exactly how k3d sets up networking between the nodes but I don’t generally hear anyone complaining about it not working.

ambitious-computer-67727

03/30/2023, 8:11 AM
No, far from it. I'm no network expert, but I have a few docker networks up and running from running some apps locally with docker-compose:
❯ docker network ls
NETWORK ID     NAME                   DRIVER    SCOPE
0b7f8d70e980   bridge                 bridge    local
d66b00cbe5c7   data-api_app-network   bridge    local
7be6054f08fe   host                   host      local
a7f9b31a8512   k3d-linklocalcluster   bridge    local
5dcc66fe03dd   none                   null      local
I've previously deleted all the ones docker would let me delete, but it didn't make any difference
Agreed, it's really strange. Pod communication is such a core feature that it feels like I'm missing something really basic

creamy-pencil-82913

03/30/2023, 8:12 AM
did you deploy anything other than just your DB?

ambitious-computer-67727

03/30/2023, 8:14 AM
Yeah, I'm trying to get our app (currently running on AKS) running in K3s, so I have 6 services running: postgresdb, redis, rabbitmq, and some internal Node.js apps

creamy-pencil-82913

03/30/2023, 8:15 AM
did you deploy any network policies?
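quick way to check:
kubectl get networkpolicies -A
# k3s enforces NetworkPolicy via its embedded kube-router controller, so an overly broad policy can block exactly this kind of pod-to-pod traffic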

ambitious-computer-67727

03/30/2023, 8:16 AM
Ah, I did actually. One moment, that could be it

creamy-pencil-82913

03/30/2023, 8:17 AM
you might delete all the NPs just to be sure they’re not overly generous in what pods they’re matching
and then if that works re-add them one by one and see when things break
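something along these lines (assuming your workloads live in the link namespace seen in your nslookup output):
kubectl get networkpolicies -n link -o yaml > netpol-backup.yaml   # keep a copy first
kubectl delete networkpolicies --all -n link
# then kubectl apply the saved policies back one at a time and retest after each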

ambitious-computer-67727

03/30/2023, 8:19 AM
Thanks Brad, you're a genius. I had completely overlooked that I had deployed a NP that was interfering
Thanks a ton for your help

creamy-pencil-82913

03/30/2023, 8:21 AM
np, gl!