breezy-autumn-81048

03/31/2023, 2:37 PM
Hi all, I installed K3s (k3s version v1.26.2+k3s1 (ea094d1d)) on node1. After that I tried to add additional master and agent nodes, without success. The k3s service shows as active and running, but on the agent node I ran:
INSTALL_K3S_SKIP_DOWNLOAD=true K3S_TOKEN=SECRET INSTALL_K3S_EXEC="--resolv-conf=/etc/rancher/k3s/resolv.conf --selinux" sh install.sh agent --server https://node1.example.net:6443
and it seems the node was configured as a master rather than as an agent:
k3s kubectl get nodes
NAME                STATUS   ROLES                  AGE     VERSION
node4.example.net   Ready    control-plane,master   2m20s   v1.26.2+k3s1
It should have joined node1, and its role should be "agent". On node1 (master):
k3s kubectl get nodes
NAME                STATUS   ROLES                       AGE    VERSION
node1.example.net   Ready    control-plane,etcd,master   3d4h   v1.26.2+k3s1
On the other two master nodes I ran:
INSTALL_K3S_SKIP_DOWNLOAD=true K3S_TOKEN=SECRET INSTALL_K3S_EXEC="--resolv-conf=/etc/rancher/k3s/resolv.conf --selinux" sh install.sh server --server https://node1.example.net:6443
But again, these nodes did not join node1. Is this a bug?

hundreds-evening-84071

03/31/2023, 3:27 PM
have you looked at
journalctl -u k3s
on additional nodes that you are trying to join?

breezy-autumn-81048

04/03/2023, 5:54 AM
Hi @hundreds-evening-84071, Sure, here are the logs from one of the nodes that should be an additional master node in the cluster:
: desc = \"transport: Error while dialing dial tcp 10.154.146.193:2379: connect: no route to host\""}
ERRO[0023] Failed to get member list from etcd cluster. Will assume this member is already added
INFO[0023] Starting etcd to join cluster with members [node1.example.net-e0ce15fe=https://10.154.146.193:2380 node3.example.net-f5c8229b=https://10.158.146.17:2380]
{"level":"info","ts":"2023-04-03T07:49:14.530+0200","caller":"embed/etcd.go:124","msg":"configuring peer listeners","listen-peer-urls":["https://10.158.146.17:2380","https://127.0.0.1:2380"]}
{"level":"info","ts":"2023-04-03T07:49:14.530+0200","caller":"embed/etcd.go:482","msg":"starting with peer TLS","tls-info":"cert = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt, key = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.key, client-cert=, client-key=, trusted-ca = /var/lib/rancher/k3s/server/tls/etcd/peer-ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":[]}
{"level":"info","ts":"2023-04-03T07:49:14.530+0200","caller":"embed/etcd.go:132","msg":"configuring client listeners","listen-client-urls":["https://10.158.146.17:2379","https://127.0.0.1:2379"]}
{"level":"info","ts":"2023-04-03T07:49:14.531+0200","caller":"embed/etcd.go:306","msg":"starting an etcd server","etcd-version":"3.5.5","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.19.6","go-os":"linux","go-arch":"amd64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":false,"name":"node3.example.net-f5c8229b","data-dir":"/var/lib/rancher/k3s/server/db/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/rancher/k3s/server/db/etcd/member","force-new-cluster":false,"heartbeat-interval":"500ms","election-timeout":"5s","initial-election-tick-advance":true,"snapshot-count":10000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["https://10.158.146.17:2380","https://127.0.0.1:2380"],"advertise-client-urls":["https://10.158.146.17:2379"],"listen-client-urls":["https://10.158.146.17:2379","https://127.0.0.1:2379"],"listen-metrics-urls":["http://127.0.0.1:2381"],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"node1.example.net-e0ce15fe=https://10.154.146.193:2380,node3.example.net-f5c8229b=https://10.158.146.17:2380","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":true,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2023-04-03T07:49:14.532+0200","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/rancher/k3s/server/db/etcd/member/snap/db","took":"1.098286ms"}
INFO[0023] Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error
{"level":"warn","ts":"2023-04-03T07:49:14.536+0200","caller":"etcdserver/cluster_util.go:79","msg":"failed to get cluster response","address":"https://10.154.146.193:2380/members","error":"Get \"https://10.154.146.193:2380/members\": dial tcp 10.154.146.193:2380: connect: no route to host"}
{"level":"info","ts":"2023-04-03T07:49:14.544+0200","caller":"embed/etcd.go:371","msg":"closing etcd server","name":"node3.example.net-f5c8229b","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://10.158.146.17:2379"]}
{"level":"info","ts":"2023-04-03T07:49:14.545+0200","caller":"embed/etcd.go:373","msg":"closed etcd server","name":"node3.example.net-f5c8229b","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://10.158.146.17:2379"]}
FATA[0023] ETCD join failed: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs
# nc 10.154.146.193 2380
Ncat: No route to host.
# nc 10.154.146.193 2379
Ncat: No route to host.
But what does that really mean? On K3s version 1.24, I could add nodes to the cluster without any problems at all.
iptables exists only on master node1; firewall rules were created in our internal system to allow TCP traffic on ports 2379 and 2380. Other ports:
nc -v 10.154.146.193 6443
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.154.146.193:6443.

nc -vu 10.154.146.193 8472
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.154.146.193:8472.

nc -vu 10.154.146.193 10250
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.154.146.193:10250.

nc -vu 10.154.146.193 2380   #UDP
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.154.146.193:2380.

nc -v 10.154.146.193 2380
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: No route to host.

nc -v 10.154.146.193 2379
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: No route to host.

nc -vu 10.154.146.193 2379   #UDP
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.154.146.193:2379.
Port 51820 is not in use.
nc -v 10.154.146.193 10250
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: No route to host.
@creamy-pencil-82913
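A note on reading the checks above: `nc -u` prints "Connected" for UDP without exchanging any packets, so the UDP results do not show that a port is reachable. For TCP, "no route to host" on ports 2379, 2380 and 10250 while 6443 connects on the same address often points to something rejecting those specific ports rather than a missing route, though nothing in this thread confirms that. Below is a minimal Python sketch for classifying each TCP port from the joining node. The function name `probe_tcp` and the 3-second timeout are illustrative assumptions, not part of K3s.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Hypothetical helper for illustration only. Returns one of:
    "open", "refused", "no-route", "timeout", or "error: <details>".
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably being dropped silently.
        return "timeout"
    except ConnectionRefusedError:
        # Host answered with a reset: reachable, but nothing is listening.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is what ncat reports as "No route to host".
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no-route"
        return f"error: {exc}"
```

For example, calling `probe_tcp("10.154.146.193", 2380)` on the node that fails to join should return "no-route" in the situation shown in the logs, and "open" once the etcd peer port is reachable. Seeing "open" on 6443 next to "no-route" on 2379 and 2380 would suggest looking at per-port filtering between the nodes before looking at K3s itself.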

creamy-pencil-82913

04/03/2023, 7:57 AM
Why don't you have a route to that host?
I can't answer that for you. This is a basic networking issue. Why doesn't this node have a route to other nodes?

breezy-autumn-81048

04/03/2023, 8:05 AM
Thank you for your reply! Just to clarify, K3s doesn't create any rules that could block network traffic?
It would be nice to understand whether there is any need to modify the default iptables rules that K3s creates.

creamy-pencil-82913

04/03/2023, 9:18 AM
I've never seen anyone need to do that, no

breezy-autumn-81048

04/03/2023, 2:04 PM
So, I currently have two K3s clusters: one with version v1.26 and another with v1.24. 1.26 is compatible with RHEL 8.7, but somehow I can't add nodes to the master node; I get "no route to host". 1.24 is not compatible with RHEL 8.7. However, I could successfully create a cluster with v1.24 and add nodes. But now the problem is that the actions-runner-controller can't access GHES:
"ERROR runner Failed to get new registration token {"runner": "github-actions-runner-small-001-f6qfz-49wvb", "error": "failed to create registration token: Post "https://test-github.example.net/api/v3/orgs/myorg/actions/runners/registration-token/": could not refresh installation id 5's token: could not get access_tokens from GitHub API for installation ID 5: dial tcp: lookup test-github.example.net: i/o timeout"}"
The same thing happens with cert-manager:
E0403 13:54:46.663175       1 reflector.go:140] k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: Get "https://10.43.0.1:443/api/v1/namespaces/cert-manager/secrets?fieldSelector=metadata.name%3Dcert-manager-webhook-ca&resourceVersion=49236": dial tcp 10.43.0.1:443: i/o timeout
On version 1.26 cert-manager and ARC work fine, without any issues. The only problem there is that I can't add nodes to the cluster. iptables and firewalld are not installed.

creamy-pencil-82913

04/03/2023, 9:30 PM
somehow I can’t add nodes to the master node, I get “no route to host”.
I don’t think it has anything to do with the version. It has to do with the nodes you’re trying to add not being able to reach the node you’re trying to add them to
Although the other errors appear to be coming from pods, not from k3s itself trying to join the cluster? You’ve got a lot going on here. Maybe try to solve one problem at a time?

breezy-autumn-81048

04/06/2023, 11:42 AM
Thank you for your reply and help! The issue was solved after a host restart. Most probably something was cached or stuck, and that is why I couldn't add nodes to the cluster...