# k3s
m
Restarting the k3s service didn't help
b
Sounds like the token expired, so it doesn't trust the node anymore.
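(For reference, one way to sanity-check the token theory is to compare the cluster join token on a healthy server with whatever the problem node was configured with. A minimal sketch, assuming default k3s install paths; the config file may not exist if no config file was used:)

```bash
# On a healthy server: the cluster join token lives here by default
sudo cat /var/lib/rancher/k3s/server/node-token

# On the problem node: the token/server it was started with is usually in the
# systemd env file created by the install script, or in the optional config file
sudo cat /etc/systemd/system/k3s.service.env
sudo cat /etc/rancher/k3s/config.yaml   # may not exist
```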
c
That doesn't sound like the actual terminal error that is preventing it from starting. Please post the full log.
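(A quick way to pull the full service log on a systemd host, using standard systemctl/journalctl commands, nothing k3s-specific:)

```bash
# Check whether the service is actually running or crash-looping
sudo systemctl status k3s

# Dump the full k3s log since the last boot, without a pager
sudo journalctl -u k3s -b --no-pager

# Or follow it live while reproducing the problem
sudo journalctl -u k3s -f
```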
m
@creamy-pencil-82913 Having a look at the logs, I guess the issue is somewhere here:
```
Apr 25 15:19:00 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:00Z" level=info msg="Applying CRD addons.k3s.cattle.io"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:00Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.843673   13037 kubelet_node_status.go:112] "Node was previously registered" node="k3s-node-1"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.843889   13037 kubelet_node_status.go:76] "Successfully registered node" node="k3s-node-1"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.875058   13037 kuberuntime_manager.go:1529] "Updating runtime config through cri with podcidr" CIDR="10.42.0.0/24"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.893932   13037 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.42.0.0/24"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.894375   13037 memory_manager.go:170] "Starting memorymanager" policy="None"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.894404   13037 state_mem.go:35] "Initializing new in-memory state store"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.894619   13037 state_mem.go:75] "Updated machine memory state"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.923971   13037 setters.go:568] "Node became not ready" node="k3s-node-1" condition={"type":"Ready","status":"False","lastHeartbeatTime":"2024-04-25T15:19:00Z","lastTransitionTime":"2024-04-25T15:19:00Z","reason":"KubeletNotReady","message":"container runtime status check may not have completed yet"}
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.955484   13037 manager.go:479] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.955685   13037 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.964115   13037 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="daed2709eead47ee33558a1ffc90dc2949f2af9c9e376cd415b5c9e80ded35f0"
Apr 25 15:19:00 k3s-node-1 k3s[13037]: I0425 15:19:00.989103   13037 serving.go:387] Generated self-signed cert in-memory
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.023900   13037 serving.go:387] Generated self-signed cert in-memory
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.182932   13037 serving.go:387] Generated self-signed cert in-memory
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.205819   13037 kubelet_node_status.go:497] "Fast updating node status as it just became ready"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.245950   13037 apiserver.go:52] "Watching apiserver"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:01Z" level=info msg="Applying CRD etcdsnapshotfiles.k3s.cattle.io"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.306314   13037 kube.go:144] Node controller sync successful
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.306477   13037 vxlan.go:141] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
Apr 25 15:19:01 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:01Z" level=info msg="Stopped tunnel to 127.0.0.1:6443"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:01Z" level=info msg="Connecting to proxy" url="wss://ip:6443/v1-k3s/connect"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:01Z" level=info msg="Proxy done" err="context canceled" url="wss://127.0.0.1:6443/v1-k3s/connect"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.337874   13037 topology_manager.go:215] "Topology Admit Handler" podUID="542d7059-5858-4a44-bc36-8e645c44f276" podNamespace="kube-system" podName="local-path-provisioner-84db5d44d9-lpx2h"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.337958   13037 topology_manager.go:215] "Topology Admit Handler" podUID="2eb23ab8-30c4-4770-9008-ee9e5ed7fb44" podNamespace="kube-system" podName="metrics-server-67c658944b-s7blc"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: I0425 15:19:01.337997   13037 topology_manager.go:215] "Topology Admit Handler" podUID="46770914-4230-4eb6-a033-b4158a407597" podNamespace="kube-system" podName="coredns-6799fbcd5-7dtdb"
Apr 25 15:19:01 k3s-node-1 k3s[13037]: time="2024-04-25T15:19:01Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
```
c
I don’t see that it’s exited. Unless you cut off the end, those look like fairly normal logs from a running k3s server. What makes you think that it’s not joining the cluster?
m
Because I can’t see it listed as one of the servers; running kubectl get nodes doesn’t show it
c
are you checking that locally, or from another node? Are you using etcd, or an external db?
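(A hedged sketch of how one might answer those two questions on the node itself, assuming the default k3s data and config paths; adjust if --data-dir was changed:)

```bash
# Embedded etcd keeps its data here; if this directory exists and is non-empty,
# the node was started as an etcd member rather than against an external DB
sudo ls /var/lib/rancher/k3s/server/db/etcd

# An external datastore would show up as datastore-endpoint in the config or env
sudo grep -r "datastore-endpoint" /etc/rancher/k3s/ /etc/systemd/system/k3s.service.env 2>/dev/null

# From a working server, k3s labels the etcd members
kubectl get nodes -l node-role.kubernetes.io/etcd=true
```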
m
I get the same output checking locally and from one of the working servers: only two servers are visible when there should be three. I'm using HA with embedded etcd
I ended up destroying the VM and creating a new one. Running the join command on a clean VM worked; the server has joined the rest of the servers
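(To confirm the rebuilt server is really back, a quick check with plain kubectl from any working server:)

```bash
# All three servers should now be listed and Ready, with their roles shown
kubectl get nodes -o wide

# Watch the status while the new node finishes joining
kubectl get nodes -w
```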
c
if the third server didn’t even show up in the cluster, it sounds like you deleted it at some point? If that’s the case you would have had to delete the DB files and point it at one of the other cluster members to rejoin.
If you’d just left it alone, it should have been listed in the cluster as NotReady.
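(A sketch of that recovery path, assuming default k3s paths and embedded etcd; <existing-server-ip> and the token are placeholders, and writing config.yaml this way overwrites any existing config file:)

```bash
# On the broken server: stop k3s and clear the stale embedded-etcd state
sudo systemctl stop k3s
sudo rm -rf /var/lib/rancher/k3s/server/db

# Point it at a surviving cluster member and restart. The token comes from a
# healthy server's /var/lib/rancher/k3s/server/node-token
sudo tee /etc/rancher/k3s/config.yaml <<'EOF'
server: https://<existing-server-ip>:6443
token: <cluster-token>
EOF
sudo systemctl start k3s
```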
m
Ah okay, will keep it in mind in case the issue shows up again, thanks 👍