# k3s
c
What node do those IPs correspond to? Error dialing backend usually happens when the apiserver is trying to connect to a kubelet to do kubectl exec or logs. Do you have node-ip and node-external-ip set properly? Are you using the built-in cloud controller, or a different one?
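A quick way to check what addresses each node is advertising (these are what the server dials when it proxies exec/logs down to the kubelet):
kubectl get nodes -o wide
The INTERNAL-IP and EXTERNAL-IP columns there should match what you expect for each machine.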
e
So these are pretty much stock k3s installs. Each node only has one IP. Those IPs correspond to the old machine's IP. So somehow, even though I deleted the old node when I added the new node in (with the same name), the old cert was used instead of new certs being generated.
c
You didn’t provide any information on when you’re getting that error. What’s the context here?
The apiserver serving cert is stored in the cluster, and is shared across all the servers. The kubelet certs are generated on startup every time so there’s no way it would persist across different hosts. I suspect that what you’re running into is that the serving cert isn’t valid in combination with some other setting you’re overriding.
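If you want to see exactly which cert the kubelet on a given node is presenting on 10250, something like this should dump the SANs it was issued with (NODE_IP is just a placeholder for that node's address):
openssl s_client -connect NODE_IP:10250 </dev/null 2>/dev/null | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'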
e
Oh, sorry. Like you said, exec and logs are the two places I've noticed it. It's port 10250 that it's complaining about, which is the kubelet API, isn't it?
c
Can you provide the full logs from the servers when this error occurs, along with any args or config file context you’re using to set up these nodes?
ok yeah so in that case that’s the kubelet cert
That means you need to look at the node where the pod is running, not necessarily the server
The kubelet cert will be valid for the specified --node-ip and --node-external-ip addresses.
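If you ever need to pin those addresses explicitly, you can either pass the flags at install time or drop them in /etc/rancher/k3s/config.yaml on that node before starting K3s, e.g. (the addresses here are just made-up examples):
node-ip: 10.0.0.5
node-external-ip: 203.0.113.5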
Did you install K3s from our install script or repo, or are you using some distro's packaging of it? I think I remember seeing someone run into something similar using TrueNAS's modified install of K3s.
e
curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server --cluster-init
and friends
c
OK. And for the node running the pod you're trying to exec/get logs from, what did you use for --node-ip/--node-external-ip when installing on that host?
e
I didn't
I let it pick, because there's only one IP anyway
c
ok, but it’s clearly got an old address on it.
What is the output of
kubectl get node <NODENAME> -o yaml
for the node with the pod on it, and what is the output when you run
openssl x509 -noout -text -in /var/lib/rancher/k3s/agent/serving-kubelet.crt
on that node?
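(If your openssl is new enough, 1.1.1 or later, you can also pull just the SANs with
openssl x509 -noout -ext subjectAltName -in /var/lib/rancher/k3s/agent/serving-kubelet.crt
which is a bit easier to eyeball.)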
Is the 38.x.x.x address in the error the old or current address of the node with the pod, or the server?
e
The 38.x.x.x is the new VM's IP; the 45.x.x.x is the old VM's IP.
that cert has the correct IP
looking at that node config I don't see anything obvious
OK, now I am utterly confused. If I point my web browser at the kubelet API on that host and look at the cert I get back, it has the correct IP address. Where is that error coming from?
so, I'm not connected to that node. I'm using kubectl against a different node. Is that message coming from the node itself? Could it possibly have cached something and is confused? (just spitballing here)
yes
that's it
because I just switched my kubeconfig to the host in question and I can get logs now
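for reference, you can double-check which apiserver a context is actually hitting (assuming a single-cluster kubeconfig) with
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'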