
colossal-action-96650

12/27/2022, 3:44 PM
Hello, Background: 13-node bare-metal Raspberry Pi 4 cluster, 1 master, 12 workers. Running K3s v1.24.8+k3s1 for compatibility with the Rancher management UI. I'm doing this to learn Kubernetes. I'm trying to add 2 more master nodes. After working through various goofs of my own (failing to match the K3s version and to launch with the same flags as the original master), I believe I got all of that straightened out. The respective .service files indicate as much, anyway. However, when I attempt to add these 2 nodes as masters, I get the following error in the logs:
2477 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, [invalid bearer token, square/go-jose: error in cryptographic primitive]]"
And they never show up in the output of
kubectl get nodes
I've Googled that error and found many bug reports, posts, etc., but no definitive answer or solution. I've wiped out these new nodes and started fresh, but I still get the same error. Any advice would be appreciated.
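In case it's useful, I'm pulling that error out of journald on the new nodes, roughly like this (the unit name assumes the standard K3s server install):
# show the k3s service log and filter for the auth failure
sudo journalctl -u k3s --no-pager | grep -i "bearer token"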

creamy-pencil-82913

12/27/2022, 8:11 PM
That suggests that the other nodes have invalid certs, probably not shared with the first one. Are you using embedded etcd, or an external DB?
I suspect you may have started them up as standalone servers at some point, and they are running pods with secrets from that configuration instead of from the desired cluster.
I would (one at a time) run the uninstall script, delete them from the cluster, then reinstall and rejoin.
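Roughly this, one node at a time (the node name is a placeholder):
# on the node being reset (installed as a server)
/usr/local/bin/k3s-uninstall.sh
# from a node that still works
kubectl delete node <node-name>
# then reinstall with the same K3s version and token, and rejoin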

colossal-action-96650

12/27/2022, 8:37 PM
Thanks for the reply! I am [attempting] to run embedded etcd. I have uninstalled using the script and reinstalled numerous times to no avail. I have also verified the absence of
/var/lib/rancher/k3s
which is the only place I'm aware of that contains the certificates.
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.24.8+k3s1 INSTALL_K3S_EXEC="--disable traefik --disable servicelb --kube-controller-manager-arg bind-address=0.0.0.0 --kube-proxy-arg metrics-bind-address=0.0.0.0 --kube-scheduler-arg bind-address=0.0.0.0 --etcd-expose-metrics true --kubelet-arg containerd=/run/k3s/containerd/containerd.sock" K3S_TOKEN=[token] sh -s - server --server https://10.3.14.2:6443
is the latest version of the command I used to install K3s. I see
server
and
--server
in there; which would I need to eliminate? I was just trying to replicate the .service file flags as closely as I could.
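(In case it matters, the [token] value I'm passing as K3S_TOKEN comes from the first master; I believe this is the standard location:)
sudo cat /var/lib/rancher/k3s/server/node-token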

creamy-pencil-82913

12/27/2022, 8:52 PM
Something running somewhere in the cluster has a serviceaccount token that was signed by a different key. That's all I can tell from that error. The most common cause is one of the control-plane nodes having a serviceaccount issuer key that is out of sync with the others.
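If you want to verify that, you could compare the serviceaccount signing key across your server nodes; on K3s I believe it lives at the path below, and the hashes should all match:
# run on each server node; differing hashes would confirm an out-of-sync issuer key
sudo sha256sum /var/lib/rancher/k3s/server/tls/service.key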
c

colossal-action-96650

12/27/2022, 9:05 PM
Right now there is just the one etcd/control-plane node. In my Googling, about the only suggestion I found was to delete all service accounts and restart K3s, which I did, but again, no luck. If my install command has a flaw, that could explain why that didn't work. Is the command above good, or should I remove one of the
server
flags? If so, which one?
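For reference, "delete all service accounts and restart" boils down to something like this (a sketch; on 1.24 the signed tokens live in service-account-token secrets, so clearing those forces them to be re-issued):
# delete every serviceaccount token secret, then restart K3s
kubectl get secrets --all-namespaces --field-selector type=kubernetes.io/service-account-token \
  -o jsonpath='{range .items[*]}{.metadata.namespace} {.metadata.name}{"\n"}{end}' \
  | while read -r ns name; do kubectl -n "$ns" delete secret "$name"; done
sudo systemctl restart k3s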

creamy-pencil-82913

12/28/2022, 12:46 AM
There are not two server flags. You have
server
as in
k3s server
which is a command… and then you have
--server=https://x:6443
which is a flag that tells it where the existing server is.
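In other words, the two shapes look like this (a sketch with placeholder values; note that with embedded etcd the first server has to have been started with --cluster-init for other servers to join):
# first server: bootstraps the cluster
curl -sfL https://get.k3s.io | sh -s - server --cluster-init
# additional servers: the server command plus the --server flag pointing at the first one
curl -sfL https://get.k3s.io | K3S_TOKEN=<token> sh -s - server --server https://10.3.14.2:6443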

colossal-action-96650

12/28/2022, 3:25 PM
OK, so the command is good? Any advice on how to proceed? Again, I'm fairly new and trying to learn. Would rotating the certificates maybe work?
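(From the K3s docs, the manual rotation procedure looks to be roughly this, assuming the release is recent enough to have the certificate subcommand:)
# on a server node; K3s regenerates the self-signed certs on the next start
sudo systemctl stop k3s
sudo k3s certificate rotate
sudo systemctl start k3s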