# k3s
a
Long shot, but figure I'll ask anyway. We have been running Rancher server on a single-node k3s cluster, using MySQL as the backing datastore. The Amazon EC2 instance died and was replaced, but we didn't think to save the `/var/lib/rancher/k3s/server/node-token` file, believing that it was not needed. Indeed, this page has a note saying "Previously, K3s did not enforce the use of a token when using external SQL datastores." But that seems to have changed somewhere along the way. What I'm wondering now is whether there is any way to bring up a new replacement k3s node in this single-node cluster without having a token.
k3s-server.log
We had believed that all we needed was the MySQL database. Had no idea that the data inside the database might be encrypted with a token.
c
Nope. You need the token. This is loudly plastered over many docs pages and release notes. https://docs.k3s.io/datastore/backup-restore
WARNING
In addition to backing up the datastore itself, you must also back up the server token file at `/var/lib/rancher/k3s/server/token`. You must restore this file, or pass its value into the `--token` option, when restoring from backup. If you do not use the same token value when restoring, the snapshot will be unusable, as the token is used to encrypt confidential data within the datastore itself.
this includes simply reattaching a new node to an existing database or sqlite database file.
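For anyone reading along, here is a minimal Python sketch of keeping a copy of the token file next to a datastore backup, as the warning asks. The token path comes from the quoted docs; the destination file name `k3s-server-token` and the helper `backup_token` are hypothetical names chosen for illustration, and the demo runs against a throwaway file rather than a real K3s install.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Path quoted in the K3s backup/restore warning above.
DEFAULT_TOKEN_PATH = Path("/var/lib/rancher/k3s/server/token")


def backup_token(token_path: Path, backup_dir: Path) -> Path:
    """Copy the server token into backup_dir with owner-only permissions."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / "k3s-server-token"  # hypothetical file name
    shutil.copyfile(token_path, dest)
    os.chmod(dest, 0o600)  # the token is a secret; restrict it to the owner
    return dest


if __name__ == "__main__":
    # Demo against a temporary file so the sketch runs without K3s installed.
    with tempfile.TemporaryDirectory() as tmp:
        fake_token = Path(tmp) / "token"
        fake_token.write_text("example-token-value\n")
        saved = backup_token(fake_token, Path(tmp) / "backup")
        print(saved.read_text().strip())
```

On a real server this would be called with `DEFAULT_TOKEN_PATH` and run as a user allowed to read that file; a secrets manager is likely a better final home than a local directory.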
a
Yes, I am seeing that. The person who set up this cluster is no longer with the company, so I'm trying to step in to handle recovery.
It does appear that there once was a time where the token was not necessary, as hinted at by the link I shared.
And it's likely that the people involved didn't notice the new behavior and updated docs
c
It wasn't that the token wasn't necessary; K3s would just happily use an empty string as the token, which was bad because then everyone ended up with the same token by default. That was AGES ago.
a
Gotcha.
So is all the data in the external MySQL entirely unusable and irretrievable without a token?
Our team had basically been under the (obviously mistaken) impression that as long as we had the MySQL database, we were fine.
c
If secrets-encryption is not enabled you could try deleting the bootstrap keys from the database. You’ll be essentially doing a hard rotation of the cluster CAs. All your existing kubeconfigs will be invalid and you may have to do some additional cleanup.
a
It is not enabled.
To the best of my knowledge.
c
If secrets encryption is enabled then doing this will make all your existing secrets inaccessible and things will be much more broken.
a
(Can I validate that by looking at the data inside MySQL?)
c
not easily no
a
Ok. So if I remove the bootstrap keys from the DB and accept the rotation of the cluster CAs, it would be plausible to bring up a new single k3s node and the Rancher server instance that was running on it?
c
You can try deleting all the rows where the key starts with `/bootstrap/` - it'll create new cluster CAs and then create a new bootstrap key. Do take a new backup before doing that, and make sure k3s isn't running at the time.
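A rough Python sketch of that delete, for illustration only. It assumes the datastore uses a table named `kine` with the key stored in a `name` column; neither is stated in this thread, so check the actual schema before running anything like this. The demo uses an in-memory SQLite table as a stand-in. A real run would go against the MySQL database through a DB-API connection, after taking a backup and with k3s stopped.

```python
import sqlite3

# Assumption: keys live in a table called `kine`, in a column called `name`.
DELETE_BOOTSTRAP_SQL = "DELETE FROM kine WHERE name LIKE '/bootstrap/%'"


def delete_bootstrap_rows(conn) -> int:
    """Delete rows whose key starts with /bootstrap/ and return how many went."""
    cur = conn.cursor()
    cur.execute(DELETE_BOOTSTRAP_SQL)
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    # Stand-in database with made-up rows so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kine (id INTEGER PRIMARY KEY, name TEXT, value BLOB)")
    conn.executemany(
        "INSERT INTO kine (name, value) VALUES (?, ?)",
        [
            ("/bootstrap/abc123", b"encrypted-bootstrap-data"),
            ("/registry/secrets/default/foo", b"secret"),
            ("/registry/configmaps/kube-system/bar", b"configmap"),
        ],
    )
    print(delete_bootstrap_rows(conn), "bootstrap row(s) deleted")
```

Running a matching `SELECT` first to see which rows the pattern hits is a cheap sanity check before deleting.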
you can set the new token in the config, or just make sure to back up the node-token file that contains the automatically generated token after startup
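One way to set the token explicitly, sketched in Python. It assumes the K3s config file at `/etc/rancher/k3s/config.yaml` accepts a `token` key mirroring the `--token` option; treat that as an assumption and confirm it against the docs for your version. The helper `render_token_config` is a hypothetical name, and the sketch only prints the fragment rather than writing to disk.

```python
import secrets


def render_token_config(token: str) -> str:
    """Return a YAML fragment that pins the server token."""
    return f'token: "{token}"\n'


if __name__ == "__main__":
    # Generate a random token (64 hex characters) and print the config line.
    new_token = secrets.token_hex(32)
    print(render_token_config(new_token), end="")
```

Whatever value ends up in the config is the one to store somewhere durable, since the next replacement node will need the same token.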
a
Otherwise I was anticipating that recovery would look something like:
1. Bring up an entirely new k3s and rancher server installation from scratch.
2. Add our EKS cluster previously managed by the old rancher server to the new rancher server.
3. Presumably re-work all the RBAC configurations in all the projects, assuming that all our (github authenticated) users will have different identifiers.
Ok, cool. I'll give that a shot!
Thank you for pointing me in the right direction.
c
I suspect you will likely have to poke around cleaning up things that have references to the old CAs and service account tokens. But you should have most of your data accessible.
it's still there, it's just the CA keys, service account issuer keys, and secrets encryption config that are locked behind the token.
a
Ok. Fingers crossed, hoping for the best! Appreciate the insights!
@creamy-pencil-82913 - It worked! We're back up and running right where we left off, and I was able to put the token in our secrets manager and adjust our configuration so that this won't happen again. Thank you SO much for pointing me in the right direction!