# k3s
c
sounds like the sqlite file got corrupted perhaps? Did you run out of disk space?
mostly just sounds like you’re missing some rows from the database
b
I don't think so, but I might look around
c
are you using sqlite or an external db?
b
sqlite
c
yeah… were there any interesting messages at the start of the log?
both of those indicate that critical keys are missing from the datastore.
which probably means the sqlite file was corrupted when the node crashed
b
server1:/var/lib/rancher/k3s # /usr/local/bin/k3s server --flannel-backend none --disable-network-policy --token K106e0d23ed626d7fa3d9113a8f50d806f584c63e3a52325122bfc6f2d2d173abb4::server:7187c735c7335ea1ac123899986909eb 
INFO[0000] Starting k3s v1.27.9+k3s1 (2c249a39)         
INFO[0000] Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s 
INFO[0000] Configuring database table schema and indexes, this may take a moment... 
INFO[0000] Database tables and indexes are up to date   
INFO[0000] Kine available at unix://kine.sock
ERRO[0000] TTL event watch failed to get start revision 
INFO[0000] Bootstrap key already locked - waiting for data to be populated by another server 
ERRO[0000] TTL event watch failed to get start revision
I was able to make a full backup copy of the /data directory to the same disk when I started looking into it, so I don't think I am out of space
But corrupted sounds plausible
Any k3s specific advice, or should I just look at trying to fix the sqlite database?
c
yeah you might grab the sqlite cli tools and poke at it :/
in general it is pretty robust, I’ve only seen it get trashed a few times and usually there is some underlying storage issue, like bad sd cards on a Raspberry Pi or something.
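If you don't have the sqlite CLI handy, a rough sketch of the same first check from Python's standard library is below. It opens the file read-only and runs PRAGMA integrity_check, which returns a single "ok" row when sqlite finds nothing wrong. The function name is made up for illustration, and the path in the usage note is an assumption about the default k3s layout, so adjust it to wherever your state.db lives.

```python
import sqlite3


def integrity_report(path):
    """Return the messages from PRAGMA integrity_check for the database at path.

    A healthy database yields exactly ["ok"]; anything else describes damage.
    """
    # mode=ro opens the file read-only, so the check cannot modify
    # an already damaged database.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

With k3s stopped, something like integrity_report("/var/lib/rancher/k3s/server/db/state.db") would be the call (path assumed, not confirmed in this thread). Best to run it against a copy of the file rather than the original.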
b
I ran the .recover command and dumped that back, but now I'm getting a different error.
FATA[0000] starting kubernetes: preparing server: creating storage endpoint: starting kine backend: rpc error: code = InvalidArgument desc = etcdserver: duplicate key given in txn request
c
sounds like you now have some duplicate rows. You might check for entries with the same (id, name) or (name, prev_revision)?
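A minimal sketch of that duplicate check, assuming the rows live in a table called kine with id, name and prev_revision columns (the table name is my assumption here, so confirm it against the schema first). The helper name is hypothetical; it groups on whichever columns you pass in and reports any combination that appears more than once.

```python
import sqlite3


def find_duplicates(path, columns, table="kine"):
    """Return (col values..., count) tuples for column combinations seen more than once.

    The table and column names are interpolated into the SQL text, so only
    pass trusted identifiers. The default table name "kine" is an assumption.
    """
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, COUNT(*) FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    # Read-only open so the check cannot alter the recovered database.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

Calling find_duplicates(path, ["name", "prev_revision"]) and find_duplicates(path, ["id", "name"]) covers both combinations mentioned above. An empty list means no duplicates for that pair.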
b
What are the wal and shm db files for? Should I be looking at them?
c
sqlite internal stuff
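Specifically, the -wal file is sqlite's write-ahead log and the -shm file is the shared-memory index for it. Recent writes can sit in the -wal file before they are merged into the main file, so a copy of the main db file alone may be missing data. If you want a consistent copy to experiment on, here is a rough sketch using the backup API in Python's sqlite3 module, which reads through the WAL for you. The function name and both paths are placeholders, and I'd stop k3s before running it.

```python
import sqlite3


def backup_sqlite(src_path, dst_path):
    """Copy the database at src_path to dst_path using sqlite's online backup API.

    The copy includes committed data still held in the -wal file.
    """
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # Connection.backup copies every page of src into dst.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```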
b
got it
Thanks for the help, I think I'm just going to have to roll back to my backups from last month