# k3s

rich-cartoon-70161

01/03/2023, 2:03 PM
For some reason when I stop 2 of the 3, the last control-plane node also becomes unhappy because of the health checks to the others

sticky-summer-13450

01/03/2023, 2:23 PM
Once your HA cluster no longer has quorum (i.e. a majority of the control-plane nodes is no longer available) it cannot continue.
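
The quorum arithmetic behind this can be sketched in a small illustrative snippet (not part of k3s or etcd; the `quorum` function name is made up here):

```shell
# etcd needs a strict majority of members up: quorum = floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # prints 2 -> a 3-node cluster tolerates only 1 node down
quorum 1   # prints 1 -> a single node is always its own majority
```

So with 2 of 3 servers stopped, the one remaining member (1 of a required 2) can never reach quorum on its own.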

rich-cartoon-70161

01/03/2023, 2:24 PM
Yeah, the question is… can I somehow reconfigure it so that it only needs 1 control-plane node?

sticky-summer-13450

01/03/2023, 2:24 PM
I don't know whether you can remove the two nodes from the cluster, but it feels like you're in a catch-22 situation.

rich-cartoon-70161

01/03/2023, 2:50 PM
Did you read my issue in the post before? Yeah… it pretty much feels like it. And I thought downgrading to a single control-plane node would make things easier… and then adding server nodes again later.

sticky-summer-13450

01/03/2023, 4:07 PM
I had skimmed your issue previously, but it wasn't an issue I'd noticed.
I've tried looking for FAQs about not being able to scale down from a 3-node HA cluster to a single node, but I haven't found anything yet. Maybe it was said in Slack and has disappeared.

creamy-pencil-82913

01/03/2023, 6:37 PM
starting a node with --cluster-reset will make it the only member of the etcd cluster. At that point you could delete and rejoin the other nodes as agents.

rich-cartoon-70161

01/03/2023, 6:40 PM
So I would stop all three server instances and then restart one with --cluster-reset, and it’ll still have the data but as a single control-plane node? That sounds good and might let me solve the issue I posted as a GitHub issue above :D
Would I have to recreate all agents again, or would they automatically rejoin if the token stays the same?
I’ll try this out tomorrow in a test cluster :) but thanks, that’s a good lead

creamy-pencil-82913

01/03/2023, 6:47 PM
stop the k3s service on all three nodes, run
k3s server --cluster-reset
on one of them, then start the service again.
then reinstall the other servers as agents
and yes, this keeps all the datastore data
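
The steps above can be sketched as a script. This is a hedged outline, not a verified runbook: it assumes a systemd-based install from the standard k3s install script (service name `k3s`, uninstall script at `/usr/local/bin/k3s-uninstall.sh`), and `<server-ip>` / `<token>` are placeholders you must fill in yourself:

```shell
# On ALL THREE server nodes: stop the k3s service so etcd is quiet.
sudo systemctl stop k3s

# On the ONE node you keep as the single server:
# --cluster-reset makes this node the sole member of the etcd cluster,
# keeping the existing datastore. k3s exits when the reset is done.
sudo k3s server --cluster-reset

# Then bring the service back up on that node.
sudo systemctl start k3s

# On EACH of the other two former servers: remove the server install
# and rejoin the cluster as an agent using the usual join settings.
sudo /usr/local/bin/k3s-uninstall.sh
curl -sfL https://get.k3s.io | \
  K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -
```

These commands only make sense on actual cluster nodes, so treat them as a checklist to adapt rather than something to paste in verbatim.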

sticky-summer-13450

01/03/2023, 6:49 PM
Wow - that really is magic.

creamy-pencil-82913

01/03/2023, 6:58 PM
That flag is documented in passing at https://docs.k3s.io/backup-restore#options

rich-cartoon-70161

01/04/2023, 1:38 PM
This seems to be a feasible way out of my checkmate situation 🙂 downscaling the control plane to 1 server, changing the important config flags, and then adding more servers again seems to work fine.
@creamy-pencil-82913 thanks again for the hint with the --cluster-reset flag - that was the missing link 😄 now I can fix all my k3s clusters properly. (at least bring them to 1.23 😄 let’s see what the future brings)