# cluster-api
w
In single-node clusters, setting `maxSurge: 0` means the current node must be deleted before a new one can be provisioned. If `spec.agentConfig.additionalUserData.data` is even slightly different in formatting between the `rke2controlplane` and the rendered `rke2config` (e.g., due to whitespace or quote style), it triggers a change in the machine spec. Since the cluster sees it as an outdated spec and no extra node can be created (because of `maxSurge: 0`), it deletes the only node, causing the cluster to go down.

Recommendation: Temporarily set `maxSurge: 1` and `maxUnavailable: 0` to allow the new node to come up before the old one is removed. That will prevent this destructive behavior and let you validate if the issue is caused by spec drift.
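To check whether the drift really is formatting-only, a rough sketch like the one below might help. It assumes you have already copied the two `additionalUserData.data` values out of the objects by hand; the sample strings and the helper names (`whitespace_only_drift`, `visible_diff`) are made up for illustration and are not part of any Cluster API tooling. Stripping all whitespace is a crude comparison and does not catch quote-style differences, so treat a `True` result as a hint, not proof.

```python
import difflib

# Hypothetical sample values, standing in for the data copied from the
# rke2controlplane and from the rendered rke2config.
controlplane_data = "runcmd:\n  - echo hello\n"
rendered_data = "runcmd:\n- echo hello\n"


def whitespace_only_drift(a: str, b: str) -> bool:
    """Return True when a and b differ as raw strings but are equal
    once every whitespace character is removed (a crude heuristic)."""
    if a == b:
        return False  # no drift at all

    def squeeze(s: str) -> str:
        return "".join(s.split())

    return squeeze(a) == squeeze(b)


def visible_diff(a: str, b: str) -> list[str]:
    """Return only the changed lines, with each line passed through
    repr() so that indentation and trailing spaces are visible."""
    a_lines = [repr(line) for line in a.splitlines()]
    b_lines = [repr(line) for line in b.splitlines()]
    return [d for d in difflib.ndiff(a_lines, b_lines) if not d.startswith("  ")]


if __name__ == "__main__":
    print("whitespace-only drift:", whitespace_only_drift(controlplane_data, rendered_data))
    for line in visible_diff(controlplane_data, rendered_data):
        print(line)
```

If this reports whitespace-only drift, that is consistent with the spec-drift explanation above; if the strings differ in content, something else is changing the machine spec.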
m
Thank you, I will try that. I paused machine reconciliation immediately after the node came up to prevent deletion, and I could see slight whitespace differences between the rendered rke2config and the rke2controlplane. What I don't understand is why the differences appear only the first time the config is rendered, for the initial node. Maybe a different code path?