I'm trying to do a graceful shutdown of my bare-metal nodes. I run the rke2 killall script on each node, then shut it off. When I turn them back on, the cluster comes back up, but the rook ceph pods are very unhappy. I found a GitHub issue, but it hasn’t been updated in a while. Any help?
09/22/2022, 8:01 PM
killall is about the worst way to do that. It doesn’t let anything shut down cleanly; it just `kill -9`s all the pods. You should cordon/drain the node first, THEN killall if you absolutely must. Although just doing a normal shutdown and letting init send SIGTERM before killing things would be better.
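Roughly what that cordon/drain-then-shutdown sequence could look like, as a sketch rather than a tested procedure. It assumes `kubectl` is already configured with admin access and that the function runs on the node being powered off. The function names, the `DRY_RUN` switch, and the 300s timeout are made up for illustration. `kubectl drain` cordons the node on its own, so the explicit cordon is only there to make the order obvious.

```shell
#!/usr/bin/env bash
# Sketch only: cordon and drain a node, then do a normal poweroff
# instead of running the killall script.
# DRY_RUN=1 (hypothetical switch) prints the commands instead of running them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

graceful_shutdown() {
  node="$1"
  # Stop new pods from being scheduled on this node.
  run kubectl cordon "$node" || return 1
  # Evict running pods. DaemonSet pods are left alone; emptyDir data is discarded.
  # The 300s timeout is an arbitrary choice for this sketch.
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s || return 1
  # Normal shutdown: init sends SIGTERM to whatever is left before killing it.
  run sudo systemctl poweroff
}
```

Something like `DRY_RUN=1 graceful_shutdown <node-name>` shows what would be run without touching anything. The drain can sit and wait if a PodDisruptionBudget blocks an eviction, so going one node at a time is the safer bet. After the node is back, `kubectl uncordon <node-name>` makes it schedulable again.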
09/23/2022, 11:49 AM
In my experience a rough shutdown of ceph requires a bit of patience for it to restart and clean itself up.