# longhorn-storage
b
I'm also interested in an answer.
n
The approach I have in mind is the one you described, and it is similar to the one in the document "Updating the Node OS or Container Runtime".
The first thing that can speed up rebuilding is to adjust the Replica Replenishment Wait Interval (doc).
In addition, 1.4.0 includes a performance improvement for rebuilding (#4783), and there is a planned improvement (#5002) to replace the original HTTP method with another data transfer protocol to improve rebuilding performance.
What version number are you currently using?
n
We are experiencing the same problem with Longhorn 1.5.1. In our case, we do not exceed the default Replica Replenishment Wait Interval (with the node in the drain state), so increasing it would not help.
a
We are on 1.3.2. I actually reduced the replenishment wait interval so the new replicas would start building sooner, since I was waiting for that to finish before draining and rebooting the next node. Many of our nodes are pretty full and Unschedulable. If I get the node back up within the Replenishment Wait Interval, will it still rebuild the replica on that same node even if it's Unschedulable?
With a little testing, it looks like Longhorn will not rebuild the failed replica on the same node when it comes back up if the node is Unschedulable. At least half of our nodes are Unschedulable, so that's unfortunate.
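To make the cases discussed in this thread concrete, here is a rough Python sketch of the decision as we understand it from the messages above. It is a toy model only, not Longhorn's actual scheduling code: the names `NodeOutage` and `replica_outcome` are hypothetical, the 600-second interval is just an example value (check your own setting), and the Unschedulable branch reflects only the informal testing reported on 1.3.2.

```python
from dataclasses import dataclass


@dataclass
class NodeOutage:
    """Hypothetical description of one drain/reboot cycle."""
    downtime_s: int    # seconds the node was unavailable
    schedulable: bool  # is the node schedulable once it is back?


def replica_outcome(outage: NodeOutage, wait_interval_s: int) -> str:
    """Toy model of what happens to the failed replica.

    Assumption: a replica on a returning node is only reused when the
    node is back within the wait interval AND is schedulable.
    """
    if outage.downtime_s > wait_interval_s:
        # The interval expired before the node returned.
        return "new replica rebuilt elsewhere"
    if not outage.schedulable:
        # Observed in testing on 1.3.2: no rebuild on the same node.
        return "not reused on the same node"
    return "existing replica reused"


# Example value only; verify the interval configured in your cluster.
EXAMPLE_INTERVAL_S = 600

quick_reboot = NodeOutage(downtime_s=300, schedulable=True)
full_node = NodeOutage(downtime_s=300, schedulable=False)
slow_reboot = NodeOutage(downtime_s=900, schedulable=True)

for case in (quick_reboot, full_node, slow_reboot):
    print(case, "->", replica_outcome(case, EXAMPLE_INTERVAL_S))
```

In this model, raising the interval only changes the outcome for the slow reboot. It does nothing for a node that returns in time but is Unschedulable, which matches the two situations described above.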