sticky-truck-78998

05/12/2022, 5:42 AM
Has this burned anyone yet? We just found this setting. I hope it solves the issue for us. We lost a node (1 of 3) today, and manually fixing stuck pods sucks. Luckily our admin was around when it happened. Everything was restored 3 hours later.
I feel like this should be highlighted a bit more. We have had no issues with Longhorn unless a node had failed previously. We started testing LH a year ago and it has been great. This issue (node failure) has caught us off guard a few times before, but we thought it was related to something else. I'm curious which version this setting was introduced in?

famous-journalist-11332

05/14/2022, 2:16 AM
Does turning on the setting help your case?

sticky-truck-78998

05/14/2022, 2:21 AM
We just finished testing today in a VM Longhorn environment with 80 pods. It seems to have fixed the problem. It takes a minimum of 5 minutes to restore. We're working through how to handle pods that don't allow replica sets. I guess we live with the downtime in the event of a node failure. We don't expect that to happen much.

famous-journalist-11332

05/14/2022, 2:22 AM
Thank you for the feedback
"Working through how to handle pods that don't allow replica sets"
Yeah, Longhorn cannot auto-delete these pods, because if Longhorn did, nothing would recreate them.
Why can't you put those pods in a StatefulSet or Deployment?
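
A minimal sketch of the reasoning above, in Python: a pod is only safe to delete automatically if some controller will recreate it. The function name `is_recreated_by_controller` and the sample pod dicts are hypothetical; the only thing taken from Kubernetes is the `metadata.ownerReferences` field, where a Deployment's pods show up as owned by a ReplicaSet. This is an illustration, not Longhorn's actual implementation.

```python
# Owner kinds that recreate a pod after it is deleted.
# Limited to the two cases discussed in this thread: StatefulSet pods, and
# Deployment pods (which are owned by a ReplicaSet, not the Deployment itself).
RECREATING_OWNER_KINDS = {"StatefulSet", "ReplicaSet"}


def is_recreated_by_controller(pod: dict) -> bool:
    """Return True if one of the pod's owners would recreate it after deletion."""
    owners = pod.get("metadata", {}).get("ownerReferences") or []
    return any(owner.get("kind") in RECREATING_OWNER_KINDS for owner in owners)


# Hypothetical pod manifests, trimmed to the fields used here.
statefulset_pod = {
    "metadata": {
        "name": "db-0",
        "ownerReferences": [{"kind": "StatefulSet", "name": "db"}],
    }
}
bare_pod = {"metadata": {"name": "one-off"}}

for pod in (statefulset_pod, bare_pod):
    print(pod["metadata"]["name"], is_recreated_by_controller(pod))
```

A bare pod has no owner, so deleting it would leave nothing behind to bring it back, which is why it has to be handled by hand.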

white-battery-15789

05/14/2022, 2:35 AM
Hello, our problem is more related to ReadWriteMany: some pods need to share a volume, and we have to set affinity on them so that both run on the same node.
Since Longhorn doesn't support it, we have to use RWO.
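
A rough sketch of the kind of affinity rule described above, built as a Python dict and printed as JSON. The helper name `same_node_affinity` and the `app` label value are hypothetical; the field names (`podAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `topologyKey`) are standard Kubernetes pod spec fields. This is only an assumption about how such a setup might look, not the poster's actual config.

```python
import json


def same_node_affinity(app_label: str) -> dict:
    """Build a pod `affinity` stanza that co-locates pods carrying the same `app` label."""
    return {
        "podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # Each node is its own topology domain, so matching pods
                    # are scheduled onto the same node.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


# Hypothetical label value; the result goes under `spec.affinity` of each pod.
print(json.dumps(same_node_affinity("shared-volume-app"), indent=2))
```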

famous-journalist-11332

05/14/2022, 2:36 AM
I see

white-battery-15789

05/14/2022, 2:36 AM
But the setting helped us. We're still testing with more containers to make sure.
Back when we had that problem, though, enabling the setting alone didn't solve it; I had to enable it and force Kubernetes to delete the pod.
Only after deleting all the pods did Longhorn recover and start attaching new volumes.
I also had some timeouts when attaching, so I had to do some attachments manually.

famous-journalist-11332

05/14/2022, 2:38 AM
You might want to take a look at the experimental ReadWriteMany (RWX) feature: https://longhorn.io/docs/1.2.4/advanced-resources/rwx-workloads/
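
For reference, a rough sketch of what requesting an RWX volume could look like, built as a Python dict and printed as JSON (which `kubectl apply -f -` accepts). The claim name `shared-data`, the size, and the `longhorn` storage class name are assumptions for illustration; check the linked docs for the prerequisites on your version.

```python
import json


def rwx_pvc(name: str, size: str, storage_class: str = "longhorn") -> dict:
    """Build a PersistentVolumeClaim manifest that requests ReadWriteMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on different nodes mount the same volume.
            "accessModes": ["ReadWriteMany"],
            # Assumed storage class name; adjust to match your cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim name and size.
print(json.dumps(rwx_pvc("shared-data", "1Gi"), indent=2))
```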

white-battery-15789

05/14/2022, 2:38 AM
I had read about that; I'm planning to try it out.
Do you guys have a recommended set of settings?

famous-journalist-11332

05/25/2022, 10:53 PM
Do you mean: is there any way for Longhorn to delete pods that are not part of any Deployment or StatefulSet?