
gifted-stone-19912

08/17/2022, 2:55 PM
Hi all, for longhorn to use ext4 for the volumes, does the Disk/LVM on the Node (/var/lib/longhorn) also have to be ext4 or can this be xfs and still serve ext4 volumes later on?

icy-agency-38675

08/18/2022, 3:13 AM
You can use any filesystem that supports sparse files for the node disk, so XFS works.
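As a rough way to check the sparse-file requirement mentioned above, the following Python sketch creates an empty file of a given apparent size in a directory and compares it with the space actually allocated. The function name `supports_sparse_files` and the 64 MiB probe size are illustrative assumptions, not part of Longhorn; the check relies on `st_blocks`, so it assumes a Unix-like node.

```python
import os
import tempfile


def supports_sparse_files(directory: str, size: int = 64 * 1024 * 1024) -> bool:
    """Heuristic: extend an empty file to `size` bytes without writing data
    and report whether it occupies far less space on disk than that."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, size)           # grow the file; no data is written
        st = os.fstat(fd)
        allocated = st.st_blocks * 512   # st_blocks counts 512-byte units
        return st.st_size == size and allocated < size // 2
    finally:
        os.close(fd)
        os.unlink(path)                  # always remove the probe file
```

Calling `supports_sparse_files("/var/lib/longhorn")` on a node (with write permission to that path) would be expected to return `True` on XFS or ext4; treat the result as a hint rather than a guarantee.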

gifted-stone-19912

08/18/2022, 7:56 AM
Thank you for your answer. So it would be no problem to use XFS for the node disk but then use ext4 for Longhorn volumes in the StorageClass?

icy-agency-38675

08/18/2022, 7:56 AM
XFS for the node disk is OK. For Longhorn volumes, we recommend ext4 because it has been more stable.
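A minimal sketch of what a StorageClass requesting ext4 volumes could look like, built as a Python dict and printed as JSON (which `kubectl apply -f` also accepts). The class name `longhorn-ext4`, the replica count, and the stale-replica timeout are assumed example values; `driver.longhorn.io` and the `fsType` parameter follow the Longhorn example StorageClass, but check them against the documentation for your version.

```python
import json


def longhorn_storage_class(name: str, fs_type: str = "ext4") -> dict:
    """Build a StorageClass manifest whose volumes are formatted with fs_type.
    The node disk filesystem is independent of this setting."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},            # hypothetical name
        "provisioner": "driver.longhorn.io",
        "allowVolumeExpansion": True,
        "reclaimPolicy": "Delete",
        "parameters": {
            "numberOfReplicas": "3",           # assumed example value
            "staleReplicaTimeout": "2880",     # assumed example value
            "fsType": fs_type,                 # filesystem for new volumes
        },
    }


manifest = longhorn_storage_class("longhorn-ext4")
print(json.dumps(manifest, indent=2))
```

Only volumes provisioned through this class get the chosen filesystem; volumes that already exist keep the filesystem they were created with.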

gifted-stone-19912

08/18/2022, 7:58 AM
That's correct. We are currently using XFS for our volumes but experience a lot of filesystem corruption, especially when short network outages happen. Weave restarted during an automatic RKE update, which crashed all our PVCs, and ~30% of them were corrupt afterwards.
👍 1

icy-agency-38675

08/18/2022, 7:59 AM
Got it. Using ext4 for LH volumes should be better, according to other users’ feedback and our own experience.

gifted-stone-19912

08/18/2022, 10:12 AM
Is it possible to change the StorageClass from XFS to ext4 in production? Will it only affect volumes that are created after the StorageClass change?

icy-agency-38675

08/18/2022, 10:32 AM
No. It only affects volumes that are created after the StorageClass change.
cc @famous-shampoo-18483 @famous-journalist-11332 @salmon-doctor-9726 XFS filesystem corruption issues caused by network outages.
👍 2

famous-shampoo-18483

08/19/2022, 9:47 AM
BTW, if you want to switch to ext4, you may need to stop the I/O for the corresponding volumes and then try the migration: https://github.com/longhorn/longhorn/blob/master/examples/data_migration.yaml
👍 2

gifted-stone-19912

08/25/2022, 7:20 AM
Thank you all for this helpful information. We will try the migration from XFS to EXT4 very soon with our customer

icy-agency-38675

09/16/2022, 8:13 AM
@gifted-stone-19912 We are investigating the XFS corruption issue and have some questions. In your case, which Longhorn version are you running? Which workloads are using the corrupted volumes?

gifted-stone-19912

09/19/2022, 6:30 AM
Hi @icy-agency-38675, we mostly have v1.1.1 on the line and are planning to upgrade. We experienced the most volume corruption with high-I/O workloads like MySQL or MongoDB databases.
We just had an outage on the weekend, and the only PVCs that didn't come back up were all MongoDB PVCs. I could repair all of them without data loss. Some of them were even running on 1.2.4, and it seemed that 1.2.4 is more stable overall, even with XFS.

icy-agency-38675

09/20/2022, 3:57 AM
@gifted-stone-19912 Thanks for the information. We just identified the root cause of the corruption. FYI: https://github.com/longhorn/longhorn/issues/3597#issuecomment-1251802416

gifted-stone-19912

09/20/2022, 7:54 AM
@icy-agency-38675 thank you for this information, sounds absolutely awesome. Just so that I understand correctly: for ext4 < Longhorn 1.3.0 there is no issue currently? Also, when this fix is applied, does this automatically work for all the existing volumes or do we need to create new ones on the new version?

icy-agency-38675

09/20/2022, 8:10 AM
"for ext4 < Longhorn 1.3.0 there is no issue currently"
Yes. According to our tests, ext4 in Longhorn < 1.3.0 should be fine.
"does this automatically work for all the existing volumes"
Yes, it will work automatically.

gifted-stone-19912

09/27/2022, 7:23 PM
Is it already known when 1.2.6 will be released? Does 1.2.6 contain the fix?

famous-journalist-11332

09/27/2022, 8:36 PM
We don’t know the date for 1.2.6 yet, but yes, 1.2.6 will contain the fix.