Big storage - how to go about this in a Harvester/Longhorn/Rancher stack?
# harvester
w
Big storage - how to go about this in a Harvester/Longhorn/Rancher stack? I have a large-capacity storage problem: a > 5 TiB volume in use that has failed a few times, and I've been reading that volumes over 1 TiB (even half that size) are not ideal with Longhorn. This is run from a VM provisioned in Harvester with a large replicated filesystem attached. Probably 3-4 times a year it fails, usually due to power cuts or reboots - but when it happens it takes an age to rebuild and an XFS repair is usually needed.

I've also worked out we can use the MinIO operator and a MinIO tenant in a cluster to combine a large number of smaller volumes into one large pool. We are running k8s clusters on our Harvester stack. The question, however, is what to do about replication. MinIO handles this itself, so I'm thinking of either customising the stack and using DirectPV (though I'd rather not, and I'm concerned it adds something extra that has to be reinstalled with every Harvester upgrade), or provisioning a cluster with nodes 1:1 per host and attaching storage that way.

Alternatively I could use single-replica Longhorn volumes, but in that case I need to ensure Rancher/Harvester know what to do with those volumes. With single-replica volumes live migration won't be possible, and upgrades to the stack will also fail unless all the pods are shut down, since the workloads can't be safely migrated. Is there a way to tell Rancher and Harvester that this particular group of VMs can be disrupted - i.e. allow a given number of them to be shut down while maintenance is happening? As long as the replicas are numerous enough and not on the same node, in theory MinIO will deal with the fallout.

Ultimately we have some 2 TiB servers to back up and we need volumes to contain that data - we've about 100 TiB capacity. I'd appreciate any thoughts on this and would love to hear how others have implemented this on a Harvester/Rancher/Longhorn stack.
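For the single-replica option, a minimal sketch of what the StorageClass could look like, assuming standard Longhorn parameters - the class name and values here are illustrative, not the poster's config:

```yaml
# Minimal sketch, not the poster's config: a single-replica Longhorn
# StorageClass so MinIO's own erasure coding provides the redundancy.
# The class name and parameter values are illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-single-replica   # hypothetical name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  numberOfReplicas: "1"           # no Longhorn replication; MinIO handles redundancy
  staleReplicaTimeout: "30"
  dataLocality: "best-effort"     # try to keep the single replica on the workload's node
```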
t
Since you had issues, I would offload the storage to an external iscsi/nfs array. Let the array manage the redundancy. You will need v1.5.0 or higher.
FYI Pure Storage arrays are designed to handle power outages. There isn’t even a power button. 😄
p
yeah also hacky but so far ZFS+harvester has been working fine for us. I wouldn't use longhorn for like, "storage storage"
t
over the years I have learned to stay away from the hacky stuff.
how many servers are you running?
w
Well we can’t do that - we do have an old iSCSI array, but our storage is on the nodes, and 1.5.0 isn’t stable yet, I believe. This question is really about how to leverage MinIO, which can use many volumes to coalesce into one huge-capacity pool; if it can be configured to work with what’s available in a Rancher-style stack it may be a winner for me. Also, since the machines have storage in-node, it’s fast. We’re running 5 high-performance Harvester nodes, plus a separate HA Rancher cluster to manage them.
p
ehh not many and we're just testing out ZFS on like ~4. ymmv / do your own due diligence obvz 😉
t
Longhorn is great for taking advantage of the storage that is on the servers. But for replication it needs a TON of bandwidth. And as you have seen it does not handle power outages well. I have never seen minio used for volumes. I know it is an s3 service.
p
t
oh wow. that is cool.
w
bandwidth - we've got an extra dedicated 10G link used by longhorn - the HA it provides is desirable - but obviously if minio is handling that itself that bandwidth won't be used (using single replicas)
t
How are you connecting harvester to the minio multi-drive? which CSI?
w
I played with the minio operator before and created a tenant with several TiB of storage, and it worked nicely, but it was wasteful in resources since there was far too much replication
You just choose a storage class and it's dynamically provisioned
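A rough sketch of how a MinIO Tenant pool points at a storage class so the operator provisions many smaller PVCs - names, counts and sizes are hypothetical, and a real Tenant also needs credentials/configuration omitted here:

```yaml
# Rough sketch only, not the tenant described above: a MinIO Tenant pool that
# asks the operator to provision many smaller PVCs from a chosen StorageClass.
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: backup-tenant            # hypothetical
  namespace: minio-tenant        # hypothetical
spec:
  pools:
    - name: pool-0
      servers: 4                 # MinIO server pods in this pool
      volumesPerServer: 4        # 4 x 4 = 16 PVCs in total
      volumeClaimTemplate:
        metadata:
          name: data
        spec:
          accessModes:
            - ReadWriteOnce
          storageClassName: longhorn-single-replica   # e.g. the single-replica class sketched earlier
          resources:
            requests:
              storage: 1Ti       # many ~1 TiB volumes instead of one huge one
```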
t
so you are installing the minio operator on harvester itself.
w
no - in a cluster hosted in vms provisioned by harvester using rancher (rke2)
I do have a single-node minio VM running as a stopgap for S3 services, but when I last ran this I did it in-cluster
t
OH… I thought you were talking about the Harvester level. AVOID using longhorn on the vms. longhorn on longhorn is SLOW.
w
no, I'm not using longhorn-in-longhorn
I exposed the storage class to the cluster so it's dynamically provisioned
t
good. slicing and dicing storage 3x3 is not performant. Ah. I still like iscsi/nfs arrays for both Harvester AND the vms as an additional service. We have an array that can handle iscsi/nfs and S3. 😄
p
multi-node+drive minio basically is longhorn, so you still wind up with longhorn-in-longhorn 🦜
w
my use case is a bit different - the storage is distributed to the VMs - each machine has 4 TiB NVMe, 4 TiB SSD and 20 TiB HDD - 5 nodes, all identical, 2x 10Gb NICs (1 for management, 1 for storage) and a separate 1Gb NIC for ingress
it's not - in the cluster you'd tell it which storage class to use and it's provisioned by harvester using that class - however, since minio handles replication as well, you get a similar double-replication problem - though longhorn could be run without replication, assuming minio handles it itself
t
personally, I like bonding the 2x 10Gb for mgmt and storage into a single 20Gb bond.
p
minio says
MinIO does not distinguish drive types and does not benefit from mixed storage types. Each pool must use the same type (NVMe, SSD)
For example, deploy a pool consisting of only NVMe drives. If you deploy some drives as SSD or HDD, MinIO treats those drives identically to the NVMe drives. This can result in performance issues, as some drives have differing or worse read/write characteristics and cannot respond at the same rate as the NVMe drives.
solution is dump longhorn, use bcachefs, then minio 😉
w
that's fine - we can set up separate tenants with separate storage profiles if we want fast or slow pools
bcachefs? do you mean DirectPV?
I guess the operator could be added to the harvester cluster and DirectPV on the hosts - but then that makes upgrading harvester tricky, doesn't it
p
no, bcachefs is a sorta new/controversial fs, but one feature is > A feature request we've had is configurationless tiering, smart tiering of member devices in a filesystem based on performance. This feature will allow easy and simple tiering of devices within a filesystem based on the performance of the device. The effect of this is that it will allow data that is commonly accessed, hot data, stored on the faster drives while data that is not used as often, cold data, will be stored on slower drives. This will increase filesystem performance greatly.
good for mixed storage
w
I'm not fussed about that to be honest - you can also do something similar in LVM with an NVMe cache to speed up HDDs; we've found that can work well for web servers.
What I need here is to get large-capacity volumes working reliably, in-cluster or virtually, to be used as backup targets
p
yeah there's csi-driver-lvm too
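For reference, a hedged sketch of what a csi-driver-lvm StorageClass might look like, assuming metal-stack's csi-driver-lvm; the provisioner string and the `type` parameter are taken from that project's examples and should be verified against the installed chart version:

```yaml
# Hedged sketch, assuming metal-stack's csi-driver-lvm: a StorageClass that
# carves node-local logical volumes out of the host's configured devices.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-driver-lvm-linear    # hypothetical name
provisioner: lvm.csi.metal-stack.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # bind on the node where the pod is scheduled
allowVolumeExpansion: true
parameters:
  type: linear                   # the project also documents "striped" and "mirror"
```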
w
trouble is I think this would need to be installed on the harvester end, wouldn't it - and you'd have to reinstall it after every harvester upgrade since the complete image is replaced.
t
as of 1.5.0 harvester supports 3rd party CSIs for booting. installing as a CSI means it is not node dependent and will survive upgrades. 😄
w
nice - this might change things around a bit then! Do we know when 1.5.0 is due to become stable?
t
1.5.1 just came out - lots of fixes and improved CSI handling. I talked to my team at Pure Storage that is certifying it, and SUSE said to wait for 1.5.1.
p
btw lol "Upgrading Harvester causes the changes to the OS in the
after-install-chroot
stage to be lost. You must also configure the
after-upgrade-chroot
to make your changes persistent across an upgrade. Refer to Runtime persistent changes before upgrading Harvester." https://docs.harvesterhci.io/v1.5/advanced/csidriver/
w
Reminds me I need to get my powerchute in there!
p
that's the never-ending issue I keep being sad about: https://github.com/harvester/harvester/issues/4556
t
p
hah I never heard about this
apiVersion: node.harvesterhci.io/v1beta1
kind: CloudInit
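A fuller hedged sketch of that CloudInit CRD, with field names following the Harvester docs example; the resource name, filename and commands are hypothetical - verify against your Harvester version:

```yaml
# Hedged sketch based on the Harvester CloudInit CRD mentioned above.
apiVersion: node.harvesterhci.io/v1beta1
kind: CloudInit
metadata:
  name: powerchute-prereqs       # hypothetical
spec:
  matchSelector: {}              # empty selector = apply to every node
  filename: 99-powerchute.yaml   # written out on each node so it persists across upgrades
  contents: |
    stages:
      boot:
        - name: "example persistent change"
          commands:
            - echo "runs on every boot, survives the image swap during upgrades"
```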
yeah it's still a problem if you need drivers installed on the host OS
I think the "correct" solution is to fork harvester-installer?
t
couldn’t you use the cloudinit for installing the drivers?
p
maybe? e.g. zfs needs kernel modules ... to me I'd rather have that stuff baked into an iso than have a cloud-init try to do it on boot on top of the immutable OS 😅
w
Cloud-init would make sense; if you want to bake it in you'll need to run your own image build/registry - otherwise it could get quite manual and protracted with each update
p
yeah you need to be rebuilding them either way ... maybe build/package per-harvester then cloud-init vs. bake into harvester ¯\_(ツ)_/¯
w
@thousands-advantage-10804 said - "I have never seen minio used for volumes. I know it is an s3 service." I was referring to the volumes minio uses under the hood to store the data - it is an object store and we can use it for our backups, but we need quite a bit of space. Those volumes don't need replication - it sounds like I need to take the volumes out of longhorn and use a direct-mount driver, then hand them to minio so it can manage that storage itself. That's going to be a PITA given we currently run the data off 20 TiB drives and they are in use...
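One generic way to do the "direct mount" route inside whichever cluster ends up running MinIO is plain Kubernetes local PVs, one per physical disk, assuming the disks are already formatted and mounted on those nodes (or VMs) - paths, names and sizes below are hypothetical:

```yaml
# Hedged sketch: a no-provisioner StorageClass plus one static local PV per
# disk, so MinIO gets the drives directly with no replication layer underneath.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-minio
provisioner: kubernetes.io/no-provisioner   # static, node-local volumes only
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-node1-hdd0                    # hypothetical
spec:
  capacity:
    storage: 20Ti
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-minio
  local:
    path: /mnt/disks/hdd0                   # pre-mounted filesystem on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node1                     # hypothetical node name
```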
t
ah yes. also think about the layers. vms / longhorn / disk….
w
@thousands-advantage-10804 if that's the primary focus, bare metal with rancher management starts to make sense in the context of cloud - we only use harvester because we need some traditional VMs. It does sometimes beg the question: does the extra work and overhead justify the means?