# longhorn-storage
f
We're currently using Longhorn for replicated storage and TopoLVM for storage when replication isn't required (i.e., the underlying database does its own replication: Kafka, Elasticsearch, Postgres cluster). A couple of pain points with this setup are that we're using multiple CSIs, the backup strategies are different, and occasionally it's convenient to re-home a workload to a new node (i.e., someone didn't set up anti-affinity rules when creating a workload). One of our engineers brought up the idea of using a storage class in Longhorn that uses 1 replica and best-effort or strict-local data-locality. Is this a pattern that others are using? Beyond the standard "you only have 1 copy of data", are there any safety concerns with this pattern?
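For concreteness, the storage class we have in mind would look something like this (a sketch; the class name is made up, and the parameter keys are the ones Longhorn documents):

```yaml
# Sketch of a single-replica, data-local Longhorn StorageClass.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-local-1r          # illustrative name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "1"
  dataLocality: "best-effort"      # or "strict-local", which requires exactly 1 replica
  staleReplicaTimeout: "30"
```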
b
Strict-local can cause issues with migrations timing out. I wouldn't call it a safety concern, but it might make migrations take longer. If you're not doing live migrations, it might not be a concern.
๐Ÿ‘ 1
f
TY; have you run Longhorn volumes with replicas=1 before, or are you talking about strict-local in general, regardless of replica count?
b
Only ran replicas=1 as a test, but strict-local in general and fewer replicas could make things more painful.
We keep our LH volumes pretty small as we saw that anything over ~100 Gi tended to really slow down migrations/upgrades on our hardware
So we typically do boot volumes only. With it being that small the 3x replicas weren't a big deal. We use ceph rbd for data volumes.
f
Interesting; are you using Ceph-RBD via Rook, or something else?
b
I don't think I'd change that, because the replicas in Longhorn aren't just insurance against data loss; they also give all sorts of network/hardware resiliency to keep the VMs running vs. having to wait and spin up a new one. Yeah, the Percona cluster might keep the DB in HA, but we'd rather not deal with losing a node.
We're using an external ceph cluster
๐Ÿ‘ 1
We have different clusters for different classes of hardware depending on the need.
๐Ÿ‘ 1
f
For background, we're largely talking about container workloads and not VM workloads.
b
We use it for container workloads too, but that's even more of a reason to stay away from strict-local imho.
but I guess it depends on your scheduling.
Our stuff is on-demand JupyterHub/DataScience workloads for students so speed to launch is big for us.
(outside of using longhorn for Harvester)
f
re: even more of a reason to stay away from strict-local... yeah; I suppose we'll probably need to experiment a bit
b
Doing some sort of network filesystem instead of longhorn might be an easier route, but I guess it depends on your iops requirements.
f
A lot of the underlying workloads may not work well with network filesystems. Plus we're running an HCI stack, so we're not going to lean on an external storage system.
We're running k3s+longhorn on top of Proxmox-VE, so I've thought about trying the proxmox-csi.
It's nice to lean on our local NVMe drives for IOPS-heavy workloads, but we're just starting to put more IOPS-heavy workloads on Kubernetes and are still grappling with it
Thinking aloud a bit, I'm slightly hesitant to lean on the proxmox-csi because I expect in the future we may invert the "k8s on VM" stack to be "VM on k8s"
f
> One of our engineers brought up the idea of using a storage class in Longhorn that uses 1 replica and best-effort or strict-local data-locality.

Strict-local will make it so that the workload pod has to live on the same node as the replica; the pod cannot move.
With best-effort, Longhorn will move the replica to the pod's node when the pod moves, so there will be more rebuilding activity when workloads move.
Both have the benefit that the replica is on the same node as the workload, thus better performance.
IMO, you're probably good to stick with TopoLVM when replication is not needed, then use a backup solution like Velero or Kasten to handle multiple CSI providers.
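A minimal sketch of what that could look like with Velero (the names, namespace, and cron are just examples; assumes Velero's CSI snapshot support is enabled):

```yaml
# Sketch: one nightly Velero backup covering a namespace's PVCs,
# independent of which CSI driver (Longhorn, TopoLVM, ...) provisioned them.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-apps        # illustrative name
  namespace: velero
spec:
  schedule: "0 2 * * *"     # 02:00 daily
  template:
    includedNamespaces:
      - apps                # illustrative namespace
    snapshotVolumes: true   # snapshot PVs as part of the backup
```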
f
> IMO, you're probably good to stick with TopoLVM when replication is not needed, then use a backup solution like Velero or Kasten to handle all CSI providers
Any reason in particular that you're thinking of? Disk throughput? Not a situation Longhorn is designed around?
If data migration from TopoLVM is a concern, we're only running it on a dev cluster to sort out how we do local storage. We can swap it out pretty easily at this point
f
> Any reason in particular that you're thinking of? Disk throughput? Not a situation Longhorn is designed around?

If you don't need replication and your workload is already distributed, TopoLVM would be simpler and gives you native disk performance. Longhorn v2 will address the performance issue, but v2 is not GA yet.
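For comparison, a TopoLVM class is about as minimal as a StorageClass gets (a sketch; the name is made up, and it assumes the topolvm.io provisioner name from recent releases):

```yaml
# Sketch of a TopoLVM StorageClass for node-local LVM volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topolvm-local                      # illustrative name
provisioner: topolvm.io                    # older releases used topolvm.cybozu.com
volumeBindingMode: WaitForFirstConsumer    # schedule the pod first, then carve the LV on that node
allowVolumeExpansion: true
```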
f
We're in a position where we might be able to hold out on performance for a bit while the v2 data engine goes GA, in order to run two fewer operators (TopoLVM + Velero), but I'll take both options under consideration.
๐Ÿ‘ 1