I have a v1 PVC on longhorn 1.6.4 that is Size 2Ti...
# longhorn-storage
f
I have a v1 PVC on longhorn 1.6.4 that is Size 2Ti and Actual Size 3.04Ti. Is there a way to trigger a consolidation? The data is pretty much read-only, and because the actual size is so big, the replica won't fit anywhere, so I'm stuck with a single replica for it.
b
Is it image backed?
that might be part of it.
Outside of that, I think you have to remove snapshots and do an fstrim,
Which I think you can do from inside the OS/pod it's attached to.
f
Filesystem backed. Unfortunately fstrim is not available with talos linux. I'm pretty sure it's the snapshots (one 2Tib and one 543Mib), but I don't know how to tell longhorn to consolidate them.
b
There's an fstrim option in the Longhorn UI.
There's nothing to consolidate with the snapshots, you just have to delete the ones you don't need
f
There's no delete button?
b
It's in the volume view.
f
Yes, that's trim, and trim doesn't work on talos because it assumes host OS stuff that just doesn't exist.
b
From what I remember, I thought it was a pod that was actually running the executable. Do you mean there are kernel modules or something missing from the OS?
You can do the fstrim at the VM level if not.
f
Talos Linux is an immutable Linux distribution specifically designed for running Kubernetes, and it does not have tools installed (no ssh, no shell, nothing). Longhorn assumes that fstrim is available on the OS. This is the error message:
[nsenter --mount=/host/proc/35862/ns/mnt --net=/host/proc/35862/ns/net fstrim /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/3668e84ba422459eeb2730536e3d1f9f102a617a885922fb055d7539531cb8dc/globalmount], output , stderr nsenter: failed to execute fstrim: No such file or directory : exit status 127
Given that, and the fact that there is no ssh, there's no way to run fstrim on the node (which is bare metal btw, not a VM).
b
Yeah, I'm familiar with Talos; we were running a Sidero cluster briefly. There CAN be ssh. As for the VM level, you can do it in the pods as well. We're using Longhorn with KubeVirt, but I've done it with other normal pods as well.
`siderolabs/util-linux-tools`: this extension enables linux tool to be available to all nodes. For example, the `fstrim` binary is used for SUSE Storage volume trimming.
f
yes, but I would have to a. find one that has the required tooling, and b. figure out what the commands are and given these snapshots are "system hidden" according to the longhorn ui, that's not even guaranteed to do anything.
b
The two that suse says are required are literally listed in the docs.
You have to use `fstrim` to release the empty blocks and reclaim that space. For example, I can fill up a block device with `/dev/zero` and it'll take up all 2TB. If I make snapshots or backups at that point, it doesn't matter if I delete or clear up space; Longhorn is going to keep track of those blocks. At that point any changes to the disk are going to result in the Actual Size being bigger than the volume. If I filled it up by writing to a file or something, I have to delete the file, then run `fstrim` to tell the kernel to release those unneeded blocks. Like on an SSD, they're not real deletes; it's more like the blocks were symlinked. The `fstrim` is what actually frees up the space on the PV. This isn't just how Longhorn works either. The Ceph CSI works almost exactly the same way, but they just don't have a fancy way to reclaim the unused space automagically. You can make a new pod, or a job that does this if you really don't want to install the documented extensions for Talos, but it's the only way (that I'm aware of) to actually do what you're asking.
If you want more reading, here's an article.
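(Editor's sketch) The "new pod, or a job" route mentioned above could look roughly like the following. It only builds a Job manifest as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. Everything specific here is an assumption, not something from the thread: the PVC name `my-data`, the namespace, the `/data` mount path, and the `alpine:3.20` image (assumed to ship an `fstrim` binary). The container is marked privileged because the trim ioctl needs elevated capabilities; the namespace may have to allow privileged pods, and a ReadWriteOnce volume that is already mounted elsewhere would force the Job onto the same node.

```python
import json


def build_fstrim_job(pvc_name: str, namespace: str, image: str = "alpine:3.20") -> dict:
    """Build a batch/v1 Job manifest that mounts a PVC and runs fstrim on it.

    The image default and the /data mount path are illustrative assumptions.
    """
    mount_path = "/data"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"fstrim-{pvc_name}", "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry automatically if the trim fails
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "fstrim",
                            "image": image,
                            # -v makes fstrim report how much it trimmed
                            "command": ["fstrim", "-v", mount_path],
                            # trimming a mounted filesystem needs elevated privileges
                            "securityContext": {"privileged": True},
                            "volumeMounts": [
                                {"name": "data", "mountPath": mount_path}
                            ],
                        }
                    ],
                    "volumes": [
                        {
                            "name": "data",
                            "persistentVolumeClaim": {"claimName": pvc_name},
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical PVC name and namespace; pipe the output to `kubectl apply -f -`.
    print(json.dumps(build_fstrim_job("my-data", "default"), indent=2))
```

As the rest of the thread shows, a trim only addresses part of the Actual Size; blocks still referenced by snapshots stay allocated until those snapshots are removed.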
f
Yeah, I understand fstrim roughly, I just don't know for a fact that installing fstrim and running it will delete those system hidden snapshots, and given the machine has persistent data on it, and the reason I want rid of those snapshots is to allow a second replica, it's not a low-effort thing to install the extension.
b
> running it will delete those system hidden snapshots
Nope. At least not the k8s objects one. It's only part of the Actual Size problem.
f
right, okay, well I'm running fstrim from a container now, let's see what size it brings it down to, but yeah, the snapshots could still be a fairly big problem.
yeah, still 2.96Tib of 2Tib. 😞
b
Still lower than 3 something
f
yeah, but I need it pretty close to 2Tib to schedule.
b
Delete the snaps or backups and it should lower it.
f
I can't, they're system hidden
it's extremely irritating given there's 3724.2 Gi free on an appropriate disk in the cluster and it just won't schedule.
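(Editor's sketch) To put the numbers from the thread side by side, the small calculation below converts the reported Actual Size from TiB to GiB and compares it with the free space on the disk. It is plain arithmetic only. It does not model Longhorn's scheduler, which also takes its reserved-space and over-provisioning settings into account, so a positive result here does not mean a replica will schedule. The function names are made up for illustration.

```python
GIB_PER_TIB = 1024  # binary units: 1 TiB = 1024 GiB


def tib_to_gib(tib: float) -> float:
    """Convert tebibytes to gibibytes."""
    return tib * GIB_PER_TIB


def raw_headroom_gib(free_gib: float, actual_size_tib: float) -> float:
    """Free space left on the disk if the replica used its full Actual Size."""
    return free_gib - tib_to_gib(actual_size_tib)


if __name__ == "__main__":
    # Figures quoted in the thread: 2.96 TiB actual size, 3724.2 GiB free.
    print(f"actual size: {tib_to_gib(2.96):.2f} GiB")
    print(f"raw headroom: {raw_headroom_gib(3724.2, 2.96):.2f} GiB")
```

On raw numbers alone the replica would fit, which suggests the scheduling check is stricter than a simple free-space comparison.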
b
kubectl get snapshots -A
f
okay, I can just kubectl delete them?
b
I have ¯\_(ツ)_/¯
f
Thank you so much! Deleting one of them has caused it to schedule!
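(Editor's sketch) For anyone repeating this, the helper below takes the JSON printed by `kubectl get snapshots.longhorn.io -A -o json` and lists the snapshot objects belonging to one volume, largest first, along with the delete command for each. The field names `spec.volume` and `status.size` are assumptions about the Longhorn Snapshot resource, and the function names are hypothetical. The volume name is the Longhorn volume, which is usually the PV name rather than the PVC name. Nothing is executed; the commands are only returned as strings so they can be reviewed first.

```python
import json
from typing import NamedTuple


class SnapshotInfo(NamedTuple):
    namespace: str
    name: str
    size_bytes: int


def snapshots_for_volume(kubectl_json: str, volume: str) -> list[SnapshotInfo]:
    """Pick the snapshot objects for one volume out of a kubectl JSON list.

    Assumes each item carries spec.volume and (optionally) status.size.
    """
    items = json.loads(kubectl_json).get("items", [])
    found = []
    for item in items:
        if item.get("spec", {}).get("volume") != volume:
            continue
        meta = item["metadata"]
        # status.size may be missing; treat that as 0 bytes
        size = int(item.get("status", {}).get("size", 0) or 0)
        found.append(SnapshotInfo(meta["namespace"], meta["name"], size))
    # Largest first, since those are the ones holding the most space
    return sorted(found, key=lambda s: s.size_bytes, reverse=True)


def delete_command(snap: SnapshotInfo) -> str:
    """Return (not run) the kubectl command that would delete this snapshot."""
    return f"kubectl -n {snap.namespace} delete snapshots.longhorn.io {snap.name}"
```

A typical use would be to save the kubectl output to a file, pass its contents to `snapshots_for_volume`, and check each candidate before running the printed command, since a deleted snapshot cannot be brought back.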
b
🎉
b
> Filesystem backed. Unfortunately fstrim is not available with talos linux.
[nsenter --mount=/host/proc/35862/ns/mnt --net=/host/proc/35862/ns/net fstrim /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/3668e84ba422459eeb2730536e3d1f9f102a617a885922fb055d7539531cb8dc/globalmount], output , stderr nsenter: failed to execute fstrim: No such file or directory : exit status 127
@few-lawyer-99266, as @bland-article-62755 said, Longhorn supports file system trim on Talos. You will need to install the `siderolabs/util-linux-tools` extension.
• https://longhorn.io/docs/1.8.0/advanced-resources/os-distro-specific/talos-linux-support/#system-extensions
• https://github.com/longhorn/longhorn/blob/master/enhancements/20230814-talos-linux-support.md#fstrim