# longhorn-storage
i
Do you want to use v1 volumes or try v2 volumes?
c
I installed Longhorn through the Helm chart and enabled v2. However, the Longhorn UI displays the v1 version (I think it is a UI bug), which makes me unsure whether it's actually v1 or v2. Yesterday, I ran performance tests and found that Longhorn's performance was hardly affected, so I assume it's using v2. @icy-agency-38675
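For reference, a minimal sketch of enabling the v2 data engine, assuming the chart exposes the corresponding default setting (key and setting names should be verified against the installed version):
# Helm values sketch (assumed key name)
defaultSettings:
  v2DataEngine: true
# or toggle the setting on a running install (assumed setting name)
kubectl -n longhorn-system patch settings.longhorn.io v2-data-engine --type merge -p '{"value": "true"}'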
i
Can you provide a support bundle?
c
Sure. BTW, I would like to ask: our business often involves disk expansion. Assuming the physical disks of machine A are already full, what should I do? Can you quickly clone a copy of a PVC from A onto machine B, assuming machine B has a larger disk? We plan to use v2 in the production system, deploying a large number of blockchain nodes and designing for storage growth. Because we are worried about PVC fragmentation, we plan to use RAID 0 and set replicas to 1. Do you have any good suggestions for this?
i
Can you quickly clone a copy of a PVC from A onto machine B, assuming machine B has a larger disk?
Currently, no.
BTW, I would like to ask: our business often involves disk expansion. Assuming the physical disks of machine A are already full, what should I do?
What's your disk? An LVM disk?
We plan to use V2 in the production system, deploying a large number of blockchain nodes and designing for storage growth.
Currently, the v2 volume is an alpha version for testing, so please do not use it in a production system.
Because we are worried about PVC fragmentation, we plan to use RAID 0 and set replicas to 1.
Can you elaborate more on the PVC fragmentation?
c
I conducted performance testing yesterday. The first run was on the operating system disk, the second on a Longhorn v1.5.4 volume (v1 engine), and the third on a Longhorn v1.6.0 volume (v2 engine).
# bare disk on physical machine
 fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=50G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 51200MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=480MiB/s,w=159MiB/s][r=123k,w=40.7k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=634423: Wed Feb 28 05:40:15 2024
  read: IOPS=120k, BW=469MiB/s (492MB/s)(37.5GiB/81838msec)
   bw (  KiB/s): min=452280, max=500440, per=100.00%, avg=480917.99, stdev=10057.50, samples=163
   iops        : min=113070, max=125110, avg=120229.50, stdev=2514.35, samples=163
  write: IOPS=40.0k, BW=156MiB/s (164MB/s)(12.5GiB/81838msec); 0 zone resets
   bw (  KiB/s): min=150824, max=166472, per=100.00%, avg=160266.85, stdev=3351.87, samples=163
   iops        : min=37706, max=41618, avg=40066.72, stdev=837.96, samples=163
  cpu          : usr=10.71%, sys=44.15%, ctx=4542828, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=9830837,3276363,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=469MiB/s (492MB/s), 469MiB/s-469MiB/s (492MB/s-492MB/s), io=37.5GiB (40.3GB), run=81838-81838msec
  WRITE: bw=156MiB/s (164MB/s), 156MiB/s-156MiB/s (164MB/s-164MB/s), io=12.5GiB (13.4GB), run=81838-81838msec

Disk stats (read/write):
  nvme0n1: ios=9808669/3269628, merge=0/257, ticks=2715615/1742618, in_queue=4458234, util=99.96%
# fio in pod (Longhorn v1.5.4, v1 engine)
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=50G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 51200MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=123MiB/s,w=41.5MiB/s][r=31.6k,w=10.6k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=68: Wed Feb 28 05:42:58 2024
  read: IOPS=31.3k, BW=122MiB/s (128MB/s)(37.5GiB/314319msec)
   bw (  KiB/s): min=104320, max=151968, per=100.00%, avg=125155.71, stdev=9316.09, samples=628
   iops        : min=26080, max=37992, avg=31288.89, stdev=2329.01, samples=628
  write: IOPS=10.4k, BW=40.7MiB/s (42.7MB/s)(12.5GiB/314319msec); 0 zone resets
   bw (  KiB/s): min=35104, max=50208, per=100.00%, avg=41711.13, stdev=3125.74, samples=628
   iops        : min= 8776, max=12552, avg=10427.74, stdev=781.43, samples=628
  cpu          : usr=5.96%, sys=26.34%, ctx=12104184, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=9830837,3276363,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=122MiB/s (128MB/s), 122MiB/s-122MiB/s (128MB/s-128MB/s), io=37.5GiB (40.3GB), run=314319-314319msec
  WRITE: bw=40.7MiB/s (42.7MB/s), 40.7MiB/s-40.7MiB/s (42.7MB/s-42.7MB/s), io=12.5GiB (13.4GB), run=314319-314319msec

Disk stats (read/write):
  sda: ios=9819399/3273370, merge=3273/463, ticks=15022532/4955509, in_queue=19978041, util=100.00%
# fio in pod (Longhorn v1.6.0, v2 engine)
root@fio-test-pod:/# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=50G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 51200MiB)

Jobs: 1 (f=1): [m(1)][100.0%][r=490MiB/s,w=164MiB/s][r=125k,w=42.0k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=78: Wed Feb 28 08:32:26 2024
  read: IOPS=119k, BW=465MiB/s (488MB/s)(37.5GiB/82504msec)
   bw (  KiB/s): min=436960, max=507144, per=100.00%, avg=476690.24, stdev=13328.70, samples=164
   iops        : min=109240, max=126786, avg=119172.60, stdev=3332.15, samples=164
  write: IOPS=39.7k, BW=155MiB/s (163MB/s)(12.5GiB/82504msec); 0 zone resets
   bw (  KiB/s): min=146000, max=169992, per=100.00%, avg=158864.34, stdev=4360.70, samples=164
   iops        : min=36500, max=42498, avg=39716.10, stdev=1090.17, samples=164
  cpu          : usr=9.80%, sys=43.71%, ctx=4394736, majf=0, minf=6
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=9830837,3276363,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=465MiB/s (488MB/s), 465MiB/s-465MiB/s (488MB/s-488MB/s), io=37.5GiB (40.3GB), run=82504-82504msec
  WRITE: bw=155MiB/s (163MB/s), 155MiB/s-155MiB/s (163MB/s-163MB/s), io=12.5GiB (13.4GB), run=82504-82504msec
i
Looks good!
c
I haven't decided which approach to use yet. Perhaps LVM is a good option, or maybe you have a better suggestion? The issue with PVC fragmentation is as follows: suppose I add two disks as block devices, each 3.5 TB. I cannot directly request a volume larger than 3.5 TB, right? Because it cannot be split across the two disks. Yesterday, I tested the filesystem approach, and indeed it behaves like this. This means that as my pod's data keeps growing, even though I can see plenty of disk space in the UI, I cannot expand the volume. This is also why I want to set up RAID 0, to treat the disk capacity as a whole. Thank you for your patient explanation.
i
The data engine of the volume pvc-d870fe3c-e330-4ce6-a1cc-056e95d1bc10 is v1:
    name: pvc-d870fe3c-e330-4ce6-a1cc-056e95d1bc10
    namespace: longhorn-system
    resourceVersion: "495815"
    uid: 53f07169-334f-43de-852d-5b5d5523701c
  spec:
    Standby: false
    accessMode: rwo
    backendStoreDriver: "null"
    backingImage: "null"
    backupCompressionMethod: lz4
    dataEngine: v1
You need to create a StorageClass for v2 volumes; see longhorn.io/docs/1.6.0/v2-data-engine/quick-start/#create-a-storageclass (a minimal example is sketched below).
👍 1
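For reference, a minimal StorageClass sketch for v2 volumes based on the quick-start doc above (the name, replica count, and fsType here are assumptions; adjust to your setup):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-v2          # assumed name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "1"      # assumption: matches the replicas-1 plan discussed above
  staleReplicaTimeout: "2880"
  fsType: "ext4"
  dataEngine: "v2"
PVCs that set storageClassName: longhorn-v2 should then be provisioned with dataEngine: v2.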
I haven't decided which approach to use yet. Perhaps LVM is a good option, or maybe you have a better suggestion?
LVM might be a good option for adjusting the size of the underlying disk. If the disk is full, you can extend it easily. (This really depends on your plan for the underlying disks; see the LVM sketch after this message.) If you really need to expand the disks, you can put the node into maintenance mode: longhorn.io/docs/1.6.0/maintenance/maintenance/
The issue with PVC fragmentation is as follows: suppose I add two disks as block devices, each 3.5 TB. I cannot directly request a volume larger than 3.5 TB, right?
Yes
This is also why I want to set up RAID 0, to treat disk capacity as a whole for utilization.
Thank you for your patient explanation.
Yes, you're right. Using RAID 0 is good for your use case.
👍 1
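A rough sketch of the LVM-based expansion idea above, assuming the Longhorn disk sits on an LVM logical volume (volume group, LV, and device names are hypothetical):
pvcreate /dev/sdb                                       # new physical disk
vgextend vg-longhorn /dev/sdb                           # add it to the volume group
lvextend -l +100%FREE /dev/vg-longhorn/longhorn-disk    # grow the logical volume
resize2fs /dev/vg-longhorn/longhorn-disk                # only for a v1 filesystem disk: grow the filesystem on top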
c
Thank you for your answer; it's been very helpful. I'll go ahead and run the tests now. I noticed that Longhorn says not to use v2 in production systems, but the performance improvement is quite significant. Could you tell me what I need to be aware of if I do use the v2 engine in production? @icy-agency-38675
i
You can check the roadmap, but it might be updated over time according to our development progress.
🙏 1
c
Hi @icy-agency-38675, I just tried adding LVM block devices to Longhorn, but it failed, reporting that the block device could not be added. Then I created a RAID 0 array and added /dev/md0 in the Longhorn UI, and it was recognized. Does this mean Longhorn v2 doesn't support LVM? However, a new issue emerged: kubelet is reporting mount timeouts, and there are no error events on the PV or PVC.
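For context, a minimal sketch of creating such a RAID 0 array with mdadm (device names are hypothetical; adjust to your disks):
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1   # stripe two disks into one array
mdadm --detail --scan >> /etc/mdadm/mdadm.conf    # persist so it reassembles on boot (config path varies by distro)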
i
Can you check the nvme_tcp and uio drivers on the worker nodes?
lsmod | grep nvme
lsmod | grep uio
c
They all existed
i
nvme_tcp and uio_pci_generic are missing. Please run these commands on all worker nodes:
modprobe nvme-tcp
modprobe uio_pci_generic
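To keep these modules loaded across reboots, one option on systemd-based distros is a modules-load.d entry (the file name is arbitrary):
cat <<EOF > /etc/modules-load.d/longhorn-v2.conf
nvme-tcp
uio_pci_generic
EOF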
c
Sorry @icy-agency-38675 :( My DevOps script had an issue. When I try an LVM block device, I get this error in the Longhorn UI: admission webhook "validator.longhorn.io" denied the request: disk disk-3 type block is not supported to reserve storage. Figure 3 shows the benchmark I ran. May I ask whether it's possible to use LVM with the Longhorn v2 engine?
i
Yeah, a disk from LVM is OK.
c
I tried again, but it still doesn't work. Can you help me see where the problem is? @icy-agency-38675
i
Can you generate a support bundle? I will check it later.
c
Oh, I've found the issue. By default, /dev/mapper/xxx is a symbolic link that points to /dev/dm-0. When I entered that device path in Longhorn, it was recognized correctly. I've updated the benchmark, and it seems there's some decline in performance. Are there any optimization parameters that can be adjusted? All of my machines have 10 Gbps NICs. Longhorn is cool. Anyway, thank you very much for taking the time to help me.
🙌 1
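For reference, one way to check which device node a device-mapper symlink resolves to (the LV name below is hypothetical):
ls -l /dev/mapper/
readlink -f /dev/mapper/vg--longhorn-lv--disk    # resolves to e.g. /dev/dm-0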
i
You're welcome. Feel free to let us know if there's anything we can help with.
👍 1
c
Hi @icy-agency-38675, we are a company specializing in blockchain solutions, and we are currently discussing whether to incorporate Longhorn into our production services. Currently, we are using RKE2, Rancher, and Longhorn to manage over a dozen machines running more than 30 different public blockchains, with a total data size of nearly 100 TB. I would like to inquire whether Longhorn has enterprise support available, even if it is on a paid basis. We are particularly concerned about stability since we are not a company specialized in storage technologies, especially considering that we are directly using the v2 engine. Thank you for your assistance.
i
@cool-architect-86201 Thanks for asking. I think @salmon-doctor-9726 will help answer this question later 🙂
c
Hi @icy-agency-38675, because v2 doesn't support expansion, I have switched to the v1 engine. When I try to add /dev/md0, it tells me that SPDK is not installed. Does the v1 engine not support block devices?
i
Hello, the v1 engine doesn't support adding block devices.
👌 1
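Since the v1 engine only takes filesystem disks, one possible workaround is to format and mount the array and add the mount point as a node disk (paths are hypothetical):
mkfs.ext4 /dev/md0
mkdir -p /var/lib/longhorn-disk1
mount /dev/md0 /var/lib/longhorn-disk1            # add to /etc/fstab to persist across reboots
# then add /var/lib/longhorn-disk1 as a filesystem disk on the node in the Longhorn UI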