# longhorn-storage
b
In addition, I installed a local-path provisioner and ran the kbench fio comparison to show what a massive discrepancy in speed there is between the raw NVMe and Longhorn. It's really dramatic:

```
TEST_FILE: /volume1/test
TEST_OUTPUT_PREFIX: Local-Path
TEST_SIZE: 30G
Benchmarking iops.fio into Local-Path-iops.json
Benchmarking bandwidth.fio into Local-Path-bandwidth.json
Benchmarking latency.fio into Local-Path-latency.json
TEST_FILE: /volume2/test
TEST_OUTPUT_PREFIX: Longhorn
TEST_SIZE: 30G
Benchmarking iops.fio into Longhorn-iops.json
Benchmarking bandwidth.fio into Longhorn-bandwidth.json
Benchmarking latency.fio into Longhorn-latency.json
================================
FIO Benchmark Comparison Summary
For: Local-Path vs Longhorn
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
================================
Local-Path vs Longhorn : Change
IOPS (Read/Write)
        Random:   189,291 / 119,199   vs    29,958 /  26,365 :  -84.17% /  -77.88%
    Sequential:   187,728 / 117,845   vs    50,813 /  42,189 :  -72.93% /  -64.20%
Bandwidth in KiB/sec (Read/Write)
        Random: 3,537,400 / 1,473,275 vs 1,229,721 / 311,051 :  -65.24% /  -78.89%
    Sequential: 5,469,850 / 1,475,119 vs   551,374 / 297,434 :  -89.92% /  -79.84%
Latency in ns (Read/Write)
        Random:    72,394 /    18,033 vs   458,683 / 429,503 :  533.59% / 2281.76%
    Sequential:    15,281 /    17,674 vs   382,695 / 418,362 : 2404.38% / 2267.10%
```
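As a sanity check, the percentage changes in that summary can be recomputed directly from the raw numbers. This is just a sketch; the `pct_change` helper is ours, not part of kbench:

```python
# Recompute kbench's "Change" column from the raw Local-Path vs Longhorn
# numbers above. pct_change is a hypothetical helper for illustration.

def pct_change(baseline: float, measured: float) -> float:
    """Percentage change of measured relative to baseline."""
    return (measured - baseline) / baseline * 100.0

# Random-read IOPS: Local-Path 189,291 vs Longhorn 29,958
print(f"{pct_change(189_291, 29_958):.2f}%")   # → -84.17%
# Random-write latency in ns: Local-Path 18,033 vs Longhorn 429,503
print(f"{pct_change(18_033, 429_503):.2f}%")   # → 2281.76%
```

Both match the summary, so the "5–10x overhead" framing later in the thread follows straight from these figures.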
p
Yeah, that sounds about right compared to what I'm experiencing right now.
Hopefully ™️ v2 engine will increase IOPS
🙏 1
b
So I've got the V2 volumes working on this Harvester cluster and I even smash-installed Longhorn v1.6.2 to get that performance-bottleneck patch. I'll post the results, but it's much faster. About 900 MB/s writes now. However, it's still quite a bit slower than the drives. I'll post the kbench shortly. So you're saying this is expected? I don't understand why the performance benchmarks on the wiki have seemingly better numbers on slower drives via Equinix bare-metal servers.
p
I'm not saying your results align with the wiki, but they do align with my own experience.
b
For sure. Understood
p
Also ofc even v2 will be slower than bare metal; there's still a middleman
Bare metal is using the kernel-space drivers, longhorn v2 is using some kind of FUSE with permanent monitoring
b
💯 It will be slower, but the overhead is 5-10x which is crazy. I get that it's in user space, but it still seems wrong
Especially compared to the official results
p
The theoretical advantage of longhorn is raw bandwidth, but what i'm myself looking at is IOPS
b
Where we're getting almost native speeds
Yeah. Both are compromised severely in my tests
p
Btw, the more replicas your volume has, the better the results
b
For read only
p
If you want the closest to-the-metal implementation, there's the strict-local data locality (which removes the main benefit of longhorn)
b
On V2. More replicas drop the write a little, which is what I'm looking to optimize right now
Correct, I've tested with strict local above
And other options
p
replica=1 and strict-local are not the same
b
I know
But that is the optimal config for write bandwidth
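For reference, the two knobs being discussed here are separate StorageClass parameters in Longhorn. This is a hedged sketch using the standard Longhorn parameter names, not the exact config from these tests (the class name is made up):

```yaml
# Sketch of a Longhorn StorageClass tuned for write bandwidth, per the
# discussion above: one replica plus strict-local data locality.
# Parameter names follow Longhorn's documented StorageClass parameters;
# the metadata.name is hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-strict-local   # hypothetical name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "1"         # no replication overhead on writes
  dataLocality: "strict-local"  # keep the sole replica on the attached node
```

Note that strict-local volumes must have exactly one replica, which is why the two settings travel together even though, as pointed out above, they are not the same thing.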
p
I don't see your test results for strict-local, I'm confused
Oh, also I just noticed you have only one node overall!
b
The kbench results above are replica 1 and strict local. It's a 1 node cluster to rule out network effects
p
Longhorn v2 scales hard with nodes iirc
b
Yes, for read it does
If you check the performance wiki you'll see how replicas affect read and write
That's what has set my expectations. I used better hardware and am getting worse results
p
nvm i read it backwards. You're correct
👍 1
b
I have to step away, but I'll be back in a couple hours
👋 1
p
Using longhorn is still overall a decent option for replicated storage across all datacenters.
b
It is, but I feel that there is something wrong here as the wiki gets only like a 10% overhead
So, it would be nice to see if some devs and users of this have comments and kbench results that compare and show the overhead differences.
p
All i know is my gitlab is slower on longhorn than it was on local disks
And my artifactory is not faster than it was on remote nfs storage
b
aye