# longhorn-storage
b
cc @famous-journalist-11332 @icy-agency-38675
👀 2
i
> data transfer for the images
What are the images?
f
> What are the images?
The linked ticket https://github.com/harvester/harvester/issues/2894 refers to downloading the Harvester images and RKE2 images. This would go to the host disk directly, in this function: https://github.com/connorkuehl/harvester/blob/66f69d251913516fd776b9521ce2b6f22ceea80e/package/upgrade/upgrade_node.sh#L105-L126 Do you mean this preload image step is slow, @bland-book-71921?
🤔 1
i
> it refers to downloading the Harvester images and RKE2 images.
Yes. That's why I'm asking: is it related to a Longhorn volume?
f
> Yes. That's why I'm asking: is it related to a Longhorn volume?
Sorry, if it is downloaded to the host disk directly and it is slow, shouldn't it be unrelated to the Longhorn volume?
i
> host disk
I'm confused. Is the host disk on a host machine or on a longhorn volume?
f
I think it is on the bare metal node, because this is a Harvester upgrade. I could be wrong, though.
b
It is on the bare metal nodes, running the standard Harvester upgrade process.
i
No worries. cc @salmon-city-57654
f
@bland-book-71921 Can you help me run one experiment?
1. Access the Longhorn UI
2. Create a 40G test volume with 3 replicas
3. Attach it to a bare metal node using the Longhorn UI
4. Format a filesystem on the created block device at /dev/longhorn/<volume-name>
5. Run kbench on it
Then repeat the test with 1 replica, attaching the volume to the same node as the replica.
👍 2
b
cc @prehistoric-balloon-31801
b
ok. one minute
i
@famous-journalist-11332 One question. Why do we need to benchmark the Longhorn volume?
b
Yeah. So I get ~300 MB/s using replica=3 Longhorn v1 provisioned volumes on the Harvester k8s cluster when running kbench. When I run kbench on a k8s cluster created on Harvester VMs, I get < 100 MB/s.
i
I see. It's clear. Thank you.
b
I’ve got a few problems right now though: 1. Because I’ve been trying to upgrade this, I actually got it past the download by running the install a few times and relying on cached images. It went through and is now trying to upgrade the nodes, and has put one of them in cordoned mode (this is another problem, as some of the VMs won’t migrate, but ignore it)… so Longhorn can only create degraded volumes on 2 nodes.
f
> Yeah. So I get ~300 MB/s using replica=3 Longhorn v1 provisioned volumes on the Harvester k8s cluster when running kbench.
>
> When I run kbench on a k8s cluster created on Harvester VMs, I get < 100 MB/s.
So there are 2 concerns:
1. Why can Longhorn only deliver 300 MB/s? What is the expectation here? We want to compare it with running kbench with local-provisioner on the Harvester cluster.
2. Why does the Harvester layer drop it to 100 MB/s?
i
> When I run kbench on a k8s cluster created on harvester vms, I get < 100 MB/s
What's your disk? replica=3 as well? virtio disk?
b
@famous-journalist-11332 yes. I am the same person who is talking to you over github. So, the local-provisioner gives me >1GB/s read and write as per the issue. I assume that I should get something much better than 10% with this very expensive hardware and 10GB network.
f
LOL. Is this ticket https://github.com/longhorn/longhorn/issues/3037 @bland-book-71921?
b
@icy-agency-38675 I am running this on the virtualized k8s cluster: https://github.com/longhorn/kbench/blob/main/deploy/fio.yaml
👍 1
@famous-journalist-11332 correct. Yeah. It seems this issue is getting me at every corner
👍 1
f
Ok. I think I will focus on the Longhorn side of the issue and take Harvester out of the investigation for now. Harvester members can help with the upper layer.
👍 1
b
yes. totally reasonable
f
i
The latest result?
b
That was on a test node where I had one node cluster + replica=1 and harvester 1.3.0
👍 2
Right now, I’m using my “production” nodes running v1.2.1, replica=3
f
Ok. Sounds good. Doing a 1-node experiment will remove a lot of factors we need to consider. From this report https://github.com/longhorn/longhorn/issues/3037#issuecomment-2136154149, which metric concerns you the most? I know several of them look bad.
b
bandwidth, I believe.
f
Write bandwidth?
✅ 1
b
This is part of my mess now. I don’t have that 1 node anymore. I just have this production cluster =(
👀 1
f
Can we somehow get a node for the investigation? We want to test different versions/modifications of Longhorn, and that doesn't sound like something we can do on prod.
b
Yes, that’s fair. Let me get back to you. I may be able to restore the previous configuration. I just have to shuffle things.
OK, how should I prepare it? Harvester 1.2.1? Or 1.3.0?
They have different versions of Longhorn,
but, we will be dealing with v1 volumes.
f
And from the latest report, the number for
iscsi + tgt + local file
is almost the same as for
iscsi + tgt + longhorn engine
https://github.com/longhorn/longhorn/issues/3037#issuecomment-2140519381 which makes me think the bottleneck is not in longhorn-engine. Any idea about the next step, @icy-agency-38675?
> OK, how should I prepare it? Harvester 1.2.1? or 1.3.0?
Can we remove Harvester from the picture for now and install Longhorn 1.6.2 directly? https://longhorn.io/docs/1.6.2/deploy/install/install-with-kubectl/ @bland-book-71921
1. Install rke2
2. Install Longhorn 1.6.2
b
ok!
i
I feel the bottleneck is in iSCSI.
s
Just started tracking this thread. If you would like to install Harvester to test alongside, please use v1.2.2
checking the previous discussion…
i
Can you generate a support bundle? We'll see if we can find a similar machine for further investigation on our side.
👍 2
b
Where should I send the bundle?
i
f
longhorn-support-bundle@suse.com or you can create a DM with me, @icy-agency-38675 @salmon-city-57654 @billowy-painting-56466
👍 2
Got the support bundle.
> Can you generate a support bundle? We'll see if we can find a similar machine for further investigation on our side.
@icy-agency-38675 I am curious: how can we find machine info from the support bundle?
i
> I am curious: how can we find machine info from the support bundle?
IIRC, the CPU and memory info are recorded in the Longhorn support bundle. I just found that this info is not in Harvester's support bundle.
👀 1