gentle-actor-15883
06/06/2025, 6:26 AMInitiatorName
, but after the update, the IQN has changed.
Is it possible that the Harvester update caused this change, or is it something managed by Kubernetes? This change is causing issues with adding disks from our QSAN storage server, as the updated IQN no longer matches the expected configuration.
Thank you in advance for your time and support!
Best regards,Damyanfull-night-53442
06/06/2025, 3:27 PMgentle-actor-15883
06/06/2025, 3:30 PMiscsiadm
, not using a CSI driver.full-night-53442
06/06/2025, 3:32 PMbland-article-62755
06/06/2025, 3:35 PM/oem/90_custom.yaml
and write a script that uses iscsiadm to perform the actions after every reboot, but best practice would be to use the CSI driver.
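For example, an untested sketch of what that /oem/90_custom.yaml could contain (the portal address and target IQN are placeholders for your QSAN values):
name: "iSCSI login on boot"
stages:
  network:
    - name: "Discover and log in to QSAN targets"
      commands:
        - iscsiadm -m discovery -t sendtargets -p <portal-ip>:3260
        - iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 --login
ambitious-daybreak-95996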
06/09/2025, 12:44 AMambitious-daybreak-95996
06/09/2025, 12:46 AM/etc/iscsi/initiatorname.iscsi
because those changes should now persist. For other versions, as @bland-article-62755 said, you can add yaml to /oem
to override it (see suggested workaround in the description of that issue).ambitious-daybreak-95996
06/09/2025, 12:46 AMgentle-actor-15883
06/09/2025, 12:52 PMambitious-daybreak-95996
06/10/2025, 6:49 AMgentle-actor-15883
06/10/2025, 9:26 AMambitious-daybreak-95996
06/11/2025, 1:20 AM/etc/iscsi/initiatorname.iscsi
on each host, so if you really need to you could set them all back to the same name, BUT each host really is meant to have a different name, so it would be best to have unique names for each host, and update the config on the storage server to allow connections from all hosts (not just the one name).
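For example, on each host it's a one-line file (standard open-iscsi location; the IQN below is just an illustration):
cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2016-04.com.open-iscsi:some-unique-suffix
If you want a fresh unique name rather than inventing one, iscsi-iname will generate a value in that format.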
I haven't tried backing Longhorn with iSCSI myself, so I'm interested to hear how it works for youfull-night-53442
06/11/2025, 1:25 AMambitious-daybreak-95996
06/11/2025, 1:40 AMfull-night-53442
06/11/2025, 1:43 AMambitious-daybreak-95996
06/11/2025, 1:44 AMambitious-daybreak-95996
06/11/2025, 1:45 AMgentle-actor-15883
06/11/2025, 9:02 AMgentle-actor-15883
06/11/2025, 12:45 PMDISK_PATH=$(readlink -f /dev/disk/by-path/ip-192.168.70.10:3260-iscsi-iqn.2004-08.com.qsan:xs3216-000d47038:dev0.ctr1-lun-0)
mount "${DISK_PATH}" /var/lib/harvester/extra-disks/79de259774dce4983b72c9ddb950793c
It just doesn’t seem to work when I use a variable for the path, and I’m not sure why.
Do you have any idea what might be causing this? Or am I possibly doing something wrong in the way I'm using variables?ambitious-daybreak-95996
06/12/2025, 12:42 AMcommands:
  - DISK_PATH=$(readlink ...)
  - mount "${DISK_PATH}" ...
I suspect each of those commands is run separately, i.e. in separate instances of a shell. If that's the case, there's no way for a variable to propagate from one to the other. Try instead something like this:
commands:
  - |
    DISK_PATH=$(readlink ...)
    mount "${DISK_PATH}" ...
...or maybe:
commands:
  - DISK_PATH=$(readlink ...) ;
    mount "${DISK_PATH}" ...
...to combine the variable declaration and use into one single command. Note: I haven't tested the above, so it's possible there are errors in the yaml formatting, but you get the idea 🙂ambitious-daybreak-95996
06/12/2025, 1:23 AMgentle-actor-15883
06/12/2025, 7:58 AMEXPORT
, but I couldn’t get it to work. Either I’m missing something, or it’s not possible to use variables in the Elemental syntax. Still, I suspect I’m doing something wrong.
Lastly, do you know if there’s a way to force a VM to boot from a specific disk? It would really help with testing iSCSI disks more efficiently.ambitious-daybreak-95996
06/25/2025, 5:42 AMfio
inside running VMs. At a lower level, the Longhorn folks have https://github.com/longhorn/longhorn/wiki/Performance-Benchmarkambitious-daybreak-95996
06/25/2025, 6:15 AMgentle-actor-15883
06/25/2025, 6:24 AM/dev/sdX
paths. Since multipath uses /dev/mapper/mpathX
, Longhorn isn’t detecting those multipath devices.
We need to resolve this issue first. If my managers haven’t decided to move away from Harvester, I’ll be more than happy to provide you with full test results on the iSCSI disks.
I’ll review the information you’ve shared and will try to get back to you with some test results as soon as I can.
Thanks again!ambitious-daybreak-95996
06/25/2025, 6:45 AMambitious-daybreak-95996
06/25/2025, 6:45 AMambitious-daybreak-95996
06/25/2025, 6:50 AMmkfs /dev/mapper/whatever
then mount that somewhere, then use the Longhorn GUI (not the Harvester GUI) to add that disk to Longhorn (see https://longhorn.io/docs/1.8.2/nodes-and-volumes/nodes/multidisk/). The trick is going to be getting those re-mounted automatically after boot. For that you'd need to add some custom yaml under /oem
to mount the multipath devices on boot.ambitious-daybreak-95996
06/25/2025, 6:55 AMambitious-daybreak-95996
06/25/2025, 6:56 AMgentle-actor-15883
06/25/2025, 7:06 AMambitious-daybreak-95996
06/25/2025, 7:12 AMstages:
  fs:
    - commands:
        - mkdir -p /path-you-want-to-mount
        - mount -o [..options here..] /dev/mapper/your-multipath-disk /path-you-want-to-mount
        - mkdir -p /another-path-you-want-to-mount
        - mount [...you get the idea...]
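The one-time prep before that (done once by hand, not in the boot yaml) is roughly the mkfs + first mount mentioned above; the filesystem type here is just an example:
mkfs.ext4 /dev/mapper/your-multipath-disk
mkdir -p /path-you-want-to-mount
mount /dev/mapper/your-multipath-disk /path-you-want-to-mount
After that, point the Longhorn GUI at /path-you-want-to-mount as an additional disk.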
ambitious-daybreak-95996
06/25/2025, 7:14 AMambitious-daybreak-95996
06/25/2025, 7:16 AMambitious-daybreak-95996
06/25/2025, 7:20 AMambitious-daybreak-95996
06/25/2025, 7:21 AMgentle-actor-15883
06/25/2025, 7:38 AMLast Transition Time: 8 days ago
Message: multipathd is running with a known issue that affects Longhorn. See description and solution at
<https://longhorn.io/kb/troubleshooting-volume-with-multipath>
Reason: MultipathdIsRunning
Status: False
From what I understand, this might be resolved by adding the following to the multipath config:
blacklist {
    devnode "^sd[a-z0-9]+"
}
But I’m concerned this could break the iSCSI disk connections. Here's the issue: if I apply this config, multipath might stop working properly, because upon reboot, each iSCSI target reconnects as a new
sdX
device — and those names change every time.
I’m thinking of working around this by dynamically resolving the correct device with something like:
readlink -f /dev/disk/by-path/ip-<iscsi_session>-lun-0
That way I can identify the current
sdX
device and make sure it's not blacklisted, but that’s still a bit messy. I’m open to better ideas if you have any.
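Roughly what I have in mind (untested; the by-path link is the same placeholder as above, and the conf.d file name is just something I made up):
# resolve which sdX device the iSCSI session currently maps to
DEV=$(readlink -f /dev/disk/by-path/ip-<iscsi_session>-lun-0)
# then whitelist only that node name so a broad sd blacklist doesn't catch it
cat > /etc/multipath/conf.d/qsan-exception.conf <<EOF
blacklist_exceptions {
    devnode "^$(basename "$DEV")$"
}
EOF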
Also, one thing I’m not sure about: does this warning from Longhorn actually impact production usage? I’m not entirely sure what role multipath plays in Longhorn internally, and whether this warning is just informational or could cause actual issues.
For reference, here's what my current multipath setup looks like:
mpathd (32017001378101100) dm-14 Qsan,XS3216
size=500G features='0' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 16:0:0:0 sde 8:64 active ready running
`-+- policy='service-time 0' prio=50 status=enabled
`- 17:0:0:0 sdf 8:80 active ready running
I’ll start testing now to see if everything works as expected.gentle-actor-15883
06/25/2025, 7:40 AMambitious-daybreak-95996
06/27/2025, 6:46 AM/etc/multipath/conf.d/99-longhorn.conf
which explicitly blacklists longhorn devices by vendor (IET) and product (VIRTUAL-DISK)ambitious-daybreak-95996
06/27/2025, 6:50 AMgentle-actor-15883
06/27/2025, 7:11 AMmultipath
service detects all sdX
disks, which can lead to complications. At one point, I even encountered a critical error where the OEM layer and filesystem failed to mount, but I believe I will be able to adjust the configuration so it will eventually work.
In general, multipath
can easily break things and it is tricky to configure it in a way that doesn't cause node issues, but I am hopeful I will manage it. Other than that, adding disks through the Longhorn UI works without problems, and I just need to ensure they are mounted automatically on startup, which I have some ideas on how to handle.
One thing I couldn't fully understand is why disks added directly through the Longhorn UI do not have the Provisioned
tag under the node's storage in the Harvester UI, while disks added through the Harvester UI do have this tag. Does this tag play an important role? The disk itself doesn't show errors, and I was able to create and run a VM on it without issues.
Regarding multipath, as far as I have seen, Longhorn itself does not use multipath, since it is disabled by default in the GRUB configuration. However, Longhorn creates additional paths to the disks, which might confuse it if it relies on unique identifiers to find disks. Since multipath
can use the same names and WWIDs, this could potentially be the root cause of these issues, but I am not 100% certain yet as I am still investigating.ambitious-daybreak-95996
06/27/2025, 7:33 AMblacklist {
    device {
        vendor "IET"
        product "VIRTUAL-DISK"
    }
}
If you apply that to v1.4.3 it should blacklist the LH devices, without messing up any other /dev/sd[x] devices
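If you want that to survive reboots on 1.4.3, you could have /oem write it out as a conf.d drop-in; untested sketch (the 99-longhorn.conf name just mirrors what the newer version ships):
name: "Blacklist Longhorn devices in multipath"
stages:
  fs:
    - files:
        - path: /etc/multipath/conf.d/99-longhorn.conf
          permissions: 0644
          owner: 0
          group: 0
          content: |
            blacklist {
                device {
                    vendor "IET"
                    product "VIRTUAL-DISK"
                }
            }
ambitious-daybreak-95996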
06/27/2025, 7:34 AMambitious-daybreak-95996
06/27/2025, 7:35 AMambitious-daybreak-95996
06/27/2025, 7:36 AMgentle-actor-15883
06/30/2025, 7:44 AMmultipath
by adding the necessary GRUB variables and rebooted the node.
2️⃣ Manually attached the iSCSI disk
Initially, I attached the iSCSI disk manually to retrieve its wwid
.
3️⃣ Created a temporary multipath.conf
multipaths {
    multipath {
        wwid 32017001378101100
        alias qsan
        path_grouping_policy multibus
        path_selector "round-robin 0"
        failback manual
        rr_weight priorities
        no_path_retry 5
    }
}
4️⃣ Added the disk via Longhorn UI
Mounted the disk at:
/var/lib/harvester/extra-disks/qsan500g
5️⃣ Created a YAML for automated mounting
The YAML:
• creates a persistent multipath.conf
,
• ensures the multipath device is properly mapped,
• mounts it automatically after confirming device readiness.
⚠️ Important note:
I disabled multipathd
autostart because it causes race conditions when mounting COS_GRUB
, COS_OEM
, COS_RECOVERY
, COS_STATE
, COS_PERSISTENT
, and HARV_LH_DEFAULT
. Instead, I manually start multipathd
after everything else is ready.
Current problem needing input
I cannot blacklist the OS disk in multipath.conf
. It keeps being claimed by multipath instead of staying a plain sdX
device, and since it is actively in use, I cannot easily blacklist it. Do you know a clean way to prevent multipath from claiming the OS disk while keeping the system stable?
Here is the clean YAML I am currently using:
name: "Configure multipath for QSAN"
stages:
fs:
- name: "Create multipath.conf"
files:
- path: /etc/multipath.conf
content: |
defaults {
user_friendly_names yes
find_multipaths yes
path_checker tur
fast_io_fail_tmo 2
dev_loss_tmo 5
}
blacklist {
wwid 3600605b011e738702b752a444f9c5b1a
wwid SPCC_Solid_State_Disk_A20231228S301KG01384
wwid ST9500325AS_6VE7J1WW
}
multipaths {
multipath {
wwid 32017001378101100
alias qsan
path_grouping_policy group_by_prio
path_selector "round-robin 0"
failback immediate
rr_weight priorities
no_path_retry 5
}
}
permissions: 0644
owner: 0
group: 0
- name: "Create QSAN multipath mount script"
files:
- path: /usr/local/bin/qsan_mount.sh
content: |
#!/bin/bash
echo "Waiting for iSCSI LUN with WWID 32017001378101100..."
for i in {1..12}; do
if ls /dev/disk/by-id/ | grep -q "32017001378101100"; then
echo "iSCSI LUN detected."
break
else
echo "Retrying in 5s..."
sleep 5
fi
done
echo "Starting multipathd..."
systemctl unmask multipathd
systemctl start multipathd
echo "Reloading multipath maps..."
multipath -r
echo "Waiting for /dev/mapper/qsan..."
for i in {1..12}; do
if [ -e /dev/mapper/qsan ]; then
echo "QSAN multipath device ready."
fsck -y /dev/mapper/qsan || true
if ! mountpoint -q /var/lib/harvester/extra-disks/qsan500g; then
mkdir -p /var/lib/harvester/extra-disks/qsan500g
mount /dev/mapper/qsan /var/lib/harvester/extra-disks/qsan500g
else
echo "QSAN disk already mounted."
fi
break
else
echo "QSAN device not ready, retrying in 5s..."
sleep 5
fi
done
echo "Multipath setup complete."
permissions: 0755
owner: 0
group: 0
network:
- name: "Run QSAN multipath mount script"
commands:
- bash /usr/local/bin/qsan_mount.sh
Next steps:
I am now starting tests to observe the system’s behavior when a controller path failure occurs on the QSAN side to see how gracefully multipath handles reconnections and failover.
Let me know if you have any suggestions on the OS disk blacklisting issue or improvements for stability during path failures.
Thanks!gentle-actor-15883
07/01/2025, 6:07 AMnode.conn[0].timeo.noop_out_timeout
(default 15)
• node.conn[0].timeo.noop_out_interval
(default 10)
• node.session.timeo.replacement_timeout
(default 120)
To speed up failure detection and failover, I modified them as follows:
• node.conn[0].timeo.noop_out_timeout
from 15 to 2
• node.conn[0].timeo.noop_out_interval
from 10 to 1
• node.session.timeo.replacement_timeout
from 120 to 2
I applied these changes using this YAML snippet:
name: "ISCSI configuration"
stages:
boot:
- name: "Set replacement_timeout in iscsid.conf"
commands:
- iscsiadm -m node --op update -n node.conn[0].timeo.noop_out_timeout -v 2
- iscsiadm -m node --op update -n node.conn[0].timeo.noop_out_interval -v 1
- iscsiadm -m node --op update -n node.session.timeo.replacement_timeout -v 2
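For reference, the same values can also go into /etc/iscsi/iscsid.conf so that newly discovered targets pick them up by default (the iscsiadm commands above are still needed for node records that already exist):
node.conn[0].timeo.noop_out_timeout = 2
node.conn[0].timeo.noop_out_interval = 1
node.session.timeo.replacement_timeout = 2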
Additionally, I adjusted the multipath configuration for faster path failover:
• fast_io_fail_tmo set to 2 seconds
• dev_loss_tmo set to 5 seconds
This allows the system to more quickly mark a path as faulty and switch to the operational path without interrupting workloads.
After applying these changes, I re-ran the tests, and the VM continued operating without interruption. The tests did not cause the VM to pause, disconnect, or crash.
For testing, I used Rocky Linux 9 with the fio package:
First test (lighter workload):
[global]
ioengine=libaio
direct=1
runtime=600
time_based
group_reporting
numjobs=1
iodepth=8
size=500M
[randrw]
rw=randrw
rwmixread=70
bs=4k
Second test (heavier workload):
[global]
ioengine=libaio
direct=1
runtime=600
time_based
group_reporting
numjobs=1
iodepth=32
size=1G
[randread]
rw=randread
bs=4k
[randwrite]
rw=randwrite
bs=4k
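Each job file was run with a plain fio invocation from a directory on the disk under test (the mount point and job-file names below are placeholders, shown for illustration):
cd /mnt/qsan-test          # mount point of the test disk inside the VM
fio randrw-light.fio       # first job file above
fio randread-randwrite.fio # second job file above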
During the tests, I manually toggled the server ports connected to one of the QSAN controllers multiple times, and failover was smooth after reducing the timeouts.
Observations:
• Kubernetes generates a very high number of I/O operations, and if the disk connection is not restored within 5–7 seconds, the node may mark the volume as lost.
• Since the volumes are iSCSI devices, they disconnect quickly, requiring faster reaction times than the default Kubernetes behavior.
• I have not yet tested this in a real cluster environment with multiple virtual machines performing different workloads on the disks simultaneously. If issues occur under such conditions, one possible approach could be reducing the timeouts further to 1 second to ensure faster failover. However, this carries the risk of treating transient micro-outages as failures and will increase server load due to more frequent connection checks.