# harvester
b
yuuup
rebooting is only a temp fix.
at least for me
Did you dig into this any more @ambitious-knife-27333? Were you ever able to grab logs of the serial port write failures? I had a ticket open with SUSE, but they're giving me crap, saying it works in openSUSE so I should take it up with Alma.
Actually it seems like 9.3 is the only one affected? Ugh. We'll see.
a
@bland-article-62755 afraid we’ve not dug in more. We’re definitely seeing this behaviour in Alma 9.4 as well.
b
Granted, I'm on Harvester 1.2.2, which is actually newer than 1.3.0
But I downloaded the 9.4 Alma image from a few days ago, and I can't get it to reproduce currently.
With our older 9.3 image it's easy to replicate (takes less than 5 minutes).
I thought it might be related to the qemu-guest-agent version (8.2.2) but Fedora 40 and Ubuntu 24.04 LTS are on that version too, and I can't replicate it there either.
My current thought on what might be different is that virtual serial port. I'm guessing that's handled in the kernel, so maybe the issue is there? IDK. I'm still running tests and I'm kinda hopeful I can get it to trigger on any of the other OSes.
Can you tell me what version you get from `qemu-ga --version`?
Ok, turns out the ones that are wacky are version 8.2.0 and it seems to be fixed in 8.2.2
SUSE Support said 8.2.4, but that's not what I'm seeing in my testing.
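If anyone wants to compare boxes quickly, here's a rough sketch of how the version check could be scripted. It assumes `qemu-ga --version` prints something like `QEMU Guest Agent 8.2.0` (distro builds may append package info), and the labels only reflect what was seen in this thread, not an official list of good or bad versions. The function names are made up for illustration.

```shell
#!/bin/sh
# Pull the first dotted x.y.z version out of a string such as
# "QEMU Guest Agent 8.2.0" (the exact output format is an assumption).
extract_qga_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Label a version using only the observations from this thread:
# 8.2.0 misbehaved, 8.2.2 looked fine, anything else was not tested.
classify_qga_version() {
    case "$1" in
        8.2.0) echo "suspect" ;;
        8.2.2) echo "looked-ok" ;;
        *)     echo "untested" ;;
    esac
}

# Only query the agent when the binary is actually installed.
if command -v qemu-ga >/dev/null 2>&1; then
    v=$(extract_qga_version "$(qemu-ga --version)")
    echo "qemu-ga $v: $(classify_qga_version "$v")"
fi
```

Run it inside the guest. On a box without `qemu-ga` it prints nothing.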
Ok... I'm about ready to call it quits. After running `dnf update -y ; systemctl reboot`, the frequency with which the IP is there/missing is inverted: after updating and rebooting, it's there most of the time instead of gone most of the time in my cluster.
I haven't been able to get the IP to vanish on any of the Ubuntu or Fedora boxes running `8.2.2`. @ambitious-knife-27333 I'm happy to help with testing if you have another idea.
a
So I pulled the latest version of Alma (5th Aug) and OOTB this does not appear to be a problem. This version has agent version `8.2.0-11` in it. However, you can make the issue occur by restarting `qemu-guest-agent`. If you restart the VM, then it seems to be okay again.
In essence, I think it is the agent restart that causes this. Are you using the default `cloud-init` config in Harvester?
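To time how long the IP stays missing after an agent restart, something like the sketch below could be run from a machine with `kubectl` access to the cluster. It assumes the missing IP is the one KubeVirt reports under `.status.interfaces[*].ipAddress` on the VMI. The VM name, namespace, and the `RUN_WATCH` switch are placeholders, not anything from Harvester itself.

```shell
#!/bin/sh
# Hypothetical defaults; replace with your VM's name and namespace.
VMI_NAME="${VMI_NAME:-alma-test}"
VMI_NS="${VMI_NS:-default}"

# Read the IPs currently reported for the VMI (empty when missing).
get_vmi_ips() {
    kubectl get vmi "$VMI_NAME" -n "$VMI_NS" \
        -o jsonpath='{.status.interfaces[*].ipAddress}' 2>/dev/null
}

# Turn the reported value into a simple state label.
ip_state() {
    if [ -n "$1" ]; then echo "present"; else echo "missing"; fi
}

# Poll <count> times, <interval> seconds apart, one line per sample.
watch_vmi_ip() {
    count="$1"; interval="$2"; i=0
    while [ "$i" -lt "$count" ]; do
        ips=$(get_vmi_ips)
        echo "$(date -u +%H:%M:%S) $(ip_state "$ips") $ips"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Run only when explicitly asked, e.g.: RUN_WATCH=1 sh watch-ip.sh
if [ "${RUN_WATCH:-0}" = "1" ]; then
    watch_vmi_ip 60 5
fi
```

Start the watch, then run `systemctl restart qemu-guest-agent` inside the guest and see whether the samples flip to `missing` and how long they stay that way.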
b
I am. I did get 1/15 instances using that same version to replicate the issue, but it was very brief.
The IP was missing for less than 30 seconds. I'm guessing they patched it somehow, but I have no idea how, as the RPM version seems to be the same as before.