# harvester
b
Hi, what type of PCI error is it showing in the logs? Can you show the output from
cat /proc/cmdline
o
The dmesg output
[177679.704466] pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
[177679.704471] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[177679.704472] pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
[177679.704473] pcieport 0000:00:01.0:    [ 0] RxErr
[177679.878352] atlantic 0000:04:00.0 enp4s0: atlantic: link change old 10000 new 0
[177679.910338] atlantic 0000:03:00.0 enp3s0: atlantic: link change old 10000 new 0
[177680.014435] mgmt-bo: (slave enp3s0): link status definitely down, disabling slave
[177680.014439] device enp3s0 left promiscuous mode
[177680.014481] mgmt-bo: now running without any active interface!
[177680.014541] mgmt-br: port 1(mgmt-bo) entered disabled state
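For anyone triaging similar logs, here is a minimal sketch of how the two event types in this excerpt could be counted. The function name `summarize_pci_events` is hypothetical, and the patterns are taken only from the lines quoted above, so other drivers may word their messages differently.

```shell
# Hypothetical helper: reads dmesg-style text on stdin and counts
# corrected AER reports and link-down transitions (new speed 0).
summarize_pci_events() {
    awk '
        /AER: Corrected error received/  { aer++ }
        /link change old [0-9]+ new 0/   { down++ }
        END { printf "aer_corrected=%d link_down=%d\n", aer, down }
    '
}
```

Typical use would be `dmesg | summarize_pci_events`, run before and after a change to see whether the counts keep growing.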
Output from /proc/cmdline
BOOT_IMAGE=(loop0)/boot/vmlinuz console=tty1 root=LABEL=COS_STATE cos-img/filename=/cOS/active.img panic=0 net.ifnames=1 rd.cos.oemlabel=COS_OEM rd.cos.mount=LABEL=COS_OEM:/oem rd.cos.mount=LABEL=COS_PERSISTENT:/usr/local rd.cos.oemtimeout=120 audit=1 audit_backlog_limit=8192 intel_iommu=on amd_iommu=on iommu=pt
b
@orange-businessperson-74585 Add the kernel boot option
pci=nommconf
This disables memory-mapped PCI configuration access (MMCONFIG) and reverts to the traditional access method. I use it here on my HP Z440 (openSUSE Tumbleweed). The entry goes in /etc/default/grub, then you can run
grub2-mkconfig -o /boot/grub2/grub.cfg
afterwards to update, then reboot. Or at boot you can edit grub and add the option for that boot to see how it goes first.
Also, what CPU is this? You have both amd_iommu and intel_iommu set.
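A minimal sketch of the "append the option only if it is missing" step, done on a plain string rather than on any real boot file. `add_karg` is a hypothetical helper. Where the resulting string has to be written (/etc/default/grub on openSUSE, possibly somewhere else on Harvester) is deliberately left out.

```shell
# Hypothetical helper: print the command line with the parameter appended,
# unless that exact parameter is already one of its words.
add_karg() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) printf '%s\n' "$cmdline" ;;         # already present
        *)            printf '%s\n' "$cmdline $param" ;;  # append once
    esac
}
```

For example, `add_karg "$(cat /proc/cmdline)" pci=nommconf` prints the running command line with the option added once, which can then be compared against the boot configuration.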
o
I have added the kernel parameter at boot time to try it out first. The CPU is an 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz.
b
So you should only need intel_iommu then.
b
You might want to update the firmware of that card too. The driver notes also say to disable LRO when routing/bridging.
o
LRO and GRO are disabled, but the problem still exists.
ethtool -K enp3s0 lro off
ethtool -K enp3s0 gro off
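To confirm that the settings took effect, the lowercase `ethtool -k` lists the current offload state. A small sketch, assuming the usual `feature: state` output lines; `offload_state` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and prints
# the state of one named feature; exits non-zero if the feature is not listed.
offload_state() {
    awk -v feature="$1" '
        { sub(/^[ \t]+/, "") }                 # sub-features are indented
        index($0, feature ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "")           # keep only the state text
            print
            found = 1
            exit
        }
        END { exit !found }
    '
}
```

Usage would look like `ethtool -k enp3s0 | offload_state large-receive-offload`. A state such as `off [fixed]` means the driver does not allow the feature to be changed.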
@bored-farmer-36655 the content of the cmdline is now as shown below. Should intel_iommu be switched on, switched off, or removed? And how do I change or remove these kernel parameters in Harvester? If I remove them from /etc/cos/bootargs.cfg, they are automatically added to this file again after a reboot.
BOOT_IMAGE=(loop0)/boot/vmlinuz console=tty1 root=LABEL=COS_STATE cos-img/filename=/cOS/active.img panic=0 net.ifnames=1 rd.cos.oemlabel=COS_OEM rd.cos.mount=LABEL=COS_OEM:/oem rd.cos.mount=LABEL=COS_PERSISTENT:/usr/local rd.cos.oemtimeout=120 audit=1 audit_backlog_limit=8192 intel_iommu=on amd_iommu=on iommu=pt pci=nommconf
b
@orange-businessperson-74585 did you run the grub2-mkconfig command after changing the options?
o
I mounted the boot partition and changed the parameters in the grub config file manually. The question now is what to do with the intel_iommu parameter. I removed it completely, but the network card still stops functioning after a while. Should intel_iommu be set to "off"?
b
No, it was just the amd_iommu one that should go, since you have an Intel CPU.
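A minimal sketch of dropping a single parameter such as amd_iommu=on from a command-line string while leaving the rest untouched. `remove_karg` is a hypothetical helper that works on a plain string only. It does not know how Harvester persists its boot arguments.

```shell
# Hypothetical helper: print the command line without any word that is
# exactly equal to the given parameter. Assumes the command line contains
# no shell glob characters, since it relies on plain word splitting.
remove_karg() {
    cmdline=$1
    param=$2
    out=
    for word in $cmdline; do
        [ "$word" = "$param" ] || out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}
```

For example, `remove_karg "$(cat /proc/cmdline)" amd_iommu=on` prints what the command line would look like with only the Intel IOMMU option left in place.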
o
With
intel_iommu=on
and
pci=nommconf
the network card still stops functioning after a while. Any other ideas or tips?
b
Did you try a firmware upgrade on the card, as @brainy-whale-97450 suggested?
o
I tried, but ended up with a Windows installation where I ran a Dell firmware update, and it did not flash the firmware on the NIC. There is not much information available on updating the firmware on this card.
b
From my quick searching, it looks like this card is not very well supported by Marvell anymore. Maybe it would be a lot faster and easier to go with a card that is well supported? https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=pcie+dua[…]10g&_sacat=0&LH_TitleDesc=0&_odkw=pcie+dual+rj45+19g&_osacat=0
o
I came to the same conclusion. Is there a list of supported hardware for Harvester?
b
@orange-businessperson-74585 what is the output from
cat /etc/os-release
?
o
@bored-farmer-36655 the output is
NAME="SLE Micro"
VERSION="5.3"
VERSION_ID="5.3"
PRETTY_NAME="Harvester v1.2.1"
ID="sle-micro-rancher"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sle-micro-rancher:5.3"
VARIANT="Harvester"
VARIANT_ID="Harvester-v1.2-20231023"
GRUB_ENTRY_NAME="Harvester v1.2.1"
b
@orange-businessperson-74585 See <https://www.suse.com/yessearch/>