bland-article-62755
08/25/2025, 9:09 PM
bland-article-62755
08/25/2025, 9:12 PM
[Mon Aug 25 21:07:13 2025] Loaded X.509 cert 'Nvidia GPU OOT signing 001: 55e1cef88193e60419f0b0ec379c49f77545acf0'
[Mon Aug 25 21:07:13 2025] Loaded X.509 cert 'AlmaLinux NVIDIA Module Signing: 4a47544e27f990e063664d7d0dd9c2158954567a'
[Mon Aug 25 21:07:17 2025] nvidia: loading out-of-tree module taints kernel.
[Mon Aug 25 21:07:17 2025] nvidia-nvlink: Nvlink Core is being initialized, major device number 238
[Mon Aug 25 21:07:17 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 21:07:17 2025] nvidia: probe of 0000:0a:00.0 failed with error -1
[Mon Aug 25 21:07:17 2025] NVRM: The NVIDIA probe routine failed for 1 device(s).
[Mon Aug 25 21:07:17 2025] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 580.65.06 Release Build (mockbuild@) Tue Aug 5 16:02:49 EDT 2025
[Mon Aug 25 21:07:17 2025] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 580.65.06 Release Build (mockbuild@) Tue Aug 5 16:02:38 EDT 2025
[Mon Aug 25 21:07:17 2025] [drm] [nvidia-drm] [GPU ID 0x00000b00] Loading driver
[Mon Aug 25 21:07:19 2025] [drm] Initialized nvidia-drm 0.0.0 for 0000:0b:00.0 on minor 1
[Mon Aug 25 21:07:19 2025] nvidia 0000:0b:00.0: [drm] No compatible format found
[Mon Aug 25 21:07:19 2025] nvidia 0000:0b:00.0: [drm] Cannot find any crtc or sizes
bland-article-62755
08/25/2025, 9:16 PM
# dmesg -T | egrep -i 'nvidia|nvrm|xid'
[Mon Aug 25 20:50:31 2025] nvidia: loading out-of-tree module taints kernel.
[Mon Aug 25 20:50:31 2025] nvidia: module license 'NVIDIA' taints kernel.
[Mon Aug 25 20:50:31 2025] nvidia: module license taints kernel.
[Mon Aug 25 20:50:31 2025] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR5 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 20:50:31 2025] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 575.57.08 Sat May 24 07:21:16 UTC 2025
[Mon Aug 25 20:50:31 2025] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 575.57.08 Sat May 24 06:52:56 UTC 2025
[Mon Aug 25 20:50:31 2025] [drm] [nvidia-drm] [GPU ID 0x00000a00] Loading driver
[Mon Aug 25 20:50:31 2025] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:0a:00.0 on minor 0
[Mon Aug 25 20:50:31 2025] [drm] [nvidia-drm] [GPU ID 0x00000b00] Loading driver
[Mon Aug 25 20:50:31 2025] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:0b:00.0 on minor 1
[Mon Aug 25 20:50:32 2025] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[Mon Aug 25 20:51:18 2025] caller os_map_kernel_space+0x120/0x130 [nvidia] mapping multiple BARs
[Mon Aug 25 20:51:18 2025] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x24:0x72:1589)
[Mon Aug 25 20:51:18 2025] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
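To see at a glance which device and which BARs the driver is rejecting, the NVRM lines above can be summarized with a short script. This is a minimal sketch: the function name `invalid_bars` is made up for illustration, and the regex only assumes the `BARn is 0M @ 0x0 (PCI:...)` format that appears in the paste above.

```python
import re
from collections import defaultdict

# Matches the NVRM continuation line, e.g.
# "NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)"
BAR_LINE = re.compile(
    r"NVRM: BAR(?P<bar>\d+) is (?P<size>\d+)M @ (?P<addr>0x[0-9a-fA-F]+) "
    r"\(PCI:(?P<dev>[0-9a-fA-F:.]+)\)"
)


def invalid_bars(dmesg_text):
    """Return {pci_address: sorted BAR indices that NVRM reported with size 0}."""
    found = defaultdict(set)
    for match in BAR_LINE.finditer(dmesg_text):
        if int(match.group("size")) == 0:
            found[match.group("dev")].add(int(match.group("bar")))
    return {dev: sorted(bars) for dev, bars in found.items()}


# Lines copied from the dmesg paste above.
sample = """\
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
[Mon Aug 25 20:50:31 2025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR5 is 0M @ 0x0 (PCI:0000:0a:00.0)
"""

print(invalid_bars(sample))  # {'0000:0a:00.0': [1, 2, 5]}
```

In both pastes only 0000:0a:00.0 is flagged; 0000:0b:00.0 never shows up in an "invalid" message.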
bland-article-62755
08/25/2025, 9:25 PM
nvidia-smi -L will only ever list one at a time. If you only add 1 of 2, it will always work/show up. But with two we always see the "PCI regions are invalid" messages or "Cannot find any crtc or sizes".
bland-article-62755
08/25/2025, 9:45 PM
Interrupt: pin A routed to IRQ 23
Is there a way to unset that or change it?
thousands-advantage-10804
08/25/2025, 11:18 PM
bland-article-62755
08/25/2025, 11:23 PM
thousands-advantage-10804
08/25/2025, 11:24 PM
thousands-advantage-10804
08/25/2025, 11:24 PM
bland-article-62755
08/25/2025, 11:36 PM
thousands-advantage-10804
08/25/2025, 11:37 PM
bland-article-62755
08/25/2025, 11:42 PM
thousands-advantage-10804
08/25/2025, 11:43 PM
thousands-advantage-10804
08/25/2025, 11:44 PM
bland-article-62755
08/25/2025, 11:44 PM
bland-article-62755
08/25/2025, 11:44 PM
bland-article-62755
08/25/2025, 11:44 PM
thousands-advantage-10804
08/25/2025, 11:46 PM
bland-article-62755
08/25/2025, 11:50 PM
bland-article-62755
08/25/2025, 11:50 PM
0a:00.0 3D controller: NVIDIA Corporation GA100GL [A30 PCIe] (rev a1)
Subsystem: NVIDIA Corporation GA100GL [A30 PCIe]
Physical Slot: 0-10
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 23
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Region 3: Memory at 39080000000 (64-bit, prefetchable) [size=32M]
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nvidia_drm, nvidia
0b:00.0 3D controller: NVIDIA Corporation GA100GL [A30 PCIe] (rev a1)
Subsystem: NVIDIA Corporation GA100GL [A30 PCIe]
Physical Slot: 0-11
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 23
Region 0: Memory at f9000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 38000000000 (64-bit, prefetchable) [size=32G]
Region 3: Memory at 38800000000 (64-bit, prefetchable) [size=32M]
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nvidia_drm, nvidia
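Comparing the two devices in the lspci paste above: 0a:00.0 has no Region 1 line, which lines up with the "BAR1 is 0M @ 0x0" message from the driver. A small sketch that makes the comparison mechanical is below. The helper names `regions_by_device` and `missing_regions` are hypothetical, and the parser only assumes the `Region N: Memory at ... [size=...]` layout shown in the paste.

```python
import re

# Device header line, e.g. "0a:00.0 3D controller: ..." (optional PCI domain prefix).
DEVICE = re.compile(r"^(?P<dev>(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
# Memory BAR line, e.g. "Region 1: Memory at 38000000000 (64-bit, prefetchable) [size=32G]"
REGION = re.compile(
    r"Region (?P<idx>\d+): Memory at (?P<addr>[0-9a-f]+) \([^)]*\) "
    r"\[size=(?P<size>\d+[KMGT]?)\]"
)


def regions_by_device(lspci_text):
    """Return {device: {region_index: size_string}} from `lspci -vvv` text."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        header = DEVICE.match(line)
        if header:
            current = header.group("dev")
            result[current] = {}
            continue
        region = REGION.search(line)
        if region and current is not None:
            result[current][int(region.group("idx"))] = region.group("size")
    return result


def missing_regions(regions):
    """Return regions that at least one device has but another device lacks."""
    all_indices = set()
    for per_device in regions.values():
        all_indices |= set(per_device)
    missing = {}
    for dev, per_device in regions.items():
        gap = sorted(all_indices - set(per_device))
        if gap:
            missing[dev] = gap
    return missing


# Trimmed copy of the lspci output above.
sample = """\
0a:00.0 3D controller: NVIDIA Corporation GA100GL [A30 PCIe] (rev a1)
        Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
        Region 3: Memory at 39080000000 (64-bit, prefetchable) [size=32M]
0b:00.0 3D controller: NVIDIA Corporation GA100GL [A30 PCIe] (rev a1)
        Region 0: Memory at f9000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at 38000000000 (64-bit, prefetchable) [size=32G]
        Region 3: Memory at 38800000000 (64-bit, prefetchable) [size=32M]
"""

regions = regions_by_device(sample)
print(missing_regions(regions))  # {'0a:00.0': [1]}
```

The missing region is the 32G one, so the card that fails is the one whose largest BAR never got an address.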
thousands-advantage-10804
08/25/2025, 11:51 PM
bland-article-62755
08/25/2025, 11:51 PM
bland-article-62755
08/25/2025, 11:52 PM
thousands-advantage-10804
08/25/2025, 11:52 PM
thousands-advantage-10804
08/25/2025, 11:53 PM
bland-article-62755
08/25/2025, 11:53 PM
bland-article-62755
08/25/2025, 11:56 PM
bland-article-62755
08/25/2025, 11:56 PM
gpu3:~ # lspci -d 10de: -vvv | grep IRQ
Interrupt: pin A routed to IRQ 1209
Interrupt: pin A routed to IRQ 1210
Interrupt: pin A routed to IRQ 1217
Interrupt: pin A routed to IRQ 1218
bland-article-62755
08/25/2025, 11:57 PM
ubuntu@gpu3test:~$ lspci -d 10de: -vvv | grep IRQ
Interrupt: pin A routed to IRQ 23
Interrupt: pin A routed to IRQ 23
bland-article-62755
08/25/2025, 11:57 PM
[root@alma9gpu3test ~]# lspci -d 10DE: -vvv | grep IRQ
Interrupt: pin A routed to IRQ 23
Interrupt: pin A routed to IRQ 23
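The three IRQ pastes above can be checked for shared lines with a few lines of Python. This is only a sketch (`shared_irqs` is a made-up name), and it says nothing about whether the shared legacy INTx line is actually the cause; it just confirms that the host gives each device its own IRQ while both guests report IRQ 23 twice.

```python
import re
from collections import Counter

# Matches "Interrupt: pin A routed to IRQ 23" as printed by `lspci -vvv`.
IRQ_LINE = re.compile(r"routed to IRQ (\d+)")


def shared_irqs(lspci_text):
    """Return {irq_number: device_count} for IRQs used by more than one device."""
    counts = Counter(int(number) for number in IRQ_LINE.findall(lspci_text))
    return {irq: n for irq, n in counts.items() if n > 1}


# Output copied from the host (gpu3) and from the Ubuntu guest (gpu3test).
host_output = """\
Interrupt: pin A routed to IRQ 1209
Interrupt: pin A routed to IRQ 1210
Interrupt: pin A routed to IRQ 1217
Interrupt: pin A routed to IRQ 1218
"""
guest_output = """\
Interrupt: pin A routed to IRQ 23
Interrupt: pin A routed to IRQ 23
"""

print(shared_irqs(host_output))   # {}
print(shared_irqs(guest_output))  # {23: 2}
```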
thousands-advantage-10804
08/25/2025, 11:57 PM
bland-article-62755
08/25/2025, 11:58 PM
5.14.0-570.35.1.el9_6.x86_64
bland-article-62755
08/25/2025, 11:58 PM
thousands-advantage-10804
08/25/2025, 11:58 PM
bland-article-62755
08/25/2025, 11:58 PM
ubuntu@gpu3test:~$ uname -a
Linux gpu3test 6.8.0-78-generic #78-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 12 11:34:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@gpu3test:~$ uname -r
6.8.0-78-generic
thousands-advantage-10804
08/25/2025, 11:59 PM
bland-article-62755
08/25/2025, 11:59 PM
thousands-advantage-10804
08/26/2025, 12:00 AM
bland-article-62755
08/26/2025, 12:00 AM
thousands-advantage-10804
08/26/2025, 12:00 AM
bland-article-62755
08/26/2025, 12:00 AM
bland-article-62755
08/26/2025, 12:00 AM
bland-article-62755
08/26/2025, 12:01 AM
thousands-advantage-10804
08/26/2025, 12:01 AM
bland-article-62755
08/26/2025, 12:01 AM
bland-article-62755
08/26/2025, 12:02 AM
thousands-advantage-10804
08/26/2025, 12:03 AM
bland-article-62755
08/26/2025, 12:03 AM
thousands-advantage-10804
08/26/2025, 12:03 AM
thousands-advantage-10804
08/26/2025, 12:03 AM
bland-article-62755
08/26/2025, 12:09 AM
bland-article-62755
08/26/2025, 12:09 AM
thousands-advantage-10804
08/26/2025, 12:11 AM
thousands-advantage-10804
08/26/2025, 12:11 AM
bland-article-62755
08/26/2025, 12:12 AM
thousands-advantage-10804
08/26/2025, 12:12 AM
bland-article-62755
08/26/2025, 12:13 AM
prehistoric-morning-49258
08/26/2025, 8:12 AM
rhythmic-article-81903
08/26/2025, 9:25 AM
bland-article-62755
08/26/2025, 1:22 PM
rhythmic-article-81903
08/26/2025, 1:22 PM
bland-article-62755
08/26/2025, 1:23 PM
rhythmic-article-81903
08/26/2025, 1:24 PM
bland-article-62755
08/26/2025, 10:58 PM
nvidia-smi always detects only 1 card. They all have the same IRQ number. I'm out of ideas and am starting to think about just opening a bug report.
rhythmic-article-81903
08/26/2025, 11:13 PM
rhythmic-article-81903
08/26/2025, 11:16 PM
bland-article-62755
08/26/2025, 11:30 PM
bland-article-62755
08/26/2025, 11:33 PM
bland-article-62755
08/27/2025, 12:47 AM
bland-article-62755
08/29/2025, 5:03 AM
bland-article-62755
08/29/2025, 5:05 AM
bland-article-62755
08/29/2025, 5:06 AM
cpu-model: host-passthrough then the MMIO reservation on most modern systems would be sufficient to easily handle multiple GPUs.
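For a rough idea of how much 64-bit MMIO space two of these cards need, the BAR sizes from the lspci paste for 0b:00.0 (Region 1 at 32G, Region 3 at 32M) can be added up. The sketch below does that arithmetic. Rounding the total up to the next power of two is an assumption used here as a conservative rule of thumb, not something the thread or any particular firmware confirms, and the helper names are made up.

```python
# Size suffixes as printed by lspci, in bytes.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert an lspci size string such as '32G' or '16M' to bytes."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def next_power_of_two(n):
    """Smallest power of two that is >= n (assumed rounding rule, see lead-in)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


# 64-bit prefetchable BARs of one A30, taken from the lspci output for 0b:00.0.
prefetchable_bars = ["32G", "32M"]
gpu_count = 2  # two A30s attached to the VM

per_gpu = sum(parse_size(size) for size in prefetchable_bars)
total = per_gpu * gpu_count
window = next_power_of_two(total)

print(f"{total / 2**30:.4f} GiB needed, {window // 2**30} GiB window")
# 64.0625 GiB needed, 128 GiB window
```

A single card needs just over 32 GiB, so a window that fits one card but not two would match the "1 of 2 always works" behavior described earlier in the thread.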