# harvester-dev
l
Posting here
f
Sure
l
Still getting my VM set up so it might be a little while
f
No problem! Let me know if this works for you, too. I need to check whether the issue persists on KubeVirt 0.56.0 (current latest) and figure out why it doesn't affect minikube deployments. I'll probably put together a PR on KubeVirt.
👍 1
l
Okay I got my KubeVirt set up, VM is getting assigned to node with GPU, but I've still got some config issues, documenting everything and I'll let you know as soon as I try your workaround
👌🏼 1
I booted up a VM, then shut it down. Then I added:
Copy code
hostDevices:
          - deviceName: nvidia.com/GeForce1060
            name: gpu1
to the
spec.domain.devices
part of the VM's YAML config
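For fuller context, here is roughly where that snippet lands. This is a sketch of a VirtualMachineInstance, where the path is literally spec.domain.devices; on a VirtualMachine the same block sits under spec.template.spec. The VM name and memory request are illustrative:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: pcitest            # illustrative name
spec:
  domain:
    devices:
      hostDevices:
        - deviceName: nvidia.com/GeForce1060   # resource name from the thread
          name: gpu1
      # disks, interfaces, etc. continue here
    resources:
      requests:
        memory: 4Gi        # illustrative
```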
When I booted up I got this error: failed to setup container for group 14: No available IOMMU models
ohh, I wonder if it's the GPU/Audio thing
I passed through my VGA device and audio card; I need to tell KubeVirt to allow the audio to be passed through as well
f
you use the nvidia kubevirt device plugin?
l
no
I have externalResourceProvider set to false
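For context, externalResourceProvider lives in the KubeVirt CR's permittedHostDevices list, which is also where the audio function would need its own entry. A sketch using the vendor:device IDs that appear in the lspci diff later in the thread; the audio resource name here is illustrative:

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    permittedHostDevices:
      pciHostDevices:
        - pciVendorSelector: "10de:1c02"            # GP106 VGA controller
          resourceName: nvidia.com/GeForce1060
          externalResourceProvider: false
        - pciVendorSelector: "10de:10f1"            # GP106 HD Audio function
          resourceName: nvidia.com/GeForce1060Audio # illustrative name
          externalResourceProvider: false
```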
f
Ok, it makes sense if your devices have a single bus address. That's not always the case; some FPGA boards, for example, expose their management and user PFs separately.
l
Yeah, I'll have to make some tutorials for certain hardware and add troubleshooting docs
Still getting the "no available IOMMU models" error. I don't have the
vfio_iommu_type1
module loaded
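For reference, a quick way to check for and load the module. These commands are a sketch; they assume a systemd host with VFIO built as loadable modules:

```shell
# See which VFIO modules are currently loaded.
lsmod | grep vfio

# Load the IOMMU backend for this boot.
sudo modprobe vfio_iommu_type1

# Persist it across reboots (systemd modules-load convention).
echo vfio_iommu_type1 | sudo tee /etc/modules-load.d/vfio.conf
```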
f
So, the hack worked right?
I'm not sure about that new issue, but I guess you can try just passing through something simple, e.g. the USB controller.
It should be loaded together with
vfio-pci
.
l
Yeah, although GPU passthrough is the thing all our customers want, so I want to work through a real GPU example before I go forward. After loading the vfio_iommu_type1 module, I get a different error (progress?)
Copy code
failed to setup container for group 14: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x55ead073e260, 0x100000000, 0x79c00000, 0x7f64e2200000) = -12 (Cannot allocate memory)
is that a capability problem?
another manifestation of the missing SYS_RESOURCE capability?
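Likely related: -12 is -ENOMEM, and for vfio_dma_map that usually means pinning guest RAM for DMA hit the process's locked-memory limit (RLIMIT_MEMLOCK), which CAP_SYS_RESOURCE lets libvirt raise. A sketch for inspecting the limit inside the compute container (the pod name is illustrative; PID 1 there is virt-launcher, and QEMU inherits its limits unless libvirt adjusts them):

```shell
# Dump the locked-memory ceiling seen inside the virt-launcher "compute"
# container; vfio_dma_map fails with -ENOMEM once pinned pages exceed it.
kubectl exec virt-launcher-pcitest-abcde -c compute -- \
  sh -c 'grep "locked memory" /proc/1/limits'
```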
f
Please check the virt-launcher pod: how many containers are in there? Is the "compute" container the one that logs this? Does it have the SYS_RESOURCE capability?
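Those three questions can be answered with kubectl; a sketch (the pod name is illustrative):

```shell
POD=virt-launcher-pcitest-abcde   # illustrative pod name

# Which containers are in the pod, and what are they called?
kubectl get pod "$POD" \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\n"}{end}'

# What capabilities does the "compute" container request?
kubectl get pod "$POD" \
  -o jsonpath='{.spec.containers[?(@.name=="compute")].securityContext.capabilities}'
```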
l
Copy code
securityContext:
    capabilities:
      add:
        - NET_BIND_SERVICE
        - SYS_PTRACE
        - SYS_NICE
nope, can I just edit it live?
also it's not a privileged container
I tried editing the launcher container and adding it, but that didn't validate
I'll try your KubeVirt workaround
f
Did you deploy my yaml?
It will automatically add this capability.
l
Doing that now
f
To every virt-launcher pod.
Sure, after that simply restart the vm.
l
Hey that worked!
Copy code
capabilities:
      add:
        - NET_BIND_SERVICE
        - SYS_PTRACE
        - SYS_NICE
        - SYS_RESOURCE
Pod is running
VM is running, I mean
f
The "correct" way to do it is to add a condition in the KubeVirt renderer that checks whether the VMI contains host devices and appends that capability to the pod.
l
I can do that, I'll go in and see if I can see the PCI device
f
It's gonna be there 😎 !
l
Good news:
Copy code
tobi@pcitest4:~$ diff before.txt after.txt 
9a10,11
> 00:02.7 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
> 00:03.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
18c20,22
< 06:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)
---
> 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
> 07:00.0 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
> 08:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)
I can see my NVIDIA device in the VM!
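To go one step beyond the plain lspci diff, inside the guest you can also check which kernel driver, if any, has bound to the NVIDIA functions:

```shell
# -nnk shows vendor:device IDs plus the bound kernel driver; -d 10de:
# restricts output to NVIDIA devices (vendor ID from the diff above).
lspci -nnk -d 10de:
```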
f
🎉
l
Okay, driver time
You rock elias!
inaccel 1
Hey we could also add the capability using the UI too
f
Unfortunately there is no way to add it through the VMI, which is the smallest KubeVirt object we can manipulate. That's why I had to build a mutator that does it under the hood, in the rendered pod.
I think the best way to do this is to resolve it upstream on KubeVirt.
l
Got it, I can incorporate the mutator into my pcidevices controller and then ship it tomorrow. Then I'll start working on the upstream fix
that screenshot was there from earlier, my Slack crashed
I was thinking if we could edit the VMI then we might be able to mutate it from the UI
it's just changing the status though, right?
f
We can't do it on the VMI. I think we should first try bumping KubeVirt (v0.57.0). It may use a newer libvirt version that doesn't need that CAP. That has happened before.
Otherwise, if you could help me raise an issue at KubeVirt first, I have almost prepared the PR that fixes it!
l
Oh nice! How can I help?
KubeVirt 0.57 changelog doesn't mention libvirt
f
I think both v0.54.x and v0.57.x use libvirt v8.0.0, so they should probably behave the same, but let me check this first. I'll keep you posted.
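One way to check the shipped libvirt directly rather than via changelogs (pod name illustrative; this assumes libvirtd is on PATH in the compute container, which it normally is since virt-launcher runs it there):

```shell
# Report the libvirt daemon version bundled in the virt-launcher image.
kubectl exec virt-launcher-pcitest-abcde -c compute -- libvirtd --version
```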
l
I just looked through a bunch of 0.57 PRs and found no hint of a version bump in libvirt
f
What I don't get is why when I deploy kubevirt on minikube, pci passthrough works without the hack.
l
what does the minikube pods' securityContext look like?
f
It's identical (without the CAP_SYS_RESOURCE)!
⁉️ 1
l
does minikube use Docker for its container engine?
f
yes, that's a core difference
l
Maybe there's some convenience setting that is giving the process that capability?
f
if it does, it happens under the hood; it escalates somehow
FYI: I tried KubeVirt v0.49.0 on both k3s and rke2, and PCI passthrough works normally (without the SYS_RESOURCE capability hack). I'm now looking for OS differences (AppArmor, etc.).
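A few things worth diffing between the working minikube/Docker nodes and the failing k3s/rke2 ones. These commands are a sketch; some need root and not every tool exists on every distro:

```shell
# AppArmor: which profiles are loaded and in enforce mode?
sudo aa-status 2>/dev/null | head

# Default locked-memory limit for new processes on the host.
ulimit -l

# Security options the container runtime applies by default.
docker info --format '{{.SecurityOptions}}' 2>/dev/null
sudo crictl info 2>/dev/null | grep -iE 'apparmor|seccomp'
```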
l
Could it be something in SLES?
it's the base OS image in Harvester