Rancher Users #harvester-dev

# harvester-dev

adamant-kite-43734

09/12/2022, 5:54 PM

This message was deleted.

limited-breakfast-50094

09/12/2022, 5:54 PM

Posting here

future-address-23425

09/12/2022, 5:54 PM

Sure

limited-breakfast-50094

09/12/2022, 6:27 PM

Still getting my VM set up so it might be a little while

future-address-23425

09/12/2022, 6:32 PM

No problem! Let me know if this works for you, too. I need to check whether the issue persists on KubeVirt 0.56.0 (the current latest) and figure out why it doesn't affect minikube deployments. I will probably put together a PR on KubeVirt.

👍 1

limited-breakfast-50094

09/14/2022, 5:46 PM

Okay, I got my KubeVirt set up. The VM is getting assigned to the node with the GPU, but I've still got some config issues. I'm documenting everything and I'll let you know as soon as I try your workaround.

👌🏼 1

limited-breakfast-50094

09/14/2022, 7:03 PM

I booted up a VM, then shut it down. Then I added:

hostDevices:
  - deviceName: nvidia.com/GeForce1060
    name: gpu1

to the `spec.domain.devices` part of the VM's YAML config.
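
For reference, the edit described above can be sketched in Python as a helper that places a host device entry under `spec.domain.devices.hostDevices`. The function name `add_host_device` and the plain-dict representation are illustrative assumptions, not part of any KubeVirt client. On a VirtualMachine object (as opposed to a VMI) the same block sits under `spec.template.spec.domain.devices`.

```python
import copy


def add_host_device(vmi: dict, device_name: str, name: str) -> dict:
    """Return a copy of a VMI-style dict with one hostDevices entry appended."""
    out = copy.deepcopy(vmi)
    devices = (
        out.setdefault("spec", {})
        .setdefault("domain", {})
        .setdefault("devices", {})
    )
    host_devices = devices.setdefault("hostDevices", [])
    # Skip the append if an entry with the same logical name already exists.
    if not any(d.get("name") == name for d in host_devices):
        host_devices.append({"deviceName": device_name, "name": name})
    return out


# Hypothetical minimal VMI; real objects carry many more fields.
vmi = {"spec": {"domain": {"devices": {"disks": []}}}}
patched = add_host_device(vmi, "nvidia.com/GeForce1060", "gpu1")
print(patched["spec"]["domain"]["devices"]["hostDevices"])
```

The helper copies the input rather than mutating it, so the original object can still be compared against the patched one.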

limited-breakfast-50094

09/14/2022, 7:03 PM

When I booted up I got this error: failed to setup container for group 14: No available IOMMU models

limited-breakfast-50094

09/14/2022, 7:10 PM

ohh, I wonder if it's the GPU/Audio thing

limited-breakfast-50094

09/14/2022, 7:10 PM

I passed through my VGA device and audio card, so I need to tell KubeVirt to allow the audio device to be passed through as well.

future-address-23425

09/14/2022, 7:10 PM

Do you use the NVIDIA KubeVirt device plugin?

limited-breakfast-50094

09/14/2022, 7:11 PM

I have externalResourceProvider set to false

future-address-23425

09/14/2022, 7:14 PM

OK, it makes sense if your devices have a single bus address. That's not always the case: some FPGA boards, for example, expose their management and user PFs (physical functions) separately.

limited-breakfast-50094

09/14/2022, 7:18 PM

Yeah, I'll have to make some tutorials for certain hardware and add troubleshooting docs

limited-breakfast-50094

09/14/2022, 7:26 PM

Still getting the "No available IOMMU models" error. I don't have the `vfio_iommu_type1` module loaded.
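
A quick way to confirm which VFIO modules are loaded on the node is to look at `/proc/modules`. The sketch below parses text in that format. The list in `REQUIRED` is an assumption for illustration, and modules compiled into the kernel do not show up in `/proc/modules`, so a "missing" result is only a hint.

```python
# Assumed set of modules for PCI passthrough; adjust for the kernel in use.
REQUIRED = ("vfio", "vfio_pci", "vfio_iommu_type1")


def missing_modules(proc_modules_text: str, required=REQUIRED) -> list:
    """Return the required module names absent from /proc/modules-style text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


# Hypothetical sample; on a real node use open("/proc/modules").read() instead.
sample = (
    "vfio_pci 16384 0 - Live 0x0000000000000000\n"
    "vfio 45056 1 vfio_pci, Live 0x0000000000000000\n"
)
print(missing_modules(sample))
```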

future-address-23425

09/14/2022, 7:27 PM

So, the hack worked right?

future-address-23425

09/14/2022, 7:28 PM

I'm not sure about that new issue, but I guess you can try passing through something simple first, e.g. the USB controller.

future-address-23425

09/14/2022, 7:29 PM

It should be loaded together with `vfio-pci`.

limited-breakfast-50094

09/14/2022, 7:30 PM

Yeah, although GPU passthrough is the thing all our customers want, so I want to work through a real GPU example before I go forward. After loading the vfio_iommu_type1 module, I have a different error (progress?):

failed to setup container for group 14: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x55ead073e260, 0x100000000, 0x79c00000, 0x7f64e2200000) = -12 (Cannot allocate memory)')"
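
To make the error above easier to read, here is a small Python sketch that pulls the numbers out of the `vfio_dma_map(...)` text. The argument order (container, IOVA, size, host address) is an assumption based on how QEMU appears to format this message. The mapping size works out to roughly 1948 MiB starting at the 4 GiB mark, and -12 is ENOMEM. One common cause of ENOMEM on a VFIO DMA mapping is hitting the process's locked-memory limit, although nothing in the error itself confirms that.

```python
import errno
import re

ERR = (
    "failed to setup container for group 14: memory listener initialization "
    "failed: Region pc.ram: vfio_dma_map(0x55ead073e260, 0x100000000, "
    "0x79c00000, 0x7f64e2200000) = -12 (Cannot allocate memory)"
)

PATTERN = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), (0x[0-9a-f]+), (0x[0-9a-f]+)\)"
    r" = (-?\d+)"
)


def decode(message: str) -> dict:
    """Extract IOVA, mapping size and errno name from a vfio_dma_map error."""
    m = PATTERN.search(message)
    if m is None:
        raise ValueError("no vfio_dma_map call found")
    # Assumed order: container pointer, IOVA, size, host virtual address.
    _container, iova, size, _vaddr = (int(g, 16) for g in m.groups()[:4])
    code = -int(m.group(5))
    return {
        "iova_gib": iova / 2**30,
        "size_mib": size / 2**20,
        "errno": errno.errorcode.get(code, str(code)),
    }


info = decode(ERR)
print(info)
```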

limited-breakfast-50094

09/14/2022, 7:30 PM

is that a capability problem?

limited-breakfast-50094

09/14/2022, 7:31 PM

another manifestation of the no capability for SYS_RESOURCE?

future-address-23425

09/14/2022, 7:34 PM

Please check the virt-launcher pod: how many containers are in there? Is the "compute" container the one that logs this? Does it have the SYS_RESOURCE capability?

limited-breakfast-50094

09/14/2022, 7:36 PM

securityContext:
  capabilities:
    add:
      - NET_BIND_SERVICE
      - SYS_PTRACE
      - SYS_NICE

limited-breakfast-50094

09/14/2022, 7:36 PM

nope, can I just edit it live?
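
The check requested above (which containers in the virt-launcher pod carry SYS_RESOURCE) can be sketched in Python against a pod manifest loaded as a dict. The helper name and the `sidecar` container are hypothetical. The `compute` capabilities are the ones pasted above.

```python
def containers_with_capability(pod: dict, capability: str) -> dict:
    """Map each container name to whether `capability` is in its add list."""
    result = {}
    for container in pod.get("spec", {}).get("containers", []):
        security = container.get("securityContext") or {}
        caps = security.get("capabilities") or {}
        result[container["name"]] = capability in (caps.get("add") or [])
    return result


pod = {
    "spec": {
        "containers": [
            {
                "name": "compute",
                "securityContext": {
                    "capabilities": {
                        "add": ["NET_BIND_SERVICE", "SYS_PTRACE", "SYS_NICE"]
                    }
                },
            },
            {"name": "sidecar"},  # hypothetical second container
        ]
    }
}
print(containers_with_capability(pod, "SYS_RESOURCE"))
```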

limited-breakfast-50094

09/14/2022, 7:38 PM

Also, it's not a privileged container.

limited-breakfast-50094

09/14/2022, 7:40 PM

I tried editing the launcher container and adding it, but that didn't validate.

limited-breakfast-50094

09/14/2022, 7:40 PM

I'll try your KubeVirt workaround.

future-address-23425

09/14/2022, 7:40 PM

Did you deploy my yaml?

future-address-23425

09/14/2022, 7:40 PM

It will automatically add this capability.

limited-breakfast-50094

09/14/2022, 7:40 PM

Doing that now

future-address-23425

09/14/2022, 7:41 PM

To every virt-launcher pod.

future-address-23425

09/14/2022, 7:41 PM

Sure, after that simply restart the VM.

limited-breakfast-50094

09/14/2022, 7:44 PM

Hey that worked!

limited-breakfast-50094

09/14/2022, 7:44 PM

capabilities:
  add:
    - NET_BIND_SERVICE
    - SYS_PTRACE
    - SYS_NICE
    - SYS_RESOURCE

Pod is running

limited-breakfast-50094

09/14/2022, 7:45 PM

VM is running, I mean

future-address-23425

09/14/2022, 7:45 PM

The "correct" way to do it is to add a condition in the KubeVirt renderer that checks whether the VMI contains host devices and, if so, appends that capability to the pod.
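
The condition described above could look roughly like the following. KubeVirt itself is written in Go, so this is only an illustration of the logic in Python, not KubeVirt code. The function names are hypothetical, and `BASE_CAPS` is taken from the securityContext pasted earlier in the thread rather than from the KubeVirt source.

```python
# Capabilities seen on the compute container earlier in the thread.
BASE_CAPS = ["NET_BIND_SERVICE", "SYS_PTRACE", "SYS_NICE"]


def has_host_devices(vmi: dict) -> bool:
    """True if the VMI spec lists at least one entry under hostDevices."""
    devices = vmi.get("spec", {}).get("domain", {}).get("devices", {})
    return bool(devices.get("hostDevices"))


def compute_capabilities(vmi: dict) -> list:
    """Return the capability list a renderer might give the compute container."""
    caps = list(BASE_CAPS)
    if has_host_devices(vmi):
        # Only VMIs that pass through host devices get the extra capability.
        caps.append("SYS_RESOURCE")
    return caps


plain = {"spec": {"domain": {"devices": {}}}}
gpu = {
    "spec": {
        "domain": {
            "devices": {
                "hostDevices": [
                    {"deviceName": "nvidia.com/GeForce1060", "name": "gpu1"}
                ]
            }
        }
    }
}
print(compute_capabilities(plain))
print(compute_capabilities(gpu))
```

Scoping the capability to VMIs with host devices keeps ordinary VMs on the smaller capability set.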

limited-breakfast-50094

09/14/2022, 7:45 PM

I can do that, I'll go in and see if I can see the PCI device

future-address-23425

09/14/2022, 7:46 PM

It's gonna be there 😎 !

limited-breakfast-50094

09/14/2022, 7:46 PM

Good news:

tobi@pcitest4:~$ diff before.txt after.txt 
9a10,11
> 00:02.7 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
> 00:03.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
18c20,22
< 06:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)
---
> 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
> 07:00.0 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
> 08:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)

limited-breakfast-50094

09/14/2022, 7:46 PM

I can see my NVIDIA device in the VM!
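
The same before/after comparison can be automated. This Python sketch scans the added lines of the diff for `lspci -nn` style entries and keeps those whose PCI vendor ID is `10de` (NVIDIA). The regular expression and function name are illustrative assumptions about the output format shown above.

```python
import re

# addr, class name, [class id]: description [vendor:device]
LINE = re.compile(
    r"^(?P<addr>\S+) (?P<cls>.+?) \[(?P<cls_id>[0-9a-f]{4})\]: "
    r"(?P<desc>.+) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)

DIFF = """9a10,11
> 00:02.7 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
> 00:03.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCIe Root port [1b36:000c]
18c20,22
< 06:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)
---
> 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
> 07:00.0 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
> 08:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1045] (rev 01)
"""


def added_devices(diff_text: str, vendor: str = "10de") -> list:
    """Return (address, vendor:device) for added diff lines matching `vendor`."""
    found = []
    for line in diff_text.splitlines():
        if not line.startswith("> "):
            continue  # only lines present in after.txt
        m = LINE.match(line[2:])
        if m and m.group("vendor") == vendor:
            found.append((m.group("addr"), f"{m.group('vendor')}:{m.group('device')}"))
    return found


print(added_devices(DIFF))
```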

future-address-23425

09/14/2022, 7:46 PM

🎉

limited-breakfast-50094

09/14/2022, 7:46 PM

Okay, driver time

limited-breakfast-50094

09/14/2022, 7:46 PM

You rock elias!

limited-breakfast-50094

09/14/2022, 8:12 PM

Hey, we could also add the capability using the UI.

future-address-23425

09/14/2022, 9:21 PM

Unfortunately there is no way to add it through the VMI, which is the smallest KubeVirt object that we can manipulate. That's why I had to build a mutator that does this under the hood, on the rendered pod.
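
The mutator itself is not shown in this thread. As a rough idea of what such a component does, the Python sketch below builds a JSON Patch that adds SYS_RESOURCE to the `compute` container of a rendered pod and wraps it in an `admission.k8s.io/v1` AdmissionReview response. The function names are hypothetical, and a real webhook would also need to limit itself to virt-launcher pods and serve over TLS.

```python
import base64
import json

CAP = "SYS_RESOURCE"


def sys_resource_patch(pod: dict, container_name: str = "compute") -> list:
    """Build JSON Patch ops that add CAP to one container's capabilities."""
    ops = []
    for i, c in enumerate(pod.get("spec", {}).get("containers", [])):
        if c.get("name") != container_name:
            continue
        base = f"/spec/containers/{i}/securityContext"
        sc = c.get("securityContext")
        caps = (sc or {}).get("capabilities")
        add = (caps or {}).get("add")
        if sc is None:
            ops.append({"op": "add", "path": base, "value": {"capabilities": {"add": [CAP]}}})
        elif caps is None:
            ops.append({"op": "add", "path": base + "/capabilities", "value": {"add": [CAP]}})
        elif add is None:
            ops.append({"op": "add", "path": base + "/capabilities/add", "value": [CAP]})
        elif CAP not in add:
            # "/-" appends to the end of an existing array (RFC 6902).
            ops.append({"op": "add", "path": base + "/capabilities/add/-", "value": CAP})
    return ops


def admission_response(uid: str, ops: list) -> dict:
    """Wrap the patch in an AdmissionReview response body."""
    response = {"uid": uid, "allowed": True}
    if ops:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(ops).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


pod = {"spec": {"containers": [{
    "name": "compute",
    "securityContext": {"capabilities": {"add": ["NET_BIND_SERVICE", "SYS_PTRACE", "SYS_NICE"]}},
}]}}
ops = sys_resource_patch(pod)
print(ops)
```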

future-address-23425

09/14/2022, 9:22 PM

I think the best way to do this is to resolve it upstream on KubeVirt.

limited-breakfast-50094

09/15/2022, 2:28 AM

Got it, I can incorporate the mutator into my pcidevices controller and then ship it tomorrow. Then I'll start working on the upstream fix

limited-breakfast-50094

09/15/2022, 2:28 AM

that screenshot was there from earlier, my slack crashed

limited-breakfast-50094

09/15/2022, 2:28 AM

I was thinking that if we could edit the VMI, we might be able to mutate it from the UI.

limited-breakfast-50094

09/15/2022, 2:29 AM

It's just changing the status though, right?

future-address-23425

09/15/2022, 7:40 AM

We can't do it on the VMI. I think we should first try bumping KubeVirt (v0.57.0). It may use a newer libvirt version that doesn't need that capability. This has happened before.

future-address-23425

09/15/2022, 7:41 AM

Otherwise, if you could help me raise an issue at KubeVirt first, I have almost prepared the PR that fixes it!

limited-breakfast-50094

09/15/2022, 8:57 PM

Oh nice! How can I help?

limited-breakfast-50094

09/15/2022, 9:04 PM

KubeVirt 0.57 changelog doesn't mention libvirt

future-address-23425

09/15/2022, 9:48 PM

I think both v0.54.x and v0.57.x use libvirt v8.0.0, so they should probably behave the same, but let me check this first. I'll keep you posted.

limited-breakfast-50094

09/15/2022, 9:50 PM

I just looked through a bunch of 0.57 PRs and found no hint of a version bump in libvirt

future-address-23425

09/15/2022, 9:53 PM

What I don't get is why PCI passthrough works without the hack when I deploy KubeVirt on minikube.

limited-breakfast-50094

09/15/2022, 9:58 PM

What does the minikube pods' securityContext look like?

future-address-23425

09/15/2022, 10:00 PM

It's identical (without the CAP_SYS_RESOURCE)!

⁉️ 1

limited-breakfast-50094

09/15/2022, 10:31 PM

Does minikube use Docker for its container engine?

future-address-23425

09/15/2022, 10:32 PM

yes, that's a core difference

limited-breakfast-50094

09/15/2022, 10:32 PM

Maybe there's some convenience setting that is giving the process that capability?

future-address-23425

09/15/2022, 10:33 PM

If it does, it does it under the hood; it escalates somehow.

future-address-23425

09/16/2022, 12:00 PM

FYI: I tried KubeVirt v0.49.0 on both K3s and RKE2, and PCI passthrough works normally (without the SYS_RESOURCE capability hack). I'm now looking for OS differences (AppArmor, etc.).

limited-breakfast-50094

09/16/2022, 4:21 PM

Could it be something in SLES?

limited-breakfast-50094

09/16/2022, 4:21 PM

It's the base OS image in Harvester.
