# harvester
a
This message was deleted.
w
Are you asking which OS Harvester uses as its base? If so, it's a customised version of SLE Micro
b
No, I was asking what OS was being used on the VM that was having errors.
t
ethtool doesn't work for this. The max inside the VM is 256 because qemu/kvm sets the default limit to 256. To bump it up, you need to modify the qemu command line to add rx_queue_size and tx_queue_size to the device args by putting the config I referenced in the libvirt XML.
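A minimal sketch of what that looks like in the libvirt domain XML, assuming a bridge-backed virtio NIC (the bridge name and queue count are placeholders):

```xml
<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
  <!-- defaults are 256; both attributes accept values up to 1024.
       tx_queue_size is reported to only take effect with vhost-user backends. -->
  <driver name='vhost' queues='4' rx_queue_size='1024' tx_queue_size='1024'/>
</interface>
```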
I already bumped up the limit on the physical interfaces, but that is not where the packets are being dropped. I have a script which monitors the interfaces on the host OS side and inside the container. It checks the physical, veth, net*, and tap* interfaces, and the taps are where the packets are being dropped because the VM can't respond fast enough sometimes. Bumping up the tx/rx queue sizes is supposed to help with that, but I don't see any way I can accomplish that with Harvester besides using my own packaged KubeVirt. I am going to take another look at macvtap today; maybe that is a solution to the issue I am running into. I can't use SR-IOV due to the issue with PCI devices being assigned incorrectly by KubeVirt, and some of my VM images don't support it anyway.
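Roughly what such a monitor does, as a sketch (the interface name patterns and reading the /sys counters directly are assumptions, not the exact script):

```sh
#!/bin/sh
# Print RX/TX drop counters for the physical, veth, net* and tap* interfaces.
# Run on the host; for the tap inside the virt-launcher container, run it via
# kubectl exec / nsenter into that network namespace.
for dev in /sys/class/net/eth* /sys/class/net/veth* /sys/class/net/net* /sys/class/net/tap*; do
  [ -e "$dev/statistics/rx_dropped" ] || continue
  printf '%-16s rx_dropped=%-10s tx_dropped=%s\n' \
    "${dev##*/}" \
    "$(cat "$dev/statistics/rx_dropped")" \
    "$(cat "$dev/statistics/tx_dropped")"
done
```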
b
What kind of traffic are we talking about here?
We have virtual machines doing 2Gbps+ in normal traffic.
t
This is a late reply, but the issue in my case is a device with an Atom processor and the number of packets per second. Large packets and throughput aren't as much of an issue; I start experiencing drops around 50-55K+ pps. At full packet size that is nearly maxing out the 1G pipe, but the test case I have to satisfy uses a smaller average packet size and is more concerned with the point at which packets start dropping.
b
Linux OS? What does your sysctl look like? Are you using 9000 MTU on your layer 3 interfaces? What network card? Are you using all the offloading capabilities?
t
Packets are being dropped on the virtual side, on the tap inside the container that connects to the virtio device inside the VM. MTU doesn't help; these are small packets, around 400 bytes.
Fedora 38 test VMs. Offloading on the physical side (Intel I350) is on, but again, the packet drops are on the virtual side, not the physical. The VM isn't processing the packets fast enough; I believe it is bottlenecked by the Atom C3958's speed, based on a combination of ksoftirqd, %si, and %st. Bumping up the rx queue size to 1024 has given about a 10-15% increase; I think I can push around 60-65K pps now with drop rates similar to what I saw before at the lower pps. I'm starting to look at whether macvtap is an option to reduce the CPU utilization. This is also UDP traffic, tested both with a Spirent and with a shell script that spawns multiple parallel iperf3 tests to create multiple flows. Single flows max out a single core quicker because all traffic goes through a single virtio multiqueue thread and the single ksoftirqd inside the VM consumes 99% of a core. With multiple flows, the load is spread across the multiple virtio multiqueue threads and multiple ksoftirqd threads inside the VM.
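The multi-flow generator is essentially this (addresses, ports, rate, packet length, and flow count are placeholders, not the exact script):

```sh
#!/bin/sh
# Spawn several parallel iperf3 UDP clients so the flows hash across the
# virtio multiqueue queues and the guest's ksoftirqd threads.
TARGET=192.0.2.10      # VM under test (example address)
FLOWS=4                # e.g. one flow per guest queue/vCPU
for i in $(seq 1 "$FLOWS"); do
  # -l 400 matches the ~400-byte packets mentioned above; each flow needs
  # a matching "iperf3 -s -p <port>" listener inside the VM.
  iperf3 -c "$TARGET" -p $((5200 + i)) -u -b 200M -l 400 -t 60 &
done
wait
```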
The 1024 rx queue would definitely help if this were bursty traffic, but this is a load test I have been given, and I have to see how much performance I can get out of this box compared to a previous solution running on the exact same hardware. Even if I have to use something else like macvtap to get closer to the performance number I am trying to hit, I would likely still leave the config in place to use the larger rx queue to handle bursts better.
b
Have you increased the txqueuelen of the interface?
What does ethtool -l <interface> say?
Have you tried enabling zero copy?
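If "zero copy" here means vhost_net's TX zero-copy (an assumption about which knob is meant), it's a host-side module parameter along these lines:

```sh
# Check / enable vhost_net TX zero-copy on the host. The parameter has to be
# set at module load time, so the reload fails while VMs are still using it.
cat /sys/module/vhost_net/parameters/experimental_zcopytx
modprobe -r vhost_net && modprobe vhost_net experimental_zcopytx=1
```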
t
ethtool -l inside the VM shows 4 combined since this is a quad-core VM. I am limited in what changes I can make inside the VM due to the test requirements; I'm testing with Fedora mainly while tuning the system, but the real test uses a VNF appliance. Early on, I bumped up the txqueuelen on the physicals, before I tracked down that the packet drops were occurring on the tap that is inside the container (the tap faces the VM). I did enable zero copy on the host OS side, but that didn't help my pps because the packets are being dropped on the virtual side. I have run the VM both on dedicated cores using CPUManager and not dedicated, to allow the multiqueue threads to move around. In the non-dedicated test, other processes such as kube-apiserver spinning up would trigger packet loss, so I'm running dedicated now. I've tested with IRQs/interrupts in the host OS not dedicated and also pinned to a dedicated core that I isolated from Kubernetes use by excluding it from the defaultCpuSet, disabling that core temporarily via 'chcpu -d 15' so that kubelet would start without complaining that the default set didn't include the core.
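The core-isolation workaround described above, sketched as commands (the core number is the example from the message; how kubelet is started on the node is left out):

```sh
chcpu -d 15     # take CPU 15 offline before (re)starting kubelet, so its default
                # cpuset doesn't include the core being reserved for IRQs
# ... start kubelet / the node service here ...
chcpu -e 15     # bring the core back online afterwards and pin the NIC IRQs to it
```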
b
Found this:
There is no "ring buffer" of packets on a virtio-net device. This means that if the TUN/TAP device's TX queue fills up because the VM is not receiving (either fast enough or at all) then there is nowhere for new packets to go, and the hypervisor sees TX loss on the tap. If you notice TX loss on a TUN/TAP, increase the tap txqueuelen to avoid that, similar to increasing the RX ring buffer to stop receive loss on a physical NIC.
What do you have for receive settings on these sysctls?
net.core.rmem_max = 2147483647
net.core.wmem_max = 2147483647
net.core.netdev_max_backlog = 2500
net.core.somaxconn = 65000
net.core.default_qdisc = fq
sysctl -p | grep net.core.
So it seems: increase the rx and tx queues on the physical interface to their max, set txqueuelen on the tap interface to 10000, then make sure you have enough UDP buffer in the VM: net.core.rmem_max.
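Put together, that suggestion amounts to something like this (interface names and ring sizes are examples; check `ethtool -g` for the real maximums):

```sh
ethtool -G eth0 rx 4096 tx 4096            # physical NIC ring buffers up to their max
ip link set dev tap1 txqueuelen 10000      # tap facing the VM
sysctl -w net.core.rmem_max=2147483647     # UDP receive buffer ceiling inside the VM
```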
t
Thanks for the input. That sounds like an old doc. qemu support for rx_queue_size/tx_queue_size on virtio network devices is documented in multiple places; the default is 256, and these options were added to allow bumping the size up to a max of 1024. I just found one place saying tx_queue_size is only good for vhost-user, so that explains why I only saw the rx size increase inside the VM when I changed both in my sidecar script. The default net.core values were much lower; I just bumped them up to the values listed above and reran the test, and there was no real change in pps before the drops. I bumped the sysctl values up in the host OS and tested, then also bumped them up inside the VM and tested again. txqueuelen on the tap can't be changed. The log below is from inside the container; I tried 100/1000/10000 and got the same error.
fedora-05-01:/ # ip link set tap1 txqueuelen 1000 
RTNETLINK answers: Operation not permitted
b
Not sure what else you ever figured out, but I was able to get multiqueue enabled by adding networkInterfaceMultiqueue: true under domain: devices: in the VM spec (see the sketch below).
That increased the ring count from 1 to 8 on all interfaces.
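For anyone searching later, this is roughly where that field sits in a VMI spec (other fields elided; for a VirtualMachine object it lives under spec.template.spec.domain.devices):

```yaml
spec:
  domain:
    devices:
      networkInterfaceMultiqueue: true   # queue count follows the guest vCPU count
```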