Slow VM start/revert when trying to start/revert dozens of VMs in parallel

Hi,

my problem can be described simply: libvirt can't handle starting dozens of VMs at the same time. (Technically it can, but it's really slow.)

We have an AMD machine with 256 logical cores and 1.5 TB of RAM. On that machine there are roughly 200 VMs. Each VM is the same: 8 GB of RAM, 4 vCPUs. Half of them are Win7 x86, the other half Win7 x64. The VMs use qcow2 disk images, and these images reside on a ramdisk (tmpfs).

We use these machines for automatic malware analysis, so our scenario consists of this cycle:

- revert the VM to a running state
- execute a sample inside the VM for ~1-2 minutes
- shut down the VM

Of course, this results in multiple VMs trying to start at the same time. At first, reverts/starts are really fast - a second or two. After about a minute, "revertToSnapshot" suddenly takes 10-15 seconds, which is really unacceptable. For comparison, we're running the same scenario on Proxmox, where revertToSnapshot usually takes about 2 seconds.

A few notes:

- Because of this fast cycle (~2-3 minutes) and because the VMs take 10-15 seconds to start, there are barely more than 25-30 VMs running at once. We would really love to utilise the whole potential of such a beast of a machine and have at least ~100 VMs running at any given time.
- While this is running, the average CPU load isn't higher than 25%, and only about 280 GB of RAM is used. So this is not a limitation of our resources.
- When the framework is running and libvirt is doing its best to start our VMs, I noticed that every libvirt operation suddenly becomes very slow. Even a simple "virsh list [--all]" takes a few seconds to complete, even though it finishes instantly when no VM is running/starting.

I tried to search for this issue, but didn't really find anything besides this presentation: https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Scalability-... However, I couldn't find those commits in your upstream.

Is this a known issue? Or is there some setting I don't know of which would magically make the VMs start faster?

As for steps to reproduce - I don't think anything special is needed. Just try to start/destroy several VMs in a loop. The presentation above even provides a one-liner for that (a parallel variant closer to our actual cycle is sketched below):

```
# For multiple domains:
# while virsh start $vm && virsh destroy $vm; do : ; done
# → ~30s hang ups of the libvirtd main loop
```

Best Regards,
Petr
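P.S. A rough shell sketch of the parallel variant, closer to our actual cycle. The snapshot name "running" is a placeholder, and the domain names just follow our naming scheme; adjust both for your setup:

```
# Drive 30 VMs through the revert/run/destroy cycle concurrently.
# "running" is a placeholder snapshot name; domain names are examples.
for i in $(seq 1 30); do
  (
    vm="win7-x86-1-$((100 + i))"
    virsh snapshot-revert "$vm" running   # revert to a running-state snapshot
    sleep 120                             # sample executes for ~2 minutes
    virsh destroy "$vm"                   # hard shutdown, as in our cycle
  ) &
done
wait
```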

I forgot to mention:

```
$ lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 22.04 LTS
Release:        22.04
Codename:       jammy

$ virsh version
Compiled against library: libvirt 8.0.0
Using library: libvirt 8.0.0
Using API: QEMU 8.0.0
Running hypervisor: QEMU 6.2.0
```
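Also, to quantify the sluggishness of libvirtd itself: a trivial probe like the one below shows the latency of a read-only call while the start/revert storm is running. This is purely illustrative and assumes GNU time is installed:

```
# Sample the latency of a trivial read-only libvirt call every few
# seconds while VMs are being started/reverted. GNU time prints to
# stderr, so the measurement survives the stdout redirect.
while true; do
  /usr/bin/time -f 'virsh list --all took %e s' virsh list --all > /dev/null
  sleep 5
done
```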

On Mon, May 09, 2022 at 06:52:32PM +0000, Petr Beneš wrote:
> Hi,
>
> my problem can be described simply: libvirt can't handle starting dozens of VMs at the same time. (technically, it can, but it's really slow.)
> [...]
> Of course, this results in multiple VMs trying to start at the same time. At first, reverts/starts are really fast - second or two. After about a minute, the "revertToSnapshot" suddenly takes 10-15 seconds, which is really unacceptable. For comparison, we're running the same scenario on Proxmox, where the revertToSnapshot usually takes 2 seconds.
Can you share the XML configuration of one of your guests, assuming they all have the same basic configuration?

As a gut feeling, it sounds to me like it could be initially fast due to utilization of the host I/O cache, but then slows down due to having to flush data to disk / read fresh from disk. This could be the case if the disk cache mode is set to certain values, so the XML config will show us this info.

With regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
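For reference, the cache mode being discussed here is an attribute on the disk `<driver>` element of the domain XML. A minimal illustration - the `cache` value and the paths shown are examples, not a recommendation:

```xml
<disk type='file' device='disk'>
  <!-- cache='none' bypasses the host page cache; 'writeback' and
       'unsafe' rely on it. If the attribute is absent, the
       hypervisor default applies. Values here are illustrative. -->
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/mnt/ramdisk/vms/clones/example/image.qcow2'/>
  <target dev='hda' bus='sata'/>
</disk>
```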

Sure. As for flushing/reading from disk - as I said, all VMs reside on a ramdisk. I'd also like to add that the VMs are "linked clones" with the same underlying base qcow2, which is also on the ramdisk. (A sketch of how such an overlay is created follows the config below.)

```xml
<domain type='kvm'>
  <name>win7-x86-1-101</name>
  <uuid>dc7296c0-228a-44e8-bd47-db019a1f6344</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv mode='custom'>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
    <vmcoreinfo state='on'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>preserve</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/mnt/ramdisk/vms/clones/win7-x86-1-101/image.qcow2'/>
      <target dev='hda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:bf:7b:01'/>
      <source network='cuckoonet1'/>
      <model type='e1000'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='21101' autoport='no' websocket='11101' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <audio id='1' type='none'/>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
    <panic model='hyperv'/>
  </devices>
</domain>
```
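For context, a "linked clone" of this kind is a qcow2 overlay pointing at a shared read-only backing file. A minimal sketch of creating such an overlay - the base image path is illustrative, not our exact layout:

```
# Create a qcow2 overlay backed by the shared base image; writes land
# in the overlay, reads fall through to the base. Both files sit on
# the tmpfs ramdisk. The base path is illustrative.
qemu-img create -f qcow2 \
    -b /mnt/ramdisk/vms/base/win7-x86.qcow2 -F qcow2 \
    /mnt/ramdisk/vms/clones/win7-x86-1-101/image.qcow2
```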