Hello all,
For people on the qemu-devel list: you might want to have a look at the
previous thread about this topic at
http://www.spinics.net/lists/kvm/msg61537.html
but I will try to recap here.
I found that virtual machines on my host boot 2x slower (2x on average;
some parts are probably at least 3x slower) under libvirt compared to a
manual qemu-kvm launch. With the help of Daniel I narrowed it down to the
presence of vhost_net (active by default when launched by libvirt), i.e.
with vhost_net the boot process is *UNIFORMLY* 2x slower.
The problem is still reproducible on my systems, but these are going
into production soon and I am quite busy, so I might not have many more
days left for testing. It might be just next Saturday and Sunday, so if
you can write some of your suggestions here by Saturday that would be
most appreciated.
I have performed some benchmarks now, which I hadn't performed in the
old thread:
- openssl speed -multi 2 rsa (CPU benchmark): shows no performance
  difference with or without vhost_net
- disk benchmarks: show no performance difference with or without
  vhost_net
The disk benchmarks were (both with cache=none and cache=writeback):
- dd streaming read
- dd streaming write
- fio 4k random read, in all of these cases: cache=none; cache=writeback
  with host cache dropped before the test; cache=writeback with all fio
  data in host cache (measures context switch)
- fio 4k random write
So I couldn't reproduce the problem with any benchmark that came to my mind.
But in the boot process this is very visible.
I'll continue the description below; before that, here are the system
specifications:
---------------------------------------
The host runs kernel 3.0.3 and Qemu-KVM 0.14.1, both vanilla and compiled
by me.
Libvirt is the version in Ubuntu 11.04 Natty, which is 0.8.8-1ubuntu6.5;
I didn't recompile this one.
VM disks are LVs of LVM on MD raid array.
The problem shows identically on both cache=none and cache=writeback.
Aio native.
Physical CPUs are: dual westmere 6-core (12 cores total, + hyperthreading)
2 vCPUs per VM.
All VMs are idle or off except the VM being tested.
Guests are:
- multiple Ubuntu 11.04 Natty 64bit with their 2.6.38-8-virtual kernel:
very minimal Ubuntu installs with debootstrap (not from the Ubuntu installer)
- one Fedora Core 6 32bit with a 32bit 2.6.38-8-virtual kernel + initrd
both taken from Ubuntu Natty 32bit (so I could have virtio). Standard
install (except kernel replaced afterwards).
Always static IP address in all guests
---------------------------------------
All types of guests show this problem, but it is more visible in the FC6
guest because the boot process is MUCH longer than in the
debootstrap-installed ubuntus.
Please note that most of the boot process, at least from a certain point
onwards, appears to the eye uniformly 2x or 3x slower under vhost_net.
By boot process I mean roughly the following, copied by hand from some
screenshots:
Loading default keymap
Setting hostname
Setting up LVM - no volume groups found
checking filesystems... clean ...
remounting root filesystem in read-write mode
mounting local filesystems
enabling local filesystems quotas
enabling /etc/fstab swaps
INIT entering runlevel 3
entering non-interactive startup
Starting sysstat: calling the system activity data collector (sadc)
Starting background readahead
********** from here on, everything, or almost everything, is much
slower
Checking for hardware changes
Bringing up loopback interface
Bringing up interface eth0
starting system logger
starting kernel logger
starting irqbalance
starting portmap
starting nfs statd
starting rpc idmapd
starting system message bus
mounting other filesystems
starting PC/SC smart card daemon (pcscd)
starting hidd ... can't open HIDP control socket : address family not
supported by protocol (this is an error due to backporting a new ubuntu
kernel to FC6)
starting autofs: loading autofs4
starting automount
starting acpi daemon
starting hpiod
starting hpssd
starting cups
starting sshd
starting ntpd
starting sendmail
starting sm-client
starting console mouse services
starting crond
starting xfs
starting anacron
starting atd
starting yum-updatesd
starting Avahi daemon
starting HAL daemon
From the point I marked onwards, most of the entries are services, i.e.
daemons listening on sockets, so I thought that maybe binding to a
socket could be slower under vhost_net. However, putting nc in listening
mode with "nc -l 15000" is instantaneous, so I am not sure.
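The nc test above is judged by eye. For a more precise comparison, something like the following Python sketch could be run in a guest with and without vhost_net. It is only an illustration: it assumes Python 3 is available in the guest, the function name time_bind_listen is made up here, and it binds to port 0 (an ephemeral port chosen by the kernel) rather than 15000 to avoid clashing with running services.

```python
import socket
import time


def time_bind_listen(repeats=100, host="0.0.0.0", port=0):
    """Return the average time in seconds to create, bind and listen on a TCP socket.

    port=0 asks the kernel for any free port (an assumption made for this
    sketch; the manual test in the mail used port 15000).
    """
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen(1)
        # Only socket creation, bind and listen are timed, not the close.
        total += time.perf_counter() - start
        s.close()
    return total / repeats


if __name__ == "__main__":
    avg = time_bind_listen()
    print("average socket+bind+listen: %.6f s" % avg)
```

If the averages are the same in both configurations, the bind itself is probably not the slow part and the delay lies elsewhere in the service startup.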
The shutdown of FC6, where basically the same services as above are torn
down, is *also* much slower with vhost_net.
Thanks for any suggestions
R.