On Wed, 2012-03-07 at 16:07 -0700, Eric Blake wrote:
On 03/07/2012 03:26 PM, Eduardo Habkost wrote:
> Thanks a lot for the explanations, Daniel.
>
> Comments about specific items inline.
>
>>> - How can we make sure there is no confusion between libvirt and Qemu
>>> about the CPU models? For example, what if cpu_map.xml says model
>>> 'Moo' has the flag 'foo' enabled, but Qemu disagrees? How do we
>>> guarantee that libvirt gets exactly what it expects from Qemu when
>>> it asks for a CPU model? We have "-cpu ?dump" today, but it's not
>>> the best interface we could have. Is anything in particular missing
>>> from the Qemu<->libvirt interface that would help with that?
>
> So, it looks like either I am missing something on my tests or libvirt
> is _not_ probing the Qemu CPU model definitions to make sure libvirt
> gets all the features it expects.
>
> Also, I would like to ask if you have suggestions to implement
> the equivalent of "-cpu ?dump" in a more friendly and extensible way.
> Would a QMP command be a good alternative? Would a command-line option
> with json output be good enough?
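A JSON dump along those lines would let libvirt check cpu_map.xml against
qemu mechanically. A rough sketch in Python, with an invented JSON shape
(the field names "name" and "features" are illustrative only, not an
actual qemu output format):

```python
import json

# Hypothetical machine-readable CPU model dump; the shape is invented
# for illustration, not what qemu emits today.
dump = json.loads("""
[
  {"name": "Moo",    "features": ["foo", "sse2"]},
  {"name": "qemu64", "features": ["sse2", "syscall"]}
]
""")

def model_has_feature(models, model, feature):
    """Check whether a given CPU model advertises a feature."""
    for m in models:
        if m["name"] == model:
            return feature in m["features"]
    raise KeyError(model)

print(model_has_feature(dump, "Moo", "foo"))  # prints True
```

With output like this, libvirt could diff its own expectations against
qemu's per model instead of trusting cpu_map.xml blindly.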
I'm not sure where we are using "-cpu ?dump", but it sounds like we
should be.
>
> (Do we have any case of capability-querying being made using QMP before
> starting any actual VM, today?)
Right now, we have two levels of queries - the 'qemu -help' and 'qemu
-device ?' output is gathered up front (we really need to patch things
to cache that, rather than repeating it for every VM start).
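The up-front probing amounts to running each binary once and scraping the
text output. Roughly, in Python (a simplified illustration -- libvirt's
real probing is C code and handles many more option forms than this
regex does):

```python
import re
import subprocess

def parse_help_flags(help_text):
    """Extract option names ('-foo ...') from `-help` style output."""
    return set(re.findall(r"^-(\S+)", help_text, flags=re.MULTILINE))

def probe_qemu(binary):
    """Run `<binary> -help` once and parse it. This result should be
    cached per binary rather than re-run for every VM start."""
    out = subprocess.run([binary, "-help"], capture_output=True,
                         text=True, check=False).stdout
    return parse_help_flags(out)

sample = "-cpu model     select CPU\n-device driver  add a device\n"
print(sorted(parse_help_flags(sample)))  # prints ['cpu', 'device']
```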
Eric:
In addition to VM start, the libvirt qemu driver also runs both the
32-bit and 64-bit qemu binaries three times each to fetch capabilities;
this appears to happen when fetching VM state. I noticed it on an
openstack/nova compute node that queries VM state periodically and
seemed to be taking a long time. stracing libvirtd during these queries
showed this sequence for each query:
6461 17:15:25.269464 execve("/usr/bin/qemu", ["/usr/bin/qemu", "-cpu", "?"], [/* 2 vars */]) = 0
6462 17:15:25.335300 execve("/usr/bin/qemu", ["/usr/bin/qemu", "-help"], [/* 2 vars */]) = 0
6463 17:15:25.393786 execve("/usr/bin/qemu", ["/usr/bin/qemu", "-device", "?", "-device", "pci-assign,?", "-device", "virtio-blk-pci,?"], [/* 2 vars */]) = 0
6466 17:15:25.841086 execve("/usr/bin/qemu-system-x86_64", ["/usr/bin/qemu-system-x86_64", "-cpu", "?"], [/* 2 vars */]) = 0
6468 17:15:25.906746 execve("/usr/bin/qemu-system-x86_64", ["/usr/bin/qemu-system-x86_64", "-help"], [/* 2 vars */]) = 0
6469 17:15:25.980520 execve("/usr/bin/qemu-system-x86_64", ["/usr/bin/qemu-system-x86_64", "-device", "?", "-device", "pci-assign,?", "-device", "virtio-blk-pci,?"], [/* 2 vars */]) = 0
This seems to add about a second per VM running on the host, so the
periodic scan takes a couple of minutes on a heavily loaded host with
several tens of VMs. Not a killer, but we'd like to eliminate it.
I see that libvirt does some level of caching of capabilities, checking
the st_mtime of the binaries to detect changes. I haven't figured out
when that caching comes into effect, but it doesn't prevent the execs
above. So, I created a patch series that caches the results of parsing
the output of these calls, which I will post shortly for RFC. It
eliminates most of these execs. I think it might obviate the existing
capabilities caching, but I'm not sure; I haven't had time to look into
it.
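The mtime-keyed caching described above can be sketched roughly as
follows (a simplified illustration; the real code is C and would need to
worry about races and concurrent callers):

```python
import os

# binary path -> (st_mtime when probed, parsed capabilities)
_cache = {}

def cached_caps(binary, probe):
    """Return cached capabilities for `binary`, re-running the expensive
    probe (exec + parse) only when the binary's mtime changes."""
    mtime = os.stat(binary).st_mtime
    entry = _cache.get(binary)
    if entry is not None and entry[0] == mtime:
        return entry[1]
    caps = probe(binary)           # expensive: execs the binary
    _cache[binary] = (mtime, caps)
    return caps
```

A periodic VM-state scan would then hit the cache on every pass until the
qemu package is actually upgraded.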
Later,
Lee Schermerhorn
HPCS
Then we
start qemu with -S, query QMP, all before starting the guest (qemu -S is
in fact necessary for setting some options that cannot be set in the
current CLI but can be set via the monitor) - but right now that is the
only point where we query QMP capabilities.
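For reference, that QMP query is a line-oriented JSON exchange over a
socket. A minimal client sketch against a qemu started with -S and
-qmp unix:/path,server (qmp_capabilities and query-commands are real QMP
commands; the socket handling here is simplified, with no error handling):

```python
import json
import socket

def qmp_session(sock_path):
    """Negotiate QMP and list the commands this qemu binary supports,
    all before the guest has been started (qemu was launched with -S)."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    f = s.makefile("rw")
    json.loads(f.readline())                 # greeting banner ({"QMP": ...})
    f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    f.flush()
    json.loads(f.readline())                 # {"return": {}}
    f.write(json.dumps({"execute": "query-commands"}) + "\n")
    f.flush()
    reply = json.loads(f.readline())
    s.close()
    return [c["name"] for c in reply["return"]]
```

Probing this way avoids scraping human-readable -help text entirely,
since the replies are structured JSON.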
<snip>