
On Thu, Jul 25, 2013 at 10:15:56AM -0300, Eduardo Habkost wrote:
On Thu, Jul 25, 2013 at 10:45:10AM +0100, Daniel P. Berrange wrote:
On Wed, Jul 24, 2013 at 03:25:19PM -0300, Eduardo Habkost wrote:
Besides the "-cpu host" KVM initialization problem, there is another problem with the current interfaces provided by QEMU:
1) libvirt needs to query data that depends on the chosen machine-type and CPU model.
2) Some machine-type behavior is code and not introspectable data.
   * Luckily most of the data we need in this case should/will be encoded in the compat_props tables.
   * In either case, we don't have an API to query for machine-type compat_props information yet.
3) CPU model behavior will be modelled as CPU class behavior. Like in the machine-type case, some of the CPU-model-specific behavior may be modelled as code, and not introspectable data.
   * However, we may be able to eventually encode most or all of the CPU-model-specific behavior simply as different per-CPU-class property defaults.
   * In either case, we don't have an API for QOM class introspection yet.
But there's something important in this case: the resulting CPUID data for a specific machine-type + CPU-model combination must always be the same, forever. This means libvirt may even use a static table, or cache this information indefinitely.
(Note that I am not talking about "-cpu host", here, but about all the other CPU models)
Hmm, so if the CPU filtering can vary per every single individual machine type, then the approach Jiri started here, of invoking QEMU with machine type set to query the CPU after it was created, is definitely not something we can follow. It is just far too inefficient.
I believe there's some confusion here: we are trying to solve two problems:
1) CPU feature filtering (checking which features are available on a given host)
2) CPU model probing (checking what exactly is going to be available when a given CPU model is used, in case nothing is filtered out)
Yep, what Jiri proposed in the original libvirt thread was just a solution to 1). In seeing that though, I was concerned about how it scales up once we have to deal with 2) as well, which I believe is planned future work.
Item (1) depends on: host CPU capabilities, host kernel capabilities, QEMU capabilities, and the presence of a few QEMU command-line options (e.g. kernel irqchip), but shouldn't depend on the machine-type. It depends on /dev/kvm being open.
Item (2) depends on the machine-type, but is static and must never change in future QEMU versions (if it changes, it is a QEMU bug). It doesn't depend on opening /dev/kvm.
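To make the split between the two items concrete, here is a minimal Python sketch. The machine-type names, the CPU model name and the feature sets are invented placeholders, not real QEMU data. Item (2) is treated as a pure lookup keyed by machine-type and CPU model, and item (1) as a comparison against whatever the host can provide.

```python
# Hypothetical data for illustration only; not taken from QEMU.
MODEL_FEATURES = {
    ("machine-new", "ModelA"): frozenset({"sse2", "ssse3", "x2apic"}),
    ("machine-old", "ModelA"): frozenset({"sse2", "ssse3"}),
}


def probe_model(machine_type, cpu_model):
    """Item (2): static result, depends only on machine-type + CPU model."""
    return MODEL_FEATURES[(machine_type, cpu_model)]


def filter_features(requested, host_supported):
    """Item (1): split requested features into usable and filtered-out ones."""
    host_supported = frozenset(host_supported)
    usable = requested & host_supported
    missing = requested - host_supported
    return usable, missing
```

With these placeholders, a host that lacks "x2apic" would report it as filtered out for ("machine-new", "ModelA"), while the probe result itself stays the same on every host.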
Item (1) can be solved if libvirt does the work itself, by opening /dev/kvm and checking for GET_SUPPORTED_CPUID and checking for QEMU options/capabilities (as long as we document that very carefully). But adding a more specific QMP command that won't require accel=kvm to work may be simpler and better for everybody.
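As a rough illustration of the "libvirt does the work itself" option, the following Python sketch issues KVM_GET_SUPPORTED_CPUID on /dev/kvm. The struct layouts and the ioctl encoding are my reading of <linux/kvm.h> for x86 and should be checked against the kernel headers; the helper names are made up for this example. The raw result is only part of the picture, since QEMU options and capabilities also affect what a guest finally sees.

```python
import fcntl
import os
import struct

# Layouts as I understand them from <linux/kvm.h> (x86); verify before use.
KVMIO = 0xAE
HEADER = struct.Struct("=II")   # struct kvm_cpuid2: nent, padding
ENTRY = struct.Struct("=10I")   # struct kvm_cpuid_entry2, incl. 3 padding words


def iowr(ioc_type, nr, size):
    """Encode _IOWR() using the asm-generic ioctl bit layout."""
    return (3 << 30) | (size << 16) | (ioc_type << 8) | nr


KVM_GET_SUPPORTED_CPUID = iowr(KVMIO, 0x05, HEADER.size)


def parse_cpuid2(buf):
    """Decode a kvm_cpuid2 buffer into a list of per-leaf dicts."""
    nent, _padding = HEADER.unpack_from(buf, 0)
    entries = []
    for i in range(nent):
        fn, idx, flags, eax, ebx, ecx, edx, *_pad = ENTRY.unpack_from(
            buf, HEADER.size + i * ENTRY.size)
        entries.append({"function": fn, "index": idx, "flags": flags,
                        "eax": eax, "ebx": ebx, "ecx": ecx, "edx": edx})
    return entries


def query_supported_cpuid(max_entries=256, path="/dev/kvm"):
    """Ask KVM which CPUID leaves and bits it can expose to guests."""
    buf = bytearray(HEADER.size + max_entries * ENTRY.size)
    HEADER.pack_into(buf, 0, max_entries, 0)
    fd = os.open(path, os.O_RDWR)
    try:
        # The kernel fills the buffer in place and updates nent.
        fcntl.ioctl(fd, KVM_GET_SUPPORTED_CPUID, buf, True)
    finally:
        os.close(fd)
    return parse_cpuid2(buf)
```

Calling query_supported_cpuid() needs read/write access to /dev/kvm, which is exactly the dependency item (1) has and item (2) does not.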
Item (2) may be solved today using a static table and/or caching (so libvirt just needs to query this information once in a lifetime). It can also be solved partially (without machine-type support) in theory if QEMU lets libvirt repeatedly create and destroy CPU objects just to query the resulting feature properties.
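The caching idea could look roughly like the Python sketch below. It assumes a hypothetical probe callable (however the probing ends up being done) and a binary_id value that identifies one particular QEMU build; neither is an existing libvirt or QEMU interface. Because the result for a given machine-type + CPU-model pair must never change, each key is probed at most once.

```python
class CpuModelCache:
    """Cache item (2) results per (QEMU build, machine-type, CPU model)."""

    def __init__(self, probe):
        # probe is a hypothetical callable:
        #   probe(binary_id, machine_type, cpu_model) -> iterable of feature names
        self._probe = probe
        self._cache = {}

    def features(self, binary_id, machine_type, cpu_model):
        key = (binary_id, machine_type, cpu_model)
        if key not in self._cache:
            # Only reached the first time this combination is requested.
            self._cache[key] = frozenset(
                self._probe(binary_id, machine_type, cpu_model))
        return self._cache[key]
```

Including the binary identity in the key means a different build with its own machine types or CPU models gets its own entries instead of reusing stale ones.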
We really don't want to have static tables, since that creates pain in the case where distro vendors create their own custom machine types or CPU models. It would mean libvirt had to record info not only about upstream QEMU, but about every vendor's QEMU builds. Probing the actual binary is the only sensible way here.
...but both problems could be solved very easily using current QEMU interfaces, if libvirt simply executed the QEMU binary more than once. Is "must not run QEMU more than once" a hard requirement? Perfect is the enemy of good. :)
Yes, it is a hard requirement.
I understand that the QEMU code isn't currently structured in a way that lets it easily expose information that varies per machine type, but I don't think we need to solve the entire problem space in a perfectly generic fashion here. Perfect is the enemy of good.
Right. Also, the more important item (item 1) is not affected by machine-types. Host features change every time you run on a new host/kernel, so probing it precisely is very useful, to detect problems earlier (not just at the last moment before starting a VM).
On the other hand, per-machine-type CPU model changes are more rare, and libvirt can still detect unexpected results immediately before the VM is started. (I don't know what libvirt would do in case it detects it, though. Abort? Log a warning?)
We don't want to be running QEMU multiple times during the startup process for a VM, because that adds delays to the startup process. It might not sound like much, but adding a few 100ms to probe CPUs by running QEMU is quite significant for apps like libvirt-sandbox and libguestfs where absolute boot time is important. We used to run QEMU at startup to probe things & we just recently got rid of that delay, so I don't want to re-introduce it again. When starting a VM, we only want to start QEMU once, as the actual instance that is going to run the VM.
If we can get all the CPU feature flag filtering information to be in statically defined data structures, then it seems that it would be pretty straightforward to add a monitor API that takes a CPU model name and machine type name, and returns the list of feature flags, without actually having to initialize the machine type or CPU. It can even just open /dev/kvm & issue the necessary ioctl, without having to initialize the entire KVM CPU subsystem in QEMU.
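A minimal Python sketch of such a lookup, assuming the data really can be expressed statically: per-CPU-model property defaults plus per-machine-type compat_props entries of the form (driver, property, value). The table contents and the function name query_cpu_features are invented for illustration and do not correspond to an existing QEMU monitor command.

```python
# Hypothetical static tables; names and values are placeholders.
CPU_MODEL_DEFAULTS = {
    "ModelA": {"sse2": True, "x2apic": True, "pclmulqdq": True},
}

# Per-machine-type overrides, modelled on (driver, property, value) entries.
COMPAT_PROPS = {
    "machine-new": [],
    "machine-old": [("ModelA", "x2apic", False)],
}


def query_cpu_features(machine_type, cpu_model):
    """Return enabled feature names without creating a machine or CPU."""
    props = dict(CPU_MODEL_DEFAULTS[cpu_model])
    for driver, prop, value in COMPAT_PROPS[machine_type]:
        if driver == cpu_model:
            props[prop] = value
    return sorted(name for name, enabled in props.items() if enabled)
```

Nothing here touches /dev/kvm or instantiates anything, so the answer could be produced by a single running QEMU for any machine-type + CPU-model pair.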
The "without actually having to initialize the machine" part may be complicated, but it may be doable. But depending on the direction QEMU machine-types design is going (I don't know if there are plans to eventually make them more QOM-friendly), the solution accepted by QEMU may be different.
I will suggest this as a topic for the next KVM call. Are you interested in joining the call?
Yes, assuming it doesn't clash with anything else I have scheduled.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|