On Fri, Mar 01, 2013 at 12:02:07 -0300, Eduardo Habkost wrote:
On Fri, Mar 01, 2013 at 02:12:38PM +0100, Jiri Denemark wrote:
> Definitely, we plan to start using "enforce" flag as soon as we have
> better CPU probing interface with QEMU. Since libvirt does not currently
> consult CPU specs with QEMU, some configurations in fact rely on QEMU
> dropping features it can't provide. Of course, that's bad for several
> reasons but we don't want such configurations to suddenly stop working.
> We want to first fix the CPU specs libvirt creates so that we know they
> will work with "enforce".
Also: more important than fixing the CPU definitions from libvirt is to
ask QEMU for host capabilities and CPU model definitions. The whole
point of this is to solve the CPU model data duplication/synchronization
problems between libvirt and QEMU.
Once you are able to query CPU model definitions at runtime, you don't
even need to make cpu_map.xml agree with QEMU. You can simply ask QEMU
what each model looks like, and remove/add features from the command-line
as necessary, so the resulting VM matches what the user asked for.
Right, that's what I had in mind. By libvirt providing correct CPU
definitions (probably better called the right -cpu command line option)
I meant that we need to actually probe QEMU rather than making the
definitions on our own from cpu_map.xml and host CPUID.
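
As a rough illustration of the add/remove step discussed above, the
following Python sketch builds a -cpu value from a model's feature set and
the set the user asked for. The function name build_cpu_arg and the example
feature sets are hypothetical; in practice the model's features would come
from probing QEMU rather than from hard-coded data.

```python
def build_cpu_arg(model, model_features, wanted_features):
    """Return a -cpu value that turns `model` into `wanted_features`.

    model_features:  features the model has according to QEMU (assumed
                     to have been probed elsewhere).
    wanted_features: features the guest CPU should end up with.
    """
    model_features = set(model_features)
    wanted_features = set(wanted_features)

    # Features the user wants but the model lacks are added with "+".
    to_add = sorted(wanted_features - model_features)
    # Features the model has but the user did not ask for are dropped with "-".
    to_remove = sorted(model_features - wanted_features)

    parts = [model]
    parts += ["+" + name for name in to_add]
    parts += ["-" + name for name in to_remove]
    return ",".join(parts)
```

The result is meant to be passed as the argument of -cpu. Sorting the
feature names only keeps the generated command line stable between runs.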
> > Limitation: no proper machine-friendly interface to report which features
> > are missing.
> >
> > Workaround: See "querying for host capabilities" below.
>
> I doubt we will be ready to start using "enforce" before the
> machine-friendly interface is available...
If you query for the "-cpu host" capabilities first and ensure all
features from a CPU model are available, enforce is not supposed to fail.
I understand that a machine-friendly error reporting for "enforce" would
be very useful, but note that if "enforce" fails, it is probably already
too late for libvirt, and that means that what libvirt thinks about host
capabilities and CPU models is already incorrect.
But we still need to know details about each CPU model so that we
can choose the right one. I was trying to say that by the time we have
this probing and can start using enforce, the machine-friendly reporting
could be available as well and the limitation will be gone.
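
A minimal sketch of the pre-check described above, assuming both the
"-cpu host" capabilities and the model's features are already available as
sets of feature names (how they are obtained is not shown, and both function
names are made up for illustration):

```python
def features_missing_on_host(model_features, host_features):
    """Return the model features the host cannot provide, sorted by name."""
    return sorted(set(model_features) - set(host_features))


def enforce_should_succeed(model_features, host_features):
    """True if every feature of the model is present on the host."""
    return not features_missing_on_host(model_features, host_features)
```

If enforce_should_succeed() returns False, the list from
features_missing_on_host() says which features would have to be dropped
from the model, or that a different model has to be chosen.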
> > == Future plans ==
> >
> > It would be interesting to get rid of the requirement for a live QEMU process
> > (with a complete machine being created) to be already running.
>
> Hmm, so is this complete machine needed even for getting CPU models from
> qom-list-types or only for querying exact definitions using
> query-cpu-definitions command?
Maybe "complete machine" isn't the right expression here. What I mean is:
AFAIK, it is not possible to get a QMP monitor without actually having a
machine being created by QEMU (even if it is a machine that will never run).
OK, I thought there was something special needed. We already start QEMU
in such a way that we can communicate with it using QMP monitor. So this
is just a question of using the right machine if we need to know the
details about a given CPU model.
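
For reference, a minimal sketch of such a QMP exchange in Python. It assumes
the monitor is reachable through a pair of text file objects (for example
obtained from a socket with makefile()); the helper names qmp_execute and
list_cpu_models are hypothetical. The greeting, qmp_capabilities and
query-cpu-definitions are part of QMP itself.

```python
import json


def qmp_execute(rfile, wfile, command, arguments=None):
    """Send one QMP command and return the value of its "return" key."""
    request = {"execute": command}
    if arguments:
        request["arguments"] = arguments
    wfile.write(json.dumps(request) + "\n")
    wfile.flush()
    while True:
        line = rfile.readline()
        if not line:
            raise EOFError("QMP connection closed")
        reply = json.loads(line)
        if "event" in reply:
            # Asynchronous event, not the answer to our command.
            continue
        if "error" in reply:
            raise RuntimeError(reply["error"].get("desc", "QMP error"))
        return reply["return"]


def list_cpu_models(rfile, wfile):
    """Negotiate capabilities, then return the CPU model names QEMU knows."""
    greeting = json.loads(rfile.readline())
    if "QMP" not in greeting:
        raise RuntimeError("unexpected QMP greeting")
    # Commands are only accepted after capabilities negotiation.
    qmp_execute(rfile, wfile, "qmp_capabilities")
    models = qmp_execute(rfile, wfile, "query-cpu-definitions")
    return [model["name"] for model in models]
```

This only lists model names; getting the exact feature set of a model still
depends on the machine type QEMU was started with, as discussed above.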
> > = Getting information about CPU models =
> >
> > Requirement: libvirt uses the predefined CPU models from QEMU, but it needs
> > to be able to query for CPU model details, to find out how it can create a
> > VM that matches what was requested by the user.
> >
> > Current problem: libvirt has a copy of the CPU model definitions on its
> > cpu_map.xml file, and the copy can be out of sync in case CPU models in QEMU
> > change. libvirt also assumes that the set of features on each model is always
> > the same on all machine-types, which is not true.
> >
> > Challenge: the resulting CPU features depend on lots of factors, including
> > the machine-type.
> >
> > Workaround: start a paused VM and query for the CPU device information
> > after the CPU was created.
I just noticed another problem here, but this gave me an idea that would
help solve the "enforce" error reporting problem:
Problem: "qemu -machine <M> -cpu <model>" will create CPU objects
where the CPU features are _already_ filtered based on host
capabilities.
Ah, it seems logical now that you mention it :-)
* Using "enforce" wouldn't solve it, because then QEMU would abort, and
QMP would be unavailable.
Solution: we could have a CPU object property like
"removed-features" that would have the list of features that were
disabled because they are not supported by the host (and would make
"enforce" fail).
* This would solve the problem above and also be a machine-friendly
way to check for possible "enforce" errors.
* In other words: instead of "enforce", libvirt could use "check",
and before unpausing the VM (or even starting migration), it should
first check if the "removed-features" property is empty.
Would that work for you?
Yes, that seems like it could work. In fact, it seems much better than
using enforce and trying to deal with aborted QEMU.
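
A sketch of how that check could look on the libvirt side, in Python.
"removed-features" is only the property proposed in this thread and does not
exist yet; the sketch assumes it would be readable with qom-get and would
hold a list of feature names. The qmp callable and the CPU object paths are
placeholders as well.

```python
def removed_features(qmp, cpu_path):
    """Read the proposed "removed-features" property of one CPU object.

    qmp is any callable taking (command, arguments) and returning the
    command's return value; how it talks to QEMU is not shown here.
    """
    value = qmp("qom-get", {"path": cpu_path, "property": "removed-features"})
    return list(value)


def resume_if_unfiltered(qmp, cpu_paths):
    """Unpause the VM only if QEMU did not drop any CPU feature.

    Returns a dict mapping CPU path to its removed features; an empty
    dict means the VM was resumed.
    """
    removed = {}
    for path in cpu_paths:
        features = removed_features(qmp, path)
        if features:
            removed[path] = features
    if not removed:
        qmp("cont", None)  # resume the paused VM
    return removed
```

When the returned dict is not empty, the VM stays paused and the caller can
report exactly which features were filtered, which is the machine-friendly
error reporting "enforce" lacks.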
Jirka