On 4/11/19 12:19 PM, Peter Krempa wrote:
On Thu, Apr 11, 2019 at 11:58:04 +0200, Michal Privoznik wrote:
> On 4/11/19 10:39 AM, Peter Krempa wrote:
>> On Thu, Apr 11, 2019 at 10:25:11 +0200, Michal Privoznik wrote:
>>> This capability tells whether qemu is capable of waking up the
>>> guest from PM suspend.
>>>
>>> Based-on-work-of: Daniel Henrique Barboza <danielhb413(a)gmail.com>
>>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
>>> ---
>>> src/qemu/qemu_capabilities.c | 24 +++++++++++++++++++
>>> src/qemu/qemu_capabilities.h | 3 +++
>>> .../caps_4.0.0.riscv32.replies | 11 +++++++++
>>> .../caps_4.0.0.riscv32.xml | 1 +
>>> .../caps_4.0.0.riscv64.replies | 11 +++++++++
>>> .../caps_4.0.0.riscv64.xml | 1 +
>>> .../caps_4.0.0.x86_64.replies | 11 +++++++++
>>> .../caps_4.0.0.x86_64.xml | 1 +
>>> 8 files changed, 63 insertions(+)
>>>
>>
>> [...]
>>
>>
>>> bool
>>> virQEMUCapsCPUFilterFeatures(const char *name,
>>> void *opaque)
>>> @@ -4373,6 +4395,8 @@ virQEMUCapsInitQMPMonitor(virQEMUCapsPtr qemuCaps,
>>> return -1;
>>> if (virQEMUCapsProbeQMPSEVCapabilities(qemuCaps, mon) < 0)
>>> return -1;
>>> + if (virQEMUCapsProbeQMPCurrentMachine(qemuCaps, mon) < 0)
>>> + return -1;
>>
>> This seems wrong by definition. The function is supposed to query
>> current machine, but the capability lookup code uses 'none' machine
>> type. IIUC the support for wakeup in some cases depends on the presence
>> of ACPI in the guest and thus really can't be cached this way.
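For context, the QMP command under discussion is a simple query; on qemu >= 4.0 the exchange looks roughly like this (field name per the query-current-machine QAPI definition; shown as an illustrative sketch):

```json
{ "execute": "query-current-machine" }

{ "return": { "wakeup-suspend-support": true } }
```

On an older qemu the command is rejected with a CommandNotFound error; on a new one, the answer reflects whatever machine the probing qemu instance was started with ('none' during capability probing), not the machine the guest will actually run with, which is the point above.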
>
> Yep, I was just about to write this but then got sidetracked.
>
> So I see two options here:
> 1) Query if guest is capable of PM suspend at runtime. However, this has to
> be done in a clever way. It would need to be done not only in
> qemuProcessLaunch() (or some subordinate functions), it would need to be
> done in qemuProcessReconnect() too (to cover the case when the user is upgrading
If you are upgrading, the capabilities parsed from the status XML won't
have the bit representing the presence of the query command. This means
that new libvirt would still attempt the suspend ignoring whether it is
supported or not. I don't think it makes sense to re-detect presence of
stuff which was done prior to upgrading.
Well, then a mere upgrade of libvirt won't fix this bug for you. You'd
have to restart the domain to make this bug go away.
> libvirt). For this, qemuMonitorConnect() looks like the best place - it's
> called from both locations and it also already contains some 'runtime
> capabilities' querying. Except not really - the migration capabilities
I was considering whether it's a good idea to keep runtime-only
capabilities in the qemuCaps array. I can find both arguments for it and
against it. The convenient part is that it's automatically stored in the
status XML.
> queried there are stored in @priv->migrationCaps rather than priv->qemuCaps.
> Problem with storing runtime capabilities in priv->qemuCaps is that they
> would get formatted into status XML. And this opens a whole different can of
It is strongly desired to store it in the status XML. That way you don't
have to re-query.
Well, what if you upgrade from ver1 to ver2 which introduces a new
runtime capability? How could it be used if we would not requery caps on
reconnect? Note, I don't mean full capabilities querying which is
expensive. I'm talking about runtime capabilities. For instance this
case. Suppose we have a capability (as proposed here) called
PM_WAKEUP_SUPPORT.
If we don't requery caps at reconnect, then we won't fix this bug.
On the other hand, for this specific bug - if you'd pmsuspend a domain
that is incapable of wakeup, the only thing left for you to do is
destroy + start combo which will fetch you the capability. But that's
not the point.
> worms. For instance, this feature relies on 'query-commands' to see if
> 'query-current-machine' command is available. Well, we can't re-run
> 'query-commands' at runtime because that might change the set of qemuCaps
> used when constructing cmd line.
Why bother? If qemu was started prior to the support being added I don't
see a compelling reason to requery everything. In most cases it would be
also wrong to do so.
> A way around this is to have 'runtime only' capabilities that would not be
> stored in status XML. At some point we will need those, but writing that
> infrastructure is not easy and a bit too much to ask for this small fix.
> Especially, when we have another option:
>
> 2) Issue 'query-current-machine' from qemuDomainPMSuspendForDuration(). In
> every run. This is a bit suboptimal, but how frequently do we expect this
I don't think it's suboptimal. Normally users don't use PM suspend for
the VM so you can argue that this saves one qemu call in all the cases
where PM suspend is never used.
> API to be called? Caching qemu capabilities is there to speed up the
> decision process, which makes sense if we'd be making the decision 1000
> times a second (e.g. building the cmd line). With this approach the
> slowdown would be negligible. Of course, we would need to run the command
> in 'silent' mode, i.e. not report any error if qemu doesn't know it.
Or you can still use the capability bit that says whether the query
command is available.
Nope, you can't. Let's just issue the command in
qemuDomainPMSuspendForDuration(), unconditionally, silently ignore any
errors, and leave runtime caps for another day.
Michal