On Fri, 2019-11-08 at 10:44 +0100, Michal Privoznik wrote:
On 11/7/19 6:29 PM, Jonathon Jongsma wrote:
> On Thu, 2019-11-07 at 14:19 +0100, Michal Privoznik wrote:
> > The qemuDomainGetStatsIOThread() accesses the monitor by calling
> > qemuDomainGetIOThreadsMon(). And it's also marked as "need
> > monitor" in qemuDomainGetStatsWorkers[]. However, it's not
> > checking if acquiring job was successful.
> >
> > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
> > ---
> > src/qemu/qemu_driver.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index d17c18705b..146b4ee319 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -21194,7 +21194,7 @@ static int
> >  qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
> >                             virDomainObjPtr dom,
> >                             virTypedParamListPtr params,
> > -                           unsigned int privflags G_GNUC_UNUSED)
> > +                           unsigned int privflags)
> >  {
> >      qemuDomainObjPrivatePtr priv = dom->privateData;
> >      size_t i;
> > @@ -21202,7 +21202,7 @@ qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
> >      int niothreads;
> >      int ret = -1;
> > 
> > -    if (!virDomainObjIsActive(dom))
> > +    if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom))
> >          return 0;
>
> It seems to me that not having acquired a job would be an error
> condition. Here you return 0, indicating that the query was
> successful.
>
> Also, although this change doesn't really harm anything, it doesn't
> quite seem like the proper fix to me. You're essentially calling a
> function that requires a job, and passing it a flag indicating whether
> we've acquired a job or not. That's a bit of an odd pattern; a flag
> parameter indicating whether the function should actually execute or
> not:
>
> void do_something(bool yes_i_really_mean_it)
>
> :) It seems like the proper fix would be to simply not call the
> function if we haven't acquired a job.
> 
> Of course, you could still keep the above patch if you want to be
> cautious, but perhaps the real fix should be to filter out
> monitor-requiring jobs in qemuDomainGetStats() when we failed to
> acquire a job? Something like:
>
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 42922d2f04..a935ef6d97 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -21456,7 +21456,10 @@ qemuDomainGetStats(virConnectPtr conn,
>          return -1;
> 
>      for (i = 0; qemuDomainGetStatsWorkers[i].func; i++) {
> -        if (stats & qemuDomainGetStatsWorkers[i].stats) {
> +        if (stats & qemuDomainGetStatsWorkers[i].stats &&
> +            (!qemuDomainGetStatsWorkers[i].monitor || HAVE_JOB(flags))) {
>              if (qemuDomainGetStatsWorkers[i].func(conn->privateData,
>                                                    dom, params,
>                                                    flags) < 0)
>                  return -1;
These stats querying callbacks work in a different way than the rest of
the code. The entry point is qemuConnectGetAllDomainStats(). It iterates
over the list of domains and tries to acquire a job for each domain it's
currently iterating over. This may of course fail (or, if the NOWAIT
flag is specified, the user specifically requested to skip domains
"blocked" with a job). Anyway, there are some stats that require a job,
some that don't, and some that are somewhere in the middle (e.g.
qemuDomainGetStatsVcpu), i.e. they can get some basic info and, if a job
is acquired, they can gather even more info. Therefore, we can't check
for HAVE_JOB() in qemuDomainGetStats(); it has to be checked in each
callback individually.
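To illustrate the "middle ground" case, here is a heavily simplified,
hypothetical sketch of such a callback (none of these names are the
real libvirt APIs; STATS_HAVE_JOB stands in for the real privflags bit):

```c
#include <stdbool.h>

/* Illustrative stand-ins for the real libvirt flag machinery. */
#define STATS_HAVE_JOB (1u << 0)
#define HAVE_JOB(flags) ((flags) & STATS_HAVE_JOB)

typedef struct {
    int basic;          /* info available without the monitor */
    int extended;       /* info that needs the monitor, hence a job */
    bool has_extended;
} DomainStats;

/* A "middle ground" callback: always fills in the cheap stats, and
 * only gathers the monitor-backed ones when a job is held.  Missing
 * a job is deliberately not treated as an error. */
static int
get_stats_vcpu_like(unsigned int privflags, DomainStats *st)
{
    st->basic = 4;            /* e.g. vCPU count from the domain def */
    st->has_extended = false;

    if (HAVE_JOB(privflags)) {
        st->extended = 1234;  /* e.g. per-vCPU CPU time via monitor */
        st->has_extended = true;
    }
    return 0;
}
```

A fully monitor-dependent callback like qemuDomainGetStatsIOThread()
is then just the degenerate case where nothing at all is reported
without a job, which is exactly what the early `return 0` implements.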
Aha, I had missed the part where some functions can get a reduced set
of stats despite having .monitor=true. Sorry for that.
It still seems a bit of an odd pattern to me. In some ways, I think it
would be cleaner to change the qemuDomainGetStatsWorker.monitor
variable from a boolean to an enumeration that can be one of
MONITOR_NONE, MONITOR_OPTIONAL, MONITOR_REQUIRED, so we have enough
information to decide whether to call the function or not.
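A rough sketch of what that enum-based worker table might look like
(all names below are hypothetical, not actual libvirt code):

```c
#include <stdbool.h>

/* How strongly a stats worker depends on the monitor. */
typedef enum {
    MONITOR_NONE,      /* never touches the monitor */
    MONITOR_OPTIONAL,  /* reports extra stats if a job is held */
    MONITOR_REQUIRED,  /* pointless to call without a job */
} MonitorNeed;

typedef struct {
    const char *name;
    MonitorNeed monitor;
} StatsWorker;

static const StatsWorker workers[] = {
    { "state",    MONITOR_NONE },
    { "vcpu",     MONITOR_OPTIONAL },
    { "iothread", MONITOR_REQUIRED },
};

/* The dispatcher can now skip REQUIRED workers when no job was
 * acquired, instead of every callback checking HAVE_JOB() itself. */
static bool
should_call(const StatsWorker *w, bool have_job)
{
    return have_job || w->monitor != MONITOR_REQUIRED;
}
```

OPTIONAL workers would still check the job flag internally for their
extra stats, but the dispatcher alone would decide whether a worker
runs at all.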
But this patch is consistent with the others, so
Reviewed-by: Jonathon Jongsma <jjongsma(a)redhat.com>
As for returning success, the virConnectGetAllDomainStats() API is
documented as:

  Note that entire stats groups or individual stat fields may be
  missing from the output in case they are not supported by the given
  hypervisor, are not applicable for the current state of the guest
  domain, or their retrieval was not successful.

The way I read this is: if unable to acquire a job, omit iothread stats
without reporting an error. In fact, this is what the other callbacks
do too.
Michal