On Tue, Jun 28, 2022 at 10:25:54PM +0200, Martin Kletzander wrote:
On Tue, Jun 28, 2022 at 10:15:28PM +0530, Amneesh Singh wrote:
> On Tue, Jun 28, 2022 at 05:23:11PM +0200, Michal Prívozník wrote:
> > On 6/24/22 10:14, Amneesh Singh wrote:
> > > Related:
https://gitlab.com/libvirt/libvirt/-/issues/276
> > >
> > > This patch uses qemuMonitorQueryStats to query
"halt_poll_success_ns"
> > > and "halt_poll_fail_ns" for every vCPU. The respective values
for each
> > > vCPU are then added together.
> > >
> > > Signed-off-by: Amneesh Singh <natto(a)weirdnatto.in>
> > > ---
> > > src/qemu/qemu_driver.c | 48 ++++++++++++++++++++++++++++++++++++++----
> > > 1 file changed, 44 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > > index 3b5c3db6..0a2be6d3 100644
> > > --- a/src/qemu/qemu_driver.c
> > > +++ b/src/qemu/qemu_driver.c
> > > @@ -18057,10 +18057,50 @@ qemuDomainGetStatsCpuHaltPollTime(virDomainObj
*dom,
> > > {
> > > unsigned long long haltPollSuccess = 0;
> > > unsigned long long haltPollFail = 0;
> > > - pid_t pid = dom->pid;
> > > + qemuDomainObjPrivate *priv = dom->privateData;
> > > + bool canQueryStats = virQEMUCapsGet(priv->qemuCaps,
QEMU_CAPS_QUERY_STATS);
> >
> > Is this variable really needed? I mean, we can just:
> >
> > if (virQEMUCapsGet(...) {
> >
> > } else {
> >
> > }
> >
> > But if you want to avoid long lines, then perhaps rename the variable to
> > queryStatsCap? This way it's more obvious what the variable reflects.
> > Stats can be queried in both cases ;-)
> Sure, that sounds doable :)
> >
> > >
> > > - if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess,
&haltPollFail) < 0)
> > > - return 0;
> > > + if (!canQueryStats) {
> > > + pid_t pid = dom->pid;
> > > +
> > > + if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess,
&haltPollFail) < 0)
> > > + return 0;
> > > + } else {
> > > + size_t i;
> > > + qemuMonitorQueryStatsTargetType target =
QEMU_MONITOR_QUERY_STATS_TARGET_VCPU;
> > > + qemuMonitorQueryStatsProvider *provider = NULL;
> > > + g_autoptr(GPtrArray) providers = NULL;
> > > + g_autoptr(GPtrArray) queried_stats = NULL;
> > > + const char *success_str = "halt_poll_success_ns";
> > > + const char *fail_str = "halt_poll_fail_ns";
> > > +
> > > + provider =
qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM);
> > > + provider->names = g_new0(char *, 3);
> > > + provider->names[0] = g_strdup(success_str),
provider->names[1] = g_strdup(fail_str);
> >
> > I'm starting to wonder whether this is a good interface. These
->names[]
> > array is never changed, so maybe we can have it as a NULL terminated
> > list of constant strings? For instance:
> >
> > provider = qemuMonitorQueryStatsProviderNew();
> > provider->names = {"halt_poll_success_ns",
"halt_poll_fail_ns", NULL};
> Well, cannot really assign char ** with a scalar initialization, but
> what might work is something like
>
> const char *names[] = { success_str, fail_str, NULL };
>
> provider = qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM);
> provider->names = g_strdupv((char **) names);
>
I think what Michal was trying to say is that since you do not change
the array anywhere, there is no need for that to be a dynamically
allocated array that needs to be freed. I, however, am not 100% if you
are going to need this to be dynamic and whether you will be changing
these arrays.
I don't think there will be any need to have the array mutable,
in
which case, maybe something like this should work
provider->names = (const char *[]){ success_str, fail_str, NULL };
provided that provider->names is changed to const char ** or the same
thing can be done with string literals (or non const string variables)
provided that the cast is (char *[]) and names is still char **.
Compound literals are part of C99 though.
> Another thought was using GStrvBuilder but it is not avaiable as per
> GLIB_VERSION_MAX.
>
> I too think that the current approach is not very nice. A variadic
> function similar to g_strv_builder_add_many that initializes a
> char ** would be nice.
> >
If that is something that helps, then we can write it ourselves and only
use our implementation if glib is too old.
> > > +
> > > + providers = g_ptr_array_new_full(1, (GDestroyNotify)
qemuMonitorQueryStatsProviderFree);
> > > + g_ptr_array_add(providers, provider);
> > > +
> > > + queried_stats = qemuMonitorQueryStats(priv->mon, target, NULL,
providers);
> >
> > This issues monitor command without proper checks. Firstly, you need to
> > check whether job acquiring (that s/false/true/ change done in the last
> > hunk) was successful and the domain is still running:
> >
> > if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) {
> > /* code here */
> > }
> >
> > Then, you need to so called "enter the monitor" and "exit the
monitor".
> > So what happens here is that @vm is when this function is called. Now,
> > issuing a monitor command with the domain object lock held is
> > potentially very dangerous because if QEMU takes long time to reply (or
> > it doesn't reply at all) the domain object is locked an can't be
> > destroyed (virsh destroy) - because the first thing that
> > qemuDomainDestroyFlags() does is locking the domain object. Also, you
> > want to make sure no other thread is currently talking on the monitor -
> > so the monitor has to be locked too. We have two convenient functions
> > for these operations:
> >
> > qemuDomainObjEnterMonitor()
> > qemuDomainObjExitMonitor()
> >
> > This is all described in more detail in
> > docs/kbase/internals/qemu-threads.rst.
> >
> > Having said all of this, I think we can fallback to the current behavior
> > if job wasn't acquired successfully. Therefore the condition at the very
> > beginning might look like this:
> >
> > if (queryStatsCap && HAVE_JOB() && virDomainObjIsActive()) {
> > ...
> > qemuDomainEnterMonitor();
> > qemuMonitorQueryStats();
> > qemuDomainExitMonitor();
> > } else {
> > virHostCPUGetHaltPollTime();
> > }
> >
> I see that makes sense. I shall do the necessary changes. Thanks for the
> detailed explanation.
> > > +
> > > + if (!queried_stats)
> > > + return 0;
> > > +
> > > + for (i = 0; i < queried_stats->len; i++) {
> > > + unsigned long long curHaltPollSuccess, curHaltPollFail;
> > > + GHashTable *cur_table = queried_stats->pdata[i];
> > > + virJSONValue *success, *fail;
> > > +
> > > + success = g_hash_table_lookup(cur_table, success_str);
> > > + fail = g_hash_table_lookup(cur_table, fail_str);
> > > +
> > > + ignore_value(virJSONValueGetNumberUlong(success,
&curHaltPollSuccess));
> > > + ignore_value(virJSONValueGetNumberUlong(fail,
&curHaltPollFail));
> >
> > I don't think this is a safe thing to do. What if either of @success or
> > @fail is NULL? That can happen when QEMU didn't return requested member.
> > True, at that point the 'query-stats' command should have failed, but
I
> > think it's more defensive if we checked these pointers. My worry is that
> > virJSONValueGetNumberUlong() dereferences the pointer right away.
> >
> I do understand that this is not an unfounded worry. I shall put the
> checks there.
> > > +
> > > + haltPollSuccess += curHaltPollSuccess;
> > > + haltPollFail += curHaltPollFail;
> > > + }
> > > + }
> > >
> > > if (virTypedParamListAddULLong(params, haltPollSuccess,
"cpu.haltpoll.success.time") < 0 ||
> > > virTypedParamListAddULLong(params, haltPollFail,
"cpu.haltpoll.fail.time") < 0)
> > > @@ -18915,7 +18955,7 @@ static virQEMUCapsFlags queryDirtyRateRequired[] =
{
> > >
> > > static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = {
> > > { qemuDomainGetStatsState, VIR_DOMAIN_STATS_STATE, false, NULL },
> > > - { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, false, NULL },
> > > + { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, true, NULL },
> > > { qemuDomainGetStatsBalloon, VIR_DOMAIN_STATS_BALLOON, true, NULL },
> > > { qemuDomainGetStatsVcpu, VIR_DOMAIN_STATS_VCPU, true, NULL },
> > > { qemuDomainGetStatsInterface, VIR_DOMAIN_STATS_INTERFACE, false,
NULL },
> >
> > Please feel free to object to anything I wrote. Maybe you have more
> > patches written on a local branch that justify your choices. I don't know.
> >
> > Michal
> >
> Thanks for taking the time to review the patches.
> Amneesh
Thanks
Amneesh