
On 06/07/2013 06:14 AM, Martin Kletzander wrote:
On 05/30/2013 02:24 PM, John Ferlan wrote:
Since commit '632f78ca' the 'virsh schedinfo <domain>' command returns:
Scheduler      : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted
Prior to that change a non-running domain would return:
Scheduler      : posix
cpu_shares     : 0
vcpu_period    : 0
vcpu_quota     : 0
emulator_period: 0
emulator_quota : 0
This change will result in the following:
Scheduler      : posix
cpu_shares     : 0
The code sequence is a 'virDomainGetSchedulerType()' call, which returns the '*nparams' to be used for a subsequent 'virDomainGetSchedulerParameters[Flags]()' call. The latter requires at least 1 'nparams' to be provided/returned.
I can't find where the nparams is required to be >= 1, but that might have changed since you posted this patch. I changed the '*nparams = 1' to '0' and it works perfectly (returns only the scheduler type, so it's visible that nothing is set).
The documentation for virDomainGetSchedulerParameters[Flags] has:

 *
 * Get all scheduler parameters.  On input, @nparams gives the size of the
 * @params array; on output, @nparams gives how many slots were filled
 * with parameter information, which might be less but will not exceed
 * the input value.  @nparams cannot be 0.
 *

Also, virDomainGetSchedulerParameters checks nparams > 0:

    virCheckPositiveArgGoto(*nparams, error);

The reason why virsh schedinfo printed just the scheduler type is that
virsh-domain.c checks 'nparams' before calling the
virDomainGetSchedulerParametersFlags() API.
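Roughly, the calling sequence looks like this (my own simplified sketch,
not the literal virsh-domain.c code; 'dom' and 'flags' are placeholders):

    /* Sketch of the virsh flow: the Parameters API is only called when
     * GetSchedulerType returned nparams > 0, which is why a 0 count
     * makes virsh print just the scheduler type. */
    int nparams = 0;
    char *type = virDomainGetSchedulerType(dom, &nparams);

    if (!type)
        return false;                  /* error already reported by the driver */
    printf("Scheduler      : %s\n", type);

    if (nparams > 0) {                 /* skip the API entirely when nparams == 0 */
        virTypedParameterPtr params = calloc(nparams, sizeof(*params));

        if (params &&
            virDomainGetSchedulerParametersFlags(dom, params, &nparams, flags) == 0) {
            /* print each of the nparams returned typed parameters ... */
        }
        free(params);
    }
    free(type);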
---
 src/qemu/qemu_driver.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4a76f14..7292149 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -6899,12 +6899,20 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
     }
     priv = vm->privateData;
+    /* Domain not running or no cgroup */
+    if (!priv->cgroup) {
+        if (nparams)
+            *nparams = 1;
+        goto cleanup;
+    }
+
     if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
         virReportError(VIR_ERR_OPERATION_INVALID,
                        "%s", _("cgroup CPU controller is not mounted"));
         goto cleanup;
     }
+
     if (nparams) {
         rc = qemuGetCpuBWStatus(priv->cgroup);
         if (rc < 0)
@@ -6915,11 +6923,12 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
             *nparams = 5;
     }
+cleanup:
     ignore_value(VIR_STRDUP(ret, "posix"));
You can get here even when the domain doesn't exist, and in that case NULL should be returned to indicate an error. It would also skip over an error from qemuGetCpuBWStatus(), but since no error is printed there anyway, at least we would return the scheduler type.
Oh yeah, right duh.  I was trying to cover the "!priv->cgroup" case without
copying that same line into that block.

Setting *nparams = 1 was taken from the viewpoint of the return 0 case from
the [lxc|qemu]GetCpuBWStatus() calls (since cgroups aren't started).  For a
non-running guest, I assumed that none of the 5 values really meant
anything, but I can see the logic that would get/set that count once prior
to vm startup and then repeatedly assume/use that value to call
virGetSchedulerParametersFlags() during the lifetime of whatever is
monitoring the vm, so I understand why setting it to 5 is valid/desired.
Once the guest is running, if there was an issue with the
[lxc|qemu]GetCpuBWStatus() call, then only the 'cpu_shares' would get
returned due to checks in the GetSchedulerParametersFlags() APIs.

Thus, if I'm reading the responses thus far correctly, then when cgroups
aren't enabled for the domain (e.g., not active):

    qemuDomainGetSchedulerType() would return "posix" and 5 for *nparams
    lxcDomainGetSchedulerType() would return "posix" and 3 for *nparams

e.g., for qemu:

    /* Domain not running or no cgroup */
    if (!priv->cgroup) {
        if (nparams)
            *nparams = 5;
        ignore_value(VIR_STRDUP(ret, "posix"));
        goto cleanup;
    }

(and moving the ignore_value(VIR_STRDUP(ret, "posix")); before the
cleanup: label)

For the [lxc|qemu]DomainGetSchedulerType() APIs, the only time
[lxc|qemu]GetCpuBWStatus() would be called is if the domain is active,
e.g., we get past the virCgroupPathOfController() call.

The [qemu|lxc]DomainGetSchedulerParametersFlags() APIs need adjustments as
well, because currently they make the same call to [lxc|qemu]GetCpuBWStatus()
regardless of whether the domain is active and regardless of which 'flags'
are set (current, live, config).  For this code I was less sure how to
handle what gets returned.  Each has a call such as (from qemu):

    if (*nparams > 1) {
        rc = qemuGetCpuBWStatus(priv->cgroup);
        if (rc < 0)
            goto cleanup;
        cpu_bw_status = !!rc;
    }

which sets 'cpu_bw_status' based on the rc, which is documented as follows:

 * Return 1 when CFS bandwidth is supported, 0 when CFS bandwidth is not
 * supported, -1 on error.

For the non-active case, this returns 0 and thus clears the cpu_bw_status
flag, which affects the code deciding how many configuration parameters to
return:

    if (flags & VIR_DOMAIN_AFFECT_CONFIG) {
        shares = persistentDef->cputune.shares;
        if (*nparams > 1 && cpu_bw_status) {
            ...

Again, given how I'm reading the responses, the check here for
cpu_bw_status could then be removed, since if the vm is not active or the
"--config" flag is used, we return all 5 parameters regardless of whether
the vm is running and regardless of whether the cgroup is found.

Another option would be to add a check just prior that would set
cpu_bw_status = true when we don't have a cgroup and we're just getting
config data, such as:

    if (!priv->cgroup && (flags & VIR_DOMAIN_AFFECT_CONFIG))
        cpu_bw_status = true;

One way or another the 'cpu_bw_status' needs to be set to true before
"goto out" so that all the data points are copied into the output
"params[]" array.  For "current" or "live" data of a running vm with
cgroups mounted, subsequent checks will return an error, 1, 3, or 5
elements depending on success or failure of live checks.

Obviously I'll post a v2, but I would prefer to get it "more right"
without too much more "churn".

Tks,

John
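P.S. Putting the pieces above together, here is a rough, untested sketch of
what I have in mind for the qemuDomainGetSchedulerType() flow (an
illustration of the proposed v2, not the actual patch):

    /* Untested sketch of the proposed v2 behavior: a domain without a
     * cgroup (not running) still reports "posix" and the full parameter
     * count, while real failures still return NULL. */
    priv = vm->privateData;

    if (!priv->cgroup) {
        /* Domain not running or no cgroup: report all 5 qemu parameters */
        if (nparams)
            *nparams = 5;
        ignore_value(VIR_STRDUP(ret, "posix"));
        goto cleanup;
    }

    if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                       _("cgroup CPU controller is not mounted"));
        goto cleanup;
    }

    if (nparams) {
        rc = qemuGetCpuBWStatus(priv->cgroup);
        if (rc < 0)
            goto cleanup;              /* ret stays NULL; caller sees an error */
        *nparams = rc ? 5 : 1;
    }

    ignore_value(VIR_STRDUP(ret, "posix"));

cleanup:
    if (vm)
        virObjectUnlock(vm);
    return ret;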
-cleanup:
     if (vm)
         virObjectUnlock(vm);
+
     return ret;
 }
This patch fixes the problem, but may create a new one. Also, 'virsh schedinfo --config <domain>' output is different for a running and a shut-off domain, but I'm willing to overlook that since it is less problematic than what happened before.
Martin