On 11/6/2018 2:44 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
> Add functions for creating, destroying, reconnecting resctrl
> monitor in qemu according to the configuration in domain XML.
>
> Signed-off-by: Wang Huaqiang<huaqiang.wang(a)intel.com>
> ---
> src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 65 insertions(+), 1 deletion(-)
>
[...]
> @@ -5430,6 +5441,7 @@ qemuProcessSetupVcpu(virDomainObjPtr vm,
> {
> pid_t vcpupid = qemuDomainGetVcpuPid(vm, vcpuid);
> virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(vm->def, vcpuid);
> + virDomainResctrlMonDefPtr mon = NULL;
> size_t i = 0;
>
> if (qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU,
> @@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm,
> return -1;
>
> for (i = 0; i < vm->def->nresctrls; i++) {
> + size_t j = 0;
> virDomainResctrlDefPtr ct = vm->def->resctrls[i];
>
> if (virBitmapIsBitSet(ct->vcpus, vcpuid)) {
> if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0)
> return -1;
> +
> + /* The order of invoking virResctrlMonitorAddPID matters, it is
> + * required to invoke this function first for monitor that has
> + * the same vcpus setting as the allocation in same def->resctrl.
> + * Otherwise, some other monitor's pid may be removed from its
> + * resource group's 'tasks' file.*/
> + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
> + mon = vm->def->resctrls[i]->monitors[j];
> +
> + if (!virBitmapEqual(ct->vcpus, mon->vcpus))
> + continue;
> +
> + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) {
> + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0)
> + return -1;
> + }
> + break;
> + }
> +
> + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
> + mon = vm->def->resctrls[i]->monitors[j];
> +
> + if (virBitmapEqual(ct->vcpus, mon->vcpus))
> + continue;
> +
> + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) {
> + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0)
> + return -1;
> + }
> + }
> break;
> }
> }
As I'm reading the subsequent patch, I'm wondering about the order of
the patches, what happens on the vcpu hotunplug case, and are we doing
the "correct thing" on the reconnect path.
1. with respect to order - it probably doesn't really matter, but I
guess adding another list to manage in the next patch got my attention
and thinking well shouldn't the above code for MonitorAddPID be "ready"
and not changed afterwards.
Then I need to merge the next patch, patch 14, with some previous patch,
at least
for the 'MonitorAddPID' function, to avoid a second changing of a same
function.
Got, thanks.
But, in my next series of patches, patch 14 is removed with all logic in
it. So the patch
order seems rational then.
2. Since qemuProcessSetupVcpu is called from
qemuDomainHotplugAddVcpu
that means we probably need to handle qemuDomainHotplugDelVcpu
processing. Of course, when Martin added the resctrl support in commit
9a2fc2db8, the corollary wasn't handled.
So the *tasks file could have stale pids in it if vcpus were unplugged
since there's nothing (obvious to me) that removes the pid from the
file. Furthermore a stale pid in the grand scheme of pid reusage could
become not stale and what happens if a pid is found - do we end up in
some error path...
The *tasks file is not a file kept in a block device, its content, the
PIDS, could be
removed if the process/task is terminated or PIDs have been added to
some other
resctrl group.
It seems the *tasks file will not be stale in scenario of vcpu hot-plug.
For vcpu unplugging case, qemuDomainHotplugDelVcpu removes and terminates
the vcpu process, right? Then the *tasks file will automatically be
updated by removing
the process's PID. The removal of PID from *tasks is done by resctrl
file system.
So there is no hot-plug issue for resctrl in terms of tracking PID in
*tasks.
3. In the reconnect logic - it would seem that we would be starting
from
scratch again, right?
In my understanding reconnect means the restart of libvirtd and the
information
re-building for VMs.
Originally I believed that vcpupid would be changed during a reconnect,
that is
the reason I add the function for validating monitor in patch14.
But the vcpupid has not been changed in process of reconnect. So I
decide to remove
patch14.
So would the *tasks file even be valid since vcpus
could be added/deleted and we didn't get notified...
Do you still believe "vcpus could be added/deleted and we didn't get
notified' in reconnect"?
If so, can you list more details?
And any operation for adding/deleting VM vcpus out of libvirt is not
permitted right?
So perhaps we need
some logic in the reconnect code to reinitialize the tasks file (IOW:
delete it since we're going to recreate it - I assume).
*tasks file is still there with the PIDs in it. Seems it is not needed
to recreate it.
So I removed my last patch, in that patch I invoked qemuProcessSetupVcpus in
process of re-connection.
Perhaps more questions now vis-a-vis any sort of ACK/push for
patches
starting here. At least 1-12 are in decent shape.
John
Thanks for review.
Huaqiang
> @@ -7207,8 +7250,18 @@ void qemuProcessStop(virQEMUDriverPtr
driver,
> /* Remove resctrl allocation after cgroups are cleaned up which makes it
> * kind of safer (although removing the allocation should work even with
> * pids in tasks file */
> - for (i = 0; i < vm->def->nresctrls; i++)
> + for (i = 0; i < vm->def->nresctrls; i++) {
> + size_t j = 0;
> +
> + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
> + virDomainResctrlMonDefPtr mon = NULL;
> +
> + mon = vm->def->resctrls[i]->monitors[j];
> + virResctrlMonitorRemove(mon->instance);
> + }
> +
> virResctrlAllocRemove(vm->def->resctrls[i]->alloc);
> + }
>
> qemuProcessRemoveDomainStatus(driver, vm);
>
> @@ -7939,9 +7992,20 @@ qemuProcessReconnect(void *opaque)
> goto error;
>
> for (i = 0; i < obj->def->nresctrls; i++) {
> + size_t j = 0;
> +
> if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc,
> priv->machineName) < 0)
> goto error;
> +
> + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) {
> + virDomainResctrlMonDefPtr mon = NULL;
> +
> + mon = obj->def->resctrls[i]->monitors[j];
> + if (virResctrlMonitorDeterminePath(mon->instance,
> + priv->machineName) < 0)
> + goto error;
> + }
> }
>
> /* update domain state XML with possibly updated state in virDomainObj */
>