[libvirt PATCH v5 0/7] cgroup and thread management in ch driver

This patchset adds support for cgroup management of ch threads. This version correctly manages cgroups for vcpu and emulator threads created by ch. cgroup management for iothreads is not yet supported. Along with cgroup management, this patchset also enables support for pinning vcpu and emulator threads to selected host cpus. v5: * bumped the verion of callbacks in ch driver to 8.1.0 v4: * addressed all open comments in v3 * dropped all the merged commits v3: * addrressed all the formatting comments in v2 patch set * dropped indentation patches are they do not adhere to libvirt coding style * fixed build issue in qemu driver that was introduced in v2 Praveen K Paladugu (3): qemu,hypervisor: refactor some cgroup mgmt methods ch_process: Setup emulator and iothread settings ch_driver: emulator threadinfo & pinning callbacks Vineeth Pillai (4): ch: methods for cgroup mgmt in ch driver ch_driver,ch_domain: vcpupin callback in ch driver ch_driver: enable typed param string for numatune ch_driver: add numatune callbacks for CH driver src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 64 ++++ src/ch/ch_domain.h | 18 +- src/ch/ch_driver.c | 590 +++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 156 +++++++++ src/ch/ch_monitor.h | 56 +++- src/ch/ch_process.c | 385 ++++++++++++++++++++- src/ch/ch_process.h | 3 + src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 17 files changed, 1835 insertions(+), 455 deletions(-) -- 2.27.0

Refactor some cgroup management methods from qemu into hypervisor. These methods will be shared with ch driver for cgroup management. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 8 files changed, 580 insertions(+), 432 deletions(-) diff --git a/src/hypervisor/domain_cgroup.c b/src/hypervisor/domain_cgroup.c index 61b54f071c..05c53ff7d1 100644 --- a/src/hypervisor/domain_cgroup.c +++ b/src/hypervisor/domain_cgroup.c @@ -22,11 +22,12 @@ #include "domain_cgroup.h" #include "domain_driver.h" - +#include "util/virnuma.h" +#include "virlog.h" #include "virutil.h" #define VIR_FROM_THIS VIR_FROM_DOMAIN - +VIR_LOG_INIT("domain.cgroup"); int virDomainCgroupSetupBlkio(virCgroup *cgroup, virDomainBlkiotune blkio) @@ -269,3 +270,455 @@ virDomainCgroupSetMemoryLimitParameters(virCgroup *cgroup, return 0; } + + +int +virDomainCgroupSetupBlkioCgroup(virDomainObj *vm, + virCgroup *cgroup) +{ + + if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (vm->def->blkio.weight || vm->def->blkio.ndevices) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Block I/O tuning is not available on this host")); + return -1; + } else { + return 0; + } + } + + return virDomainCgroupSetupBlkio(cgroup, vm->def->blkio); +} + + +int +virDomainCgroupSetupMemoryCgroup(virDomainObj *vm, + virCgroup *cgroup) +{ + + if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (virMemoryLimitIsSet(vm->def->mem.hard_limit) || + virMemoryLimitIsSet(vm->def->mem.soft_limit) || + virMemoryLimitIsSet(vm->def->mem.swap_hard_limit)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Memory cgroup is not available on this host")); + return -1; + } else { + return 0; + } + } + + return virDomainCgroupSetupMemtune(cgroup, vm->def->mem); +} + + +int +virDomainCgroupSetupCpusetCgroup(virCgroup *cgroup) +{ + + if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + if (virCgroupSetCpusetMemoryMigrate(cgroup, true) < 0) + return -1; + + return 0; +} + + +int +virDomainCgroupSetupCpuCgroup(virDomainObj *vm, + virCgroup *cgroup) +{ + + if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + if (vm->def->cputune.sharesSpecified) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("CPU tuning is not available on this host")); + return -1; + } else { + return 0; + } + } + + if (vm->def->cputune.sharesSpecified) { + + if (virCgroupSetCpuShares(cgroup, vm->def->cputune.shares) < 0) + return -1; + + } + + return 0; +} + + +int +virDomainCgroupInitCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup * cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName) +{ + if (!privileged) + return 0; + + if (!virCgroupAvailable()) + return 0; + + virCgroupFree(cgroup); + cgroup = NULL; + + if (!vm->def->resource) + vm->def->resource = g_new0(virDomainResourceDef, 1); + + if (!vm->def->resource->partition) + vm->def->resource->partition = g_strdup("/machine"); + + if (!g_path_is_absolute(vm->def->resource->partition)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + vm->def->resource->partition); + return -1; + } + + if (virCgroupNewMachine(machineName, + prefix, + vm->def->uuid, + NULL, + vm->pid, + false, + nnicindexes, nicindexes, + vm->def->resource->partition, + cgroupControllers, + maxThreadsPerProc, &cgroup) < 0) { + if (virCgroupNewIgnoreError()) + return 0; + + return -1; + } + + return 0; +} + + +void +virDomainCgroupRestoreCgroupState(virDomainObj *vm, + virCgroup *cgroup) +{ + g_autofree char *mem_mask = NULL; + size_t i = 0; + + g_autoptr(virBitmap) all_nodes = NULL; + + if (!virNumaIsAvailable() || + !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return; + + if (!(all_nodes = virNumaGetHostMemoryNodeset())) + goto error; + + if (!(mem_mask = virBitmapFormat(all_nodes))) + goto error; + + if (virCgroupHasEmptyTasks(cgroup, VIR_CGROUP_CONTROLLER_CPUSET) <= 0) + goto error; + + if (virCgroupSetCpusetMems(cgroup, mem_mask) < 0) + goto error; + + for (i = 0; i < virDomainDefGetVcpusMax(vm->def); i++) { + virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, i); + + if (!vcpu->online) + continue; + + if (virDomainCgroupRestoreCgroupThread(cgroup, + VIR_CGROUP_THREAD_VCPU, + i) < 0) + return; + } + + for (i = 0; i < vm->def->niothreadids; i++) { + if (virDomainCgroupRestoreCgroupThread(cgroup, + VIR_CGROUP_THREAD_IOTHREAD, + vm->def->iothreadids[i]->iothread_id) < 0) + return; + } + + if (virDomainCgroupRestoreCgroupThread(cgroup, + VIR_CGROUP_THREAD_EMULATOR, + 0) < 0) + return; + + return; + + error: + virResetLastError(); + VIR_DEBUG("Couldn't restore cgroups to meaningful state"); + return; +} + + +int +virDomainCgroupRestoreCgroupThread(virCgroup *cgroup, + virCgroupThreadName thread, + int id) +{ + g_autoptr(virCgroup) cgroup_temp = NULL; + g_autofree char *nodeset = NULL; + + if (virCgroupNewThread(cgroup, thread, id, false, &cgroup_temp) < 0) + return -1; + + if (virCgroupSetCpusetMemoryMigrate(cgroup_temp, true) < 0) + return -1; + + if (virCgroupGetCpusetMems(cgroup_temp, &nodeset) < 0) + return -1; + + if (virCgroupSetCpusetMems(cgroup_temp, nodeset) < 0) + return -1; + + return 0; +} + + +int +virDomainCgroupConnectCgroup(const char *prefix, + virDomainObj *vm, + virCgroup *cgroup, + int cgroupControllers, + bool privileged, + char *machineName) +{ + if (privileged) + return 0; + + if (!virCgroupAvailable()) + return 0; + + virCgroupFree(cgroup); + + if (virCgroupNewDetectMachine(vm->def->name, + prefix, + vm->pid, + cgroupControllers, machineName, &cgroup) < 0) + return -1; + + virDomainCgroupRestoreCgroupState(vm, cgroup); + return 0; +} + + +int +virDomainCgroupSetupCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup * cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName) +{ + if (!vm->pid) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot setup cgroups until process is started")); + return -1; + } + + if (virDomainCgroupInitCgroup(prefix, + vm, + nnicindexes, + nicindexes, + cgroup, + cgroupControllers, + maxThreadsPerProc, + privileged, + machineName) < 0) + return -1; + + if (!cgroup) + return 0; + + if (virDomainCgroupSetupBlkioCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupMemoryCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupCpuCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupCpusetCgroup(cgroup) < 0) + return -1; + + return 0; +} + + +int +virDomainCgroupSetupVcpuBW(virCgroup *cgroup, + unsigned long long period, + long long quota) +{ + return virCgroupSetupCpuPeriodQuota(cgroup, period, quota); +} + + +int +virDomainCgroupSetupCpusetCpus(virCgroup *cgroup, + virBitmap *cpumask) +{ + return virCgroupSetupCpusetCpus(cgroup, cpumask); +} + + +int +virDomainCgroupSetupGlobalCpuCgroup(virDomainObj *vm, + virCgroup *cgroup, + virBitmap *autoNodeset) +{ + unsigned long long period = vm->def->cputune.global_period; + long long quota = vm->def->cputune.global_quota; + g_autofree char *mem_mask = NULL; + virDomainNumatuneMemMode mem_mode; + + if ((period || quota) && + !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + return -1; + } + + /* + * If CPU cgroup controller is not initialized here, then we need + * neither period nor quota settings. And if CPUSET controller is + * not initialized either, then there's nothing to do anyway. + */ + if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPU) && + !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && + virDomainNumatuneMaybeFormatNodeset(vm->def->numa, + autoNodeset, &mem_mask, -1) < 0) + return -1; + + if (period || quota) { + if (virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) + return -1; + } + + return 0; +} + + +int +virDomainCgroupRemoveCgroup(virDomainObj *vm, + virCgroup *cgroup, + char *machineName) +{ + if (cgroup == NULL) + return 0; /* Not supported, so claim success */ + + if (virCgroupTerminateMachine(machineName) < 0) { + if (!virCgroupNewIgnoreError()) + VIR_DEBUG("Failed to terminate cgroup for %s", vm->def->name); + } + + return virCgroupRemove(cgroup); +} + + +void +virDomainCgroupEmulatorAllNodesDataFree(virCgroupEmulatorAllNodesData *data) +{ + if (!data) + return; + + virCgroupFree(data->emulatorCgroup); + g_free(data->emulatorMemMask); + g_free(data); +} + + +/** + * virDomainCgroupEmulatorAllNodesAllow: + * @cgroup: domain cgroup pointer + * @retData: filled with structure used to roll back the operation + * + * Allows all NUMA nodes for the cloud hypervisor thread temporarily. This is + * necessary when hotplugging cpus since it requires memory allocated in the + * DMA region. Afterwards the operation can be reverted by + * virDomainCgroupEmulatorAllNodesRestore. + * + * Returns 0 on success -1 on error + */ +int +virDomainCgroupEmulatorAllNodesAllow(virCgroup *cgroup, + virCgroupEmulatorAllNodesData **retData) +{ + virCgroupEmulatorAllNodesData *data = NULL; + g_autofree char *all_nodes_str = NULL; + + g_autoptr(virBitmap) all_nodes = NULL; + int ret = -1; + + if (!virNumaIsAvailable() || + !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + if (!(all_nodes = virNumaGetHostMemoryNodeset())) + goto cleanup; + + if (!(all_nodes_str = virBitmapFormat(all_nodes))) + goto cleanup; + + data = g_new0(virCgroupEmulatorAllNodesData, 1); + + if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_EMULATOR, 0, + false, &data->emulatorCgroup) < 0) + goto cleanup; + + if (virCgroupGetCpusetMems(data->emulatorCgroup, &data->emulatorMemMask) < 0 + || virCgroupSetCpusetMems(data->emulatorCgroup, all_nodes_str) < 0) + goto cleanup; + + *retData = g_steal_pointer(&data); + ret = 0; + + cleanup: + virDomainCgroupEmulatorAllNodesDataFree(data); + + return ret; +} + + +/** + * virDomainCgroupEmulatorAllNodesRestore: + * @data: data structure created by virDomainCgroupEmulatorAllNodesAllow + * + * Rolls back the setting done by virDomainCgroupEmulatorAllNodesAllow and frees the + * associated data. + */ +void +virDomainCgroupEmulatorAllNodesRestore(virCgroupEmulatorAllNodesData *data) +{ + virError *err; + + if (!data) + return; + + virErrorPreserveLast(&err); + virCgroupSetCpusetMems(data->emulatorCgroup, data->emulatorMemMask); + virErrorRestore(&err); + + virDomainCgroupEmulatorAllNodesDataFree(data); +} diff --git a/src/hypervisor/domain_cgroup.h b/src/hypervisor/domain_cgroup.h index f93e5f74fe..dfe7107674 100644 --- a/src/hypervisor/domain_cgroup.h +++ b/src/hypervisor/domain_cgroup.h @@ -23,6 +23,11 @@ #include "vircgroup.h" #include "domain_conf.h" +typedef struct _virCgroupEmulatorAllNodesData virCgroupEmulatorAllNodesData; +struct _virCgroupEmulatorAllNodesData { + virCgroup *emulatorCgroup; + char *emulatorMemMask; +}; int virDomainCgroupSetupBlkio(virCgroup *cgroup, virDomainBlkiotune blkio); int virDomainCgroupSetupMemtune(virCgroup *cgroup, virDomainMemtune mem); @@ -36,3 +41,70 @@ int virDomainCgroupSetMemoryLimitParameters(virCgroup *cgroup, virDomainDef *persistentDef, virTypedParameterPtr params, int nparams); +int +virDomainCgroupSetupBlkioCgroup(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupSetupMemoryCgroup(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupSetupCpusetCgroup(virCgroup *cgroup); +int +virDomainCgroupSetupCpuCgroup(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupInitCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup *cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName); +void +virDomainCgroupRestoreCgroupState(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupConnectCgroup(const char *prefix, + virDomainObj * vm, + virCgroup *cgroup, + int cgroupControllers, + bool privileged, + char *machineName); +int +virDomainCgroupSetupCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup *cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName); +void +virDomainCgroupEmulatorAllNodesDataFree(virCgroupEmulatorAllNodesData * data); +int +virDomainCgroupEmulatorAllNodesAllow(virCgroup * cgroup, + virCgroupEmulatorAllNodesData ** retData); +void +virDomainCgroupEmulatorAllNodesRestore(virCgroupEmulatorAllNodesData * data); +int +virDomainCgroupSetupVcpuBW(virCgroup * cgroup, + unsigned long long period, + long long quota); +int +virDomainCgroupSetupCpusetCpus(virCgroup * cgroup, + virBitmap * cpumask); +int +virDomainCgroupSetupGlobalCpuCgroup(virDomainObj * vm, + virCgroup * cgroup, + virBitmap *autoNodeset); +int +virDomainCgroupRemoveCgroup(virDomainObj * vm, + virCgroup * cgroup, + char *machineName); +int +virDomainCgroupRestoreCgroupThread(virCgroup *cgroup, + virCgroupThreadName thread, + int id); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ba3462d849..bc6fa191bf 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1544,11 +1544,23 @@ virSetConnectStorage; # hypervisor/domain_cgroup.h +virDomainCgroupConnectCgroup; +virDomainCgroupEmulatorAllNodesAllow; +virDomainCgroupEmulatorAllNodesRestore; +virDomainCgroupInitCgroup; +virDomainCgroupRemoveCgroup; virDomainCgroupSetMemoryLimitParameters; virDomainCgroupSetupBlkio; +virDomainCgroupSetupBlkioCgroup; +virDomainCgroupSetupCgroup; +virDomainCgroupSetupCpuCgroup; +virDomainCgroupSetupCpusetCgroup; +virDomainCgroupSetupCpusetCpus; virDomainCgroupSetupDomainBlkioParameters; +virDomainCgroupSetupGlobalCpuCgroup; +virDomainCgroupSetupMemoryCgroup; virDomainCgroupSetupMemtune; - +virDomainCgroupSetupVcpuBW; # hypervisor/domain_driver.h virDomainDriverAddIOThreadCheck; diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 22a6f56cf9..40bd4e2f9d 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -593,46 +593,6 @@ qemuSetupVideoCgroup(virDomainObj *vm, return ret; } - -static int -qemuSetupBlkioCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, - VIR_CGROUP_CONTROLLER_BLKIO)) { - if (vm->def->blkio.weight || vm->def->blkio.ndevices) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Block I/O tuning is not available on this host")); - return -1; - } - return 0; - } - - return virDomainCgroupSetupBlkio(priv->cgroup, vm->def->blkio); -} - - -static int -qemuSetupMemoryCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { - if (virMemoryLimitIsSet(vm->def->mem.hard_limit) || - virMemoryLimitIsSet(vm->def->mem.soft_limit) || - virMemoryLimitIsSet(vm->def->mem.swap_hard_limit)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Memory cgroup is not available on this host")); - return -1; - } - return 0; - } - - return virDomainCgroupSetupMemtune(priv->cgroup, vm->def->mem); -} - - static int qemuSetupFirmwareCgroup(virDomainObj *vm) { @@ -861,44 +821,6 @@ qemuSetupDevicesCgroup(virDomainObj *vm) } -static int -qemuSetupCpusetCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - if (virCgroupSetCpusetMemoryMigrate(priv->cgroup, true) < 0) - return -1; - - return 0; -} - - -static int -qemuSetupCpuCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { - if (vm->def->cputune.sharesSpecified) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("CPU tuning is not available on this host")); - return -1; - } - return 0; - } - - if (vm->def->cputune.sharesSpecified) { - if (virCgroupSetCpuShares(priv->cgroup, vm->def->cputune.shares) < 0) - return -1; - } - - return 0; -} - - static int qemuSetupCgroupAppid(virDomainObj *vm) { @@ -927,166 +849,13 @@ qemuSetupCgroupAppid(virDomainObj *vm) } -static int -qemuInitCgroup(virDomainObj *vm, - size_t nnicindexes, - int *nicindexes) -{ - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver); - - if (!priv->driver->privileged) - return 0; - - if (!virCgroupAvailable()) - return 0; - - virCgroupFree(priv->cgroup); - priv->cgroup = NULL; - - if (!vm->def->resource) - vm->def->resource = g_new0(virDomainResourceDef, 1); - - if (!vm->def->resource->partition) - vm->def->resource->partition = g_strdup("/machine"); - - if (!g_path_is_absolute(vm->def->resource->partition)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("Resource partition '%s' must start with '/'"), - vm->def->resource->partition); - return -1; - } - - if (virCgroupNewMachine(priv->machineName, - "qemu", - vm->def->uuid, - NULL, - vm->pid, - false, - nnicindexes, nicindexes, - vm->def->resource->partition, - cfg->cgroupControllers, - cfg->maxThreadsPerProc, - &priv->cgroup) < 0) { - if (virCgroupNewIgnoreError()) - return 0; - - return -1; - } - - return 0; -} - -static int -qemuRestoreCgroupThread(virCgroup *cgroup, - virCgroupThreadName thread, - int id) -{ - g_autoptr(virCgroup) cgroup_temp = NULL; - g_autofree char *nodeset = NULL; - - if (virCgroupNewThread(cgroup, thread, id, false, &cgroup_temp) < 0) - return -1; - - if (virCgroupSetCpusetMemoryMigrate(cgroup_temp, true) < 0) - return -1; - - if (virCgroupGetCpusetMems(cgroup_temp, &nodeset) < 0) - return -1; - - if (virCgroupSetCpusetMems(cgroup_temp, nodeset) < 0) - return -1; - - return 0; -} - -static void -qemuRestoreCgroupState(virDomainObj *vm) -{ - g_autofree char *mem_mask = NULL; - qemuDomainObjPrivate *priv = vm->privateData; - size_t i = 0; - g_autoptr(virBitmap) all_nodes = NULL; - - if (!virNumaIsAvailable() || - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return; - - if (!(all_nodes = virNumaGetHostMemoryNodeset())) - goto error; - - if (!(mem_mask = virBitmapFormat(all_nodes))) - goto error; - - if (virCgroupHasEmptyTasks(priv->cgroup, - VIR_CGROUP_CONTROLLER_CPUSET) <= 0) - goto error; - - if (virCgroupSetCpusetMems(priv->cgroup, mem_mask) < 0) - goto error; - - for (i = 0; i < virDomainDefGetVcpusMax(vm->def); i++) { - virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, i); - - if (!vcpu->online) - continue; - - if (qemuRestoreCgroupThread(priv->cgroup, - VIR_CGROUP_THREAD_VCPU, i) < 0) - return; - } - - for (i = 0; i < vm->def->niothreadids; i++) { - if (qemuRestoreCgroupThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, - vm->def->iothreadids[i]->iothread_id) < 0) - return; - } - - if (qemuRestoreCgroupThread(priv->cgroup, - VIR_CGROUP_THREAD_EMULATOR, 0) < 0) - return; - - return; - - error: - virResetLastError(); - VIR_DEBUG("Couldn't restore cgroups to meaningful state"); - return; -} - -int -qemuConnectCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver); - - if (!priv->driver->privileged) - return 0; - - if (!virCgroupAvailable()) - return 0; - - virCgroupFree(priv->cgroup); - priv->cgroup = NULL; - - if (virCgroupNewDetectMachine(vm->def->name, - "qemu", - vm->pid, - cfg->cgroupControllers, - priv->machineName, - &priv->cgroup) < 0) - return -1; - - qemuRestoreCgroupState(vm); - return 0; -} - int qemuSetupCgroup(virDomainObj *vm, size_t nnicindexes, int *nicindexes) { qemuDomainObjPrivate *priv = vm->privateData; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver); if (!vm->pid) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -1094,7 +863,15 @@ qemuSetupCgroup(virDomainObj *vm, return -1; } - if (qemuInitCgroup(vm, nnicindexes, nicindexes) < 0) + if (virDomainCgroupInitCgroup("qemu", + vm, + nnicindexes, + nicindexes, + priv->cgroup, + cfg->cgroupControllers, + cfg->maxThreadsPerProc, + priv->driver->privileged, + priv->machineName) < 0) return -1; if (!priv->cgroup) @@ -1103,16 +880,16 @@ qemuSetupCgroup(virDomainObj *vm, if (qemuSetupDevicesCgroup(vm) < 0) return -1; - if (qemuSetupBlkioCgroup(vm) < 0) + if (virDomainCgroupSetupBlkioCgroup(vm, priv->cgroup) < 0) return -1; - if (qemuSetupMemoryCgroup(vm) < 0) + if (virDomainCgroupSetupMemoryCgroup(vm, priv->cgroup) < 0) return -1; - if (qemuSetupCpuCgroup(vm) < 0) + if (virDomainCgroupSetupCpuCgroup(vm, priv->cgroup) < 0) return -1; - if (qemuSetupCpusetCgroup(vm) < 0) + if (virDomainCgroupSetupCpusetCgroup(priv->cgroup) < 0) return -1; if (qemuSetupCgroupAppid(vm) < 0) @@ -1121,23 +898,6 @@ qemuSetupCgroup(virDomainObj *vm, return 0; } -int -qemuSetupCgroupVcpuBW(virCgroup *cgroup, - unsigned long long period, - long long quota) -{ - return virCgroupSetupCpuPeriodQuota(cgroup, period, quota); -} - - -int -qemuSetupCgroupCpusetCpus(virCgroup *cgroup, - virBitmap *cpumask) -{ - return virCgroupSetupCpusetCpus(cgroup, cpumask); -} - - int qemuSetupCgroupForExtDevices(virDomainObj *vm, virQEMUDriver *driver) @@ -1164,148 +924,3 @@ qemuSetupCgroupForExtDevices(virDomainObj *vm, return qemuExtDevicesSetupCgroup(driver, vm, cgroup_temp); } - - -int -qemuSetupGlobalCpuCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - unsigned long long period = vm->def->cputune.global_period; - long long quota = vm->def->cputune.global_quota; - g_autofree char *mem_mask = NULL; - virDomainNumatuneMemMode mem_mode; - - if ((period || quota) && - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("cgroup cpu is required for scheduler tuning")); - return -1; - } - - /* - * If CPU cgroup controller is not initialized here, then we need - * neither period nor quota settings. And if CPUSET controller is - * not initialized either, then there's nothing to do anyway. - */ - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - - if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && - mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && - virDomainNumatuneMaybeFormatNodeset(vm->def->numa, - priv->autoNodeset, - &mem_mask, -1) < 0) - return -1; - - if (period || quota) { - if (qemuSetupCgroupVcpuBW(priv->cgroup, period, quota) < 0) - return -1; - } - - return 0; -} - - -int -qemuRemoveCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (priv->cgroup == NULL) - return 0; /* Not supported, so claim success */ - - if (virCgroupTerminateMachine(priv->machineName) < 0) { - if (!virCgroupNewIgnoreError()) - VIR_DEBUG("Failed to terminate cgroup for %s", vm->def->name); - } - - return virCgroupRemove(priv->cgroup); -} - - -static void -qemuCgroupEmulatorAllNodesDataFree(qemuCgroupEmulatorAllNodesData *data) -{ - if (!data) - return; - - virCgroupFree(data->emulatorCgroup); - g_free(data->emulatorMemMask); - g_free(data); -} - - -/** - * qemuCgroupEmulatorAllNodesAllow: - * @cgroup: domain cgroup pointer - * @retData: filled with structure used to roll back the operation - * - * Allows all NUMA nodes for the qemu emulator thread temporarily. This is - * necessary when hotplugging cpus since it requires memory allocated in the - * DMA region. Afterwards the operation can be reverted by - * qemuCgroupEmulatorAllNodesRestore. - * - * Returns 0 on success -1 on error - */ -int -qemuCgroupEmulatorAllNodesAllow(virCgroup *cgroup, - qemuCgroupEmulatorAllNodesData **retData) -{ - qemuCgroupEmulatorAllNodesData *data = NULL; - g_autofree char *all_nodes_str = NULL; - g_autoptr(virBitmap) all_nodes = NULL; - int ret = -1; - - if (!virNumaIsAvailable() || - !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - if (!(all_nodes = virNumaGetHostMemoryNodeset())) - goto cleanup; - - if (!(all_nodes_str = virBitmapFormat(all_nodes))) - goto cleanup; - - data = g_new0(qemuCgroupEmulatorAllNodesData, 1); - - if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_EMULATOR, 0, - false, &data->emulatorCgroup) < 0) - goto cleanup; - - if (virCgroupGetCpusetMems(data->emulatorCgroup, &data->emulatorMemMask) < 0 || - virCgroupSetCpusetMems(data->emulatorCgroup, all_nodes_str) < 0) - goto cleanup; - - *retData = g_steal_pointer(&data); - ret = 0; - - cleanup: - qemuCgroupEmulatorAllNodesDataFree(data); - - return ret; -} - - -/** - * qemuCgroupEmulatorAllNodesRestore: - * @data: data structure created by qemuCgroupEmulatorAllNodesAllow - * - * Rolls back the setting done by qemuCgroupEmulatorAllNodesAllow and frees the - * associated data. - */ -void -qemuCgroupEmulatorAllNodesRestore(qemuCgroupEmulatorAllNodesData *data) -{ - virErrorPtr err; - - if (!data) - return; - - virErrorPreserveLast(&err); - virCgroupSetCpusetMems(data->emulatorCgroup, data->emulatorMemMask); - virErrorRestore(&err); - - qemuCgroupEmulatorAllNodesDataFree(data); -} diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index cd537ebd82..f09134947f 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -56,18 +56,11 @@ int qemuSetupChardevCgroup(virDomainObj *vm, virDomainChrDef *dev); int qemuTeardownChardevCgroup(virDomainObj *vm, virDomainChrDef *dev); -int qemuConnectCgroup(virDomainObj *vm); int qemuSetupCgroup(virDomainObj *vm, size_t nnicindexes, int *nicindexes); -int qemuSetupCgroupVcpuBW(virCgroup *cgroup, - unsigned long long period, - long long quota); -int qemuSetupCgroupCpusetCpus(virCgroup *cgroup, virBitmap *cpumask); -int qemuSetupGlobalCpuCgroup(virDomainObj *vm); int qemuSetupCgroupForExtDevices(virDomainObj *vm, virQEMUDriver *driver); -int qemuRemoveCgroup(virDomainObj *vm); typedef struct _qemuCgroupEmulatorAllNodesData qemuCgroupEmulatorAllNodesData; struct _qemuCgroupEmulatorAllNodesData { @@ -75,8 +68,4 @@ struct _qemuCgroupEmulatorAllNodesData { char *emulatorMemMask; }; -int qemuCgroupEmulatorAllNodesAllow(virCgroup *cgroup, - qemuCgroupEmulatorAllNodesData **data); -void qemuCgroupEmulatorAllNodesRestore(qemuCgroupEmulatorAllNodesData *data); - extern const char *const defaultDeviceACL[]; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a1ba74e65..1141efef4b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4419,7 +4419,7 @@ qemuDomainPinVcpuLive(virDomainObj *vm, if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, vcpu, false, &cgroup_vcpu) < 0) goto cleanup; - if (qemuSetupCgroupCpusetCpus(cgroup_vcpu, cpumap) < 0) + if (virDomainCgroupSetupCpusetCpus(cgroup_vcpu, cpumap) < 0) goto cleanup; } @@ -4628,7 +4628,7 @@ qemuDomainPinEmulator(virDomainPtr dom, 0, false, &cgroup_emulator) < 0) goto endjob; - if (qemuSetupCgroupCpusetCpus(cgroup_emulator, pcpumap) < 0) { + if (virDomainCgroupSetupCpusetCpus(cgroup_emulator, pcpumap) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("failed to set cpuset.cpus in cgroup" " for emulator threads")); @@ -5025,7 +5025,7 @@ qemuDomainPinIOThread(virDomainPtr dom, if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, iothread_id, false, &cgroup_iothread) < 0) goto endjob; - if (qemuSetupCgroupCpusetCpus(cgroup_iothread, pcpumap) < 0) { + if (virDomainCgroupSetupCpusetCpus(cgroup_iothread, pcpumap) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, _("failed to set cpuset.cpus in cgroup" " for iothread %d"), iothread_id); @@ -8925,7 +8925,7 @@ qemuSetGlobalBWLive(virCgroup *cgroup, unsigned long long period, if (period == 0 && quota == 0) return 0; - if (qemuSetupCgroupVcpuBW(cgroup, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) return -1; return 0; @@ -9120,7 +9120,7 @@ qemuSetVcpusBWLive(virDomainObj *vm, virCgroup *cgroup, false, &cgroup_vcpu) < 0) return -1; - if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_vcpu, period, quota) < 0) return -1; } @@ -9141,7 +9141,7 @@ qemuSetEmulatorBandwidthLive(virCgroup *cgroup, false, &cgroup_emulator) < 0) return -1; - if (qemuSetupCgroupVcpuBW(cgroup_emulator, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_emulator, period, quota) < 0) return -1; return 0; @@ -9168,7 +9168,7 @@ qemuSetIOThreadsBWLive(virDomainObj *vm, virCgroup *cgroup, false, &cgroup_iothread) < 0) return -1; - if (qemuSetupCgroupVcpuBW(cgroup_iothread, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_iothread, period, quota) < 0) return -1; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f36de2385a..da7143eba2 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -37,6 +37,7 @@ #include "qemu_snapshot.h" #include "qemu_virtiofs.h" #include "domain_audit.h" +#include "domain_cgroup.h" #include "netdev_bandwidth_conf.h" #include "domain_nwfilter.h" #include "virlog.h" @@ -6527,11 +6528,11 @@ qemuDomainSetVcpusLive(virQEMUDriver *driver, bool enable) { qemuDomainObjPrivate *priv = vm->privateData; - qemuCgroupEmulatorAllNodesData *emulatorCgroup = NULL; + virCgroupEmulatorAllNodesData *emulatorCgroup = NULL; ssize_t nextvcpu = -1; int ret = -1; - if (qemuCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) + if (virDomainCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) goto cleanup; if (enable) { @@ -6552,7 +6553,7 @@ qemuDomainSetVcpusLive(virQEMUDriver *driver, ret = 0; cleanup: - qemuCgroupEmulatorAllNodesRestore(emulatorCgroup); + virDomainCgroupEmulatorAllNodesRestore(emulatorCgroup); return ret; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 336f0bab2e..1673c05497 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -73,6 +73,7 @@ #include "virpidfile.h" #include "virhostcpu.h" #include "domain_audit.h" +#include "domain_cgroup.h" #include "domain_nwfilter.h" #include "domain_validate.h" #include "locking/domain_lock.h" @@ -2686,7 +2687,7 @@ qemuProcessSetupPid(virDomainObj *vm, if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { if (use_cpumask && - qemuSetupCgroupCpusetCpus(cgroup, use_cpumask) < 0) + virDomainCgroupSetupCpusetCpus(cgroup, use_cpumask) < 0) goto cleanup; if (mem_mask && virCgroupSetCpusetMems(cgroup, mem_mask) < 0) @@ -2695,7 +2696,7 @@ qemuProcessSetupPid(virDomainObj *vm, } if ((period || quota) && - qemuSetupCgroupVcpuBW(cgroup, period, quota) < 0) + virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) goto cleanup; /* Move the thread to the sub dir */ @@ -5952,7 +5953,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, { unsigned int maxvcpus = virDomainDefGetVcpusMax(vm->def); qemuDomainObjPrivate *priv = vm->privateData; - qemuCgroupEmulatorAllNodesData *emulatorCgroup = NULL; + virCgroupEmulatorAllNodesData *emulatorCgroup = NULL; virDomainVcpuDef *vcpu; qemuDomainVcpuPrivate *vcpupriv; size_t i; @@ -5980,7 +5981,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, qsort(bootHotplug, nbootHotplug, sizeof(*bootHotplug), qemuProcessVcpusSortOrder); - if (qemuCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) + if (virDomainCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) goto cleanup; for (i = 0; i < nbootHotplug; i++) { @@ -6004,7 +6005,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, ret = 0; cleanup: - qemuCgroupEmulatorAllNodesRestore(emulatorCgroup); + virDomainCgroupEmulatorAllNodesRestore(emulatorCgroup); return ret; } @@ -6994,7 +6995,7 @@ qemuProcessPrepareHost(virQEMUDriver *driver, /* Ensure no historical cgroup for this VM is lying around bogus * settings */ VIR_DEBUG("Ensuring no historical cgroup is lying around"); - qemuRemoveCgroup(vm); + virDomainCgroupRemoveCgroup(vm, priv->cgroup, priv->machineName); if (g_mkdir_with_parents(cfg->logDir, 0777) < 0) { virReportSystemError(errno, @@ -7603,7 +7604,7 @@ qemuProcessLaunch(virConnectPtr conn, goto cleanup; VIR_DEBUG("Setting global CPU cgroup (if required)"); - if (qemuSetupGlobalCpuCgroup(vm) < 0) + if (virDomainCgroupSetupGlobalCpuCgroup(vm, priv->cgroup, priv->autoNodeset) < 0) goto cleanup; VIR_DEBUG("Setting vCPU tuning/settings"); @@ -8202,7 +8203,7 @@ void qemuProcessStop(virQEMUDriver *driver, } retry: - if ((ret = qemuRemoveCgroup(vm)) < 0) { + if ((ret = virDomainCgroupRemoveCgroup(vm, priv->cgroup, priv->machineName)) < 0) { if (ret == -EBUSY && (retries++ < 5)) { g_usleep(200*1000); goto retry; @@ -8761,7 +8762,12 @@ qemuProcessReconnect(void *opaque) if (!priv->machineName) goto error; - if (qemuConnectCgroup(obj) < 0) + if (virDomainCgroupConnectCgroup("qemu", + obj, + priv->cgroup, + cfg->cgroupControllers, + priv->driver->privileged, + priv->machineName) < 0) goto error; if (qemuDomainPerfRestart(obj) < 0) -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
Refactor some cgroup management methods from qemu into hypervisor. These methods will be shared with ch driver for cgroup management.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 8 files changed, 580 insertions(+), 432 deletions(-)
diff --git a/src/hypervisor/domain_cgroup.c b/src/hypervisor/domain_cgroup.c index 61b54f071c..05c53ff7d1 100644 --- a/src/hypervisor/domain_cgroup.c +++ b/src/hypervisor/domain_cgroup.c @@ -22,11 +22,12 @@
#include "domain_cgroup.h" #include "domain_driver.h" - +#include "util/virnuma.h" +#include "virlog.h" #include "virutil.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN - +VIR_LOG_INIT("domain.cgroup");
int virDomainCgroupSetupBlkio(virCgroup *cgroup, virDomainBlkiotune blkio) @@ -269,3 +270,455 @@ virDomainCgroupSetMemoryLimitParameters(virCgroup *cgroup,
return 0; } + + +int +virDomainCgroupSetupBlkioCgroup(virDomainObj *vm, + virCgroup *cgroup) +{ +
No need to start a function with an empty line.
+ if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (vm->def->blkio.weight || vm->def->blkio.ndevices) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Block I/O tuning is not available on this host")); + return -1; + } else { + return 0;
Here and in the rest of the patch: the else branch is not necessary and in fact is not in the original qemu code.
+ } + } + + return virDomainCgroupSetupBlkio(cgroup, vm->def->blkio); +} + +
+ +int +virDomainCgroupInitCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup * cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName) +{ + if (!privileged) + return 0; + + if (!virCgroupAvailable()) + return 0; + + virCgroupFree(cgroup); + cgroup = NULL; +
This doesn't do what you think it does. This merely clears the local variable, but doesn't affect the variable in caller. Therefore, @cgroup really needs to be a double pointer. And this can then be just: g_clear_pointer(cgroup, virCgroupFree); Subsequently, virDomainCgroupSetupCgroup() will need to take a double pointer too.
+ if (!vm->def->resource) + vm->def->resource = g_new0(virDomainResourceDef, 1); + + if (!vm->def->resource->partition) + vm->def->resource->partition = g_strdup("/machine"); + + if (!g_path_is_absolute(vm->def->resource->partition)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + vm->def->resource->partition); + return -1; + } + + if (virCgroupNewMachine(machineName, + prefix, + vm->def->uuid, + NULL, + vm->pid, + false, + nnicindexes, nicindexes, + vm->def->resource->partition, + cgroupControllers, + maxThreadsPerProc, &cgroup) < 0) { + if (virCgroupNewIgnoreError()) + return 0; + + return -1; + } + + return 0; +} + +
+int +virDomainCgroupSetupCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup * cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName) +{ + if (!vm->pid) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot setup cgroups until process is started")); + return -1; + } + + if (virDomainCgroupInitCgroup(prefix, + vm, + nnicindexes, + nicindexes, + cgroup, + cgroupControllers, + maxThreadsPerProc, + privileged, + machineName) < 0) + return -1; + + if (!cgroup) + return 0; + + if (virDomainCgroupSetupBlkioCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupMemoryCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupCpuCgroup(vm, cgroup) < 0) + return -1; + + if (virDomainCgroupSetupCpusetCgroup(cgroup) < 0) + return -1;
Any reason why qemuSetupCgroup() doesn't call this function then? I mean, virDomainCgroupSetupCgroup() is a (strictly smaller) subset of qemuSetupCgroup() so the latter could call the former and then do those extra steps.
+ + return 0; +} +
diff --git a/src/hypervisor/domain_cgroup.h b/src/hypervisor/domain_cgroup.h index f93e5f74fe..dfe7107674 100644 --- a/src/hypervisor/domain_cgroup.h +++ b/src/hypervisor/domain_cgroup.h @@ -23,6 +23,11 @@ #include "vircgroup.h" #include "domain_conf.h"
+typedef struct _virCgroupEmulatorAllNodesData virCgroupEmulatorAllNodesData; +struct _virCgroupEmulatorAllNodesData { + virCgroup *emulatorCgroup; + char *emulatorMemMask; +};
int virDomainCgroupSetupBlkio(virCgroup *cgroup, virDomainBlkiotune blkio); int virDomainCgroupSetupMemtune(virCgroup *cgroup, virDomainMemtune mem); @@ -36,3 +41,70 @@ int virDomainCgroupSetMemoryLimitParameters(virCgroup *cgroup, virDomainDef *persistentDef, virTypedParameterPtr params, int nparams); +int +virDomainCgroupSetupBlkioCgroup(virDomainObj * vm,
Here and everywhere else: please do not write it like this. Write it together: *variable.
+ virCgroup *cgroup); +int +virDomainCgroupSetupMemoryCgroup(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupSetupCpusetCgroup(virCgroup *cgroup); +int +virDomainCgroupSetupCpuCgroup(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupInitCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup *cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName); +void +virDomainCgroupRestoreCgroupState(virDomainObj * vm, + virCgroup *cgroup); +int +virDomainCgroupConnectCgroup(const char *prefix, + virDomainObj * vm, + virCgroup *cgroup, + int cgroupControllers, + bool privileged, + char *machineName); +int +virDomainCgroupSetupCgroup(const char *prefix, + virDomainObj * vm, + size_t nnicindexes, + int *nicindexes, + virCgroup *cgroup, + int cgroupControllers, + unsigned int maxThreadsPerProc, + bool privileged, + char *machineName); +void +virDomainCgroupEmulatorAllNodesDataFree(virCgroupEmulatorAllNodesData * data); +int +virDomainCgroupEmulatorAllNodesAllow(virCgroup * cgroup, + virCgroupEmulatorAllNodesData ** retData); +void +virDomainCgroupEmulatorAllNodesRestore(virCgroupEmulatorAllNodesData * data); +int +virDomainCgroupSetupVcpuBW(virCgroup * cgroup, + unsigned long long period, + long long quota); +int +virDomainCgroupSetupCpusetCpus(virCgroup * cgroup, + virBitmap * cpumask); +int +virDomainCgroupSetupGlobalCpuCgroup(virDomainObj * vm, + virCgroup * cgroup, + virBitmap *autoNodeset); +int +virDomainCgroupRemoveCgroup(virDomainObj * vm, + virCgroup * cgroup, + char *machineName); +int +virDomainCgroupRestoreCgroupThread(virCgroup *cgroup, + virCgroupThreadName thread, + int id); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ba3462d849..bc6fa191bf 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1544,11 +1544,23 @@ virSetConnectStorage;
# hypervisor/domain_cgroup.h +virDomainCgroupConnectCgroup; +virDomainCgroupEmulatorAllNodesAllow; +virDomainCgroupEmulatorAllNodesRestore; +virDomainCgroupInitCgroup; +virDomainCgroupRemoveCgroup; virDomainCgroupSetMemoryLimitParameters; virDomainCgroupSetupBlkio; +virDomainCgroupSetupBlkioCgroup; +virDomainCgroupSetupCgroup; +virDomainCgroupSetupCpuCgroup; +virDomainCgroupSetupCpusetCgroup; +virDomainCgroupSetupCpusetCpus; virDomainCgroupSetupDomainBlkioParameters; +virDomainCgroupSetupGlobalCpuCgroup; +virDomainCgroupSetupMemoryCgroup; virDomainCgroupSetupMemtune; - +virDomainCgroupSetupVcpuBW;
# hypervisor/domain_driver.h virDomainDriverAddIOThreadCheck; diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 22a6f56cf9..40bd4e2f9d 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -593,46 +593,6 @@ qemuSetupVideoCgroup(virDomainObj *vm, return ret; }
- -static int -qemuSetupBlkioCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, - VIR_CGROUP_CONTROLLER_BLKIO)) { - if (vm->def->blkio.weight || vm->def->blkio.ndevices) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Block I/O tuning is not available on this host")); - return -1; - } - return 0; - } - - return virDomainCgroupSetupBlkio(priv->cgroup, vm->def->blkio); -} - - -static int -qemuSetupMemoryCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { - if (virMemoryLimitIsSet(vm->def->mem.hard_limit) || - virMemoryLimitIsSet(vm->def->mem.soft_limit) || - virMemoryLimitIsSet(vm->def->mem.swap_hard_limit)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Memory cgroup is not available on this host")); - return -1; - } - return 0; - } - - return virDomainCgroupSetupMemtune(priv->cgroup, vm->def->mem); -} - - static int qemuSetupFirmwareCgroup(virDomainObj *vm) { @@ -861,44 +821,6 @@ qemuSetupDevicesCgroup(virDomainObj *vm) }
-static int -qemuSetupCpusetCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - if (virCgroupSetCpusetMemoryMigrate(priv->cgroup, true) < 0) - return -1; - - return 0; -} - - -static int -qemuSetupCpuCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { - if (vm->def->cputune.sharesSpecified) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("CPU tuning is not available on this host")); - return -1; - } - return 0; - } - - if (vm->def->cputune.sharesSpecified) { - if (virCgroupSetCpuShares(priv->cgroup, vm->def->cputune.shares) < 0) - return -1; - } - - return 0; -} - - static int qemuSetupCgroupAppid(virDomainObj *vm) { @@ -927,166 +849,13 @@ qemuSetupCgroupAppid(virDomainObj *vm) }
-static int -qemuInitCgroup(virDomainObj *vm, - size_t nnicindexes, - int *nicindexes) -{ - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver); - - if (!priv->driver->privileged) - return 0; - - if (!virCgroupAvailable()) - return 0; - - virCgroupFree(priv->cgroup); - priv->cgroup = NULL; - - if (!vm->def->resource) - vm->def->resource = g_new0(virDomainResourceDef, 1); - - if (!vm->def->resource->partition) - vm->def->resource->partition = g_strdup("/machine"); - - if (!g_path_is_absolute(vm->def->resource->partition)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("Resource partition '%s' must start with '/'"), - vm->def->resource->partition); - return -1; - } - - if (virCgroupNewMachine(priv->machineName, - "qemu", - vm->def->uuid, - NULL, - vm->pid, - false, - nnicindexes, nicindexes, - vm->def->resource->partition, - cfg->cgroupControllers, - cfg->maxThreadsPerProc, - &priv->cgroup) < 0) { - if (virCgroupNewIgnoreError()) - return 0; - - return -1; - } - - return 0; -} - -static int -qemuRestoreCgroupThread(virCgroup *cgroup, - virCgroupThreadName thread, - int id) -{ - g_autoptr(virCgroup) cgroup_temp = NULL; - g_autofree char *nodeset = NULL; - - if (virCgroupNewThread(cgroup, thread, id, false, &cgroup_temp) < 0) - return -1; - - if (virCgroupSetCpusetMemoryMigrate(cgroup_temp, true) < 0) - return -1; - - if (virCgroupGetCpusetMems(cgroup_temp, &nodeset) < 0) - return -1; - - if (virCgroupSetCpusetMems(cgroup_temp, nodeset) < 0) - return -1; - - return 0; -} - -static void -qemuRestoreCgroupState(virDomainObj *vm) -{ - g_autofree char *mem_mask = NULL; - qemuDomainObjPrivate *priv = vm->privateData; - size_t i = 0; - g_autoptr(virBitmap) all_nodes = NULL; - - if (!virNumaIsAvailable() || - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return; - - if (!(all_nodes = virNumaGetHostMemoryNodeset())) - goto error; - - if (!(mem_mask = virBitmapFormat(all_nodes))) - goto error; - - if (virCgroupHasEmptyTasks(priv->cgroup, - VIR_CGROUP_CONTROLLER_CPUSET) <= 0) - goto error; - - if (virCgroupSetCpusetMems(priv->cgroup, mem_mask) < 0) - goto error; - - for (i = 0; i < virDomainDefGetVcpusMax(vm->def); i++) { - virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, i); - - if (!vcpu->online) - continue; - - if (qemuRestoreCgroupThread(priv->cgroup, - VIR_CGROUP_THREAD_VCPU, i) < 0) - return; - } - - for (i = 0; i < vm->def->niothreadids; i++) { - if (qemuRestoreCgroupThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, - vm->def->iothreadids[i]->iothread_id) < 0) - return; - } - - if (qemuRestoreCgroupThread(priv->cgroup, - VIR_CGROUP_THREAD_EMULATOR, 0) < 0) - return; - - return; - - error: - virResetLastError(); - VIR_DEBUG("Couldn't restore cgroups to meaningful state"); - return; -} - -int -qemuConnectCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver); - - if (!priv->driver->privileged) - return 0; - - if (!virCgroupAvailable()) - return 0; - - virCgroupFree(priv->cgroup); - priv->cgroup = NULL; - - if (virCgroupNewDetectMachine(vm->def->name, - "qemu", - vm->pid, - cfg->cgroupControllers, - priv->machineName, - &priv->cgroup) < 0) - return -1; - - qemuRestoreCgroupState(vm); - return 0; -} - int qemuSetupCgroup(virDomainObj *vm, size_t nnicindexes, int *nicindexes) { qemuDomainObjPrivate *priv = vm->privateData; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(priv->driver);
if (!vm->pid) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -1094,7 +863,15 @@ qemuSetupCgroup(virDomainObj *vm, return -1; }
- if (qemuInitCgroup(vm, nnicindexes, nicindexes) < 0) + if (virDomainCgroupInitCgroup("qemu", + vm, + nnicindexes, + nicindexes, + priv->cgroup, + cfg->cgroupControllers, + cfg->maxThreadsPerProc, + priv->driver->privileged, + priv->machineName) < 0) return -1;
if (!priv->cgroup) @@ -1103,16 +880,16 @@ qemuSetupCgroup(virDomainObj *vm, if (qemuSetupDevicesCgroup(vm) < 0) return -1;
- if (qemuSetupBlkioCgroup(vm) < 0) + if (virDomainCgroupSetupBlkioCgroup(vm, priv->cgroup) < 0) return -1;
- if (qemuSetupMemoryCgroup(vm) < 0) + if (virDomainCgroupSetupMemoryCgroup(vm, priv->cgroup) < 0) return -1;
- if (qemuSetupCpuCgroup(vm) < 0) + if (virDomainCgroupSetupCpuCgroup(vm, priv->cgroup) < 0) return -1;
- if (qemuSetupCpusetCgroup(vm) < 0) + if (virDomainCgroupSetupCpusetCgroup(priv->cgroup) < 0) return -1;
if (qemuSetupCgroupAppid(vm) < 0) @@ -1121,23 +898,6 @@ qemuSetupCgroup(virDomainObj *vm, return 0; }
-int -qemuSetupCgroupVcpuBW(virCgroup *cgroup, - unsigned long long period, - long long quota) -{ - return virCgroupSetupCpuPeriodQuota(cgroup, period, quota); -} - - -int -qemuSetupCgroupCpusetCpus(virCgroup *cgroup, - virBitmap *cpumask) -{ - return virCgroupSetupCpusetCpus(cgroup, cpumask); -} - - int qemuSetupCgroupForExtDevices(virDomainObj *vm, virQEMUDriver *driver) @@ -1164,148 +924,3 @@ qemuSetupCgroupForExtDevices(virDomainObj *vm,
return qemuExtDevicesSetupCgroup(driver, vm, cgroup_temp); } - - -int -qemuSetupGlobalCpuCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - unsigned long long period = vm->def->cputune.global_period; - long long quota = vm->def->cputune.global_quota; - g_autofree char *mem_mask = NULL; - virDomainNumatuneMemMode mem_mode; - - if ((period || quota) && - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("cgroup cpu is required for scheduler tuning")); - return -1; - } - - /* - * If CPU cgroup controller is not initialized here, then we need - * neither period nor quota settings. And if CPUSET controller is - * not initialized either, then there's nothing to do anyway. - */ - if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && - !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - - if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && - mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && - virDomainNumatuneMaybeFormatNodeset(vm->def->numa, - priv->autoNodeset, - &mem_mask, -1) < 0) - return -1; - - if (period || quota) { - if (qemuSetupCgroupVcpuBW(priv->cgroup, period, quota) < 0) - return -1; - } - - return 0; -} - - -int -qemuRemoveCgroup(virDomainObj *vm) -{ - qemuDomainObjPrivate *priv = vm->privateData; - - if (priv->cgroup == NULL) - return 0; /* Not supported, so claim success */ - - if (virCgroupTerminateMachine(priv->machineName) < 0) { - if (!virCgroupNewIgnoreError()) - VIR_DEBUG("Failed to terminate cgroup for %s", vm->def->name); - } - - return virCgroupRemove(priv->cgroup); -} - - -static void -qemuCgroupEmulatorAllNodesDataFree(qemuCgroupEmulatorAllNodesData *data) -{ - if (!data) - return; - - virCgroupFree(data->emulatorCgroup); - g_free(data->emulatorMemMask); - g_free(data); -} - - -/** - * qemuCgroupEmulatorAllNodesAllow: - * @cgroup: domain cgroup pointer - * @retData: filled with structure used to roll back the operation - * - * Allows all NUMA nodes for the qemu emulator thread temporarily. This is - * necessary when hotplugging cpus since it requires memory allocated in the - * DMA region. Afterwards the operation can be reverted by - * qemuCgroupEmulatorAllNodesRestore. - * - * Returns 0 on success -1 on error - */ -int -qemuCgroupEmulatorAllNodesAllow(virCgroup *cgroup, - qemuCgroupEmulatorAllNodesData **retData) -{ - qemuCgroupEmulatorAllNodesData *data = NULL; - g_autofree char *all_nodes_str = NULL; - g_autoptr(virBitmap) all_nodes = NULL; - int ret = -1; - - if (!virNumaIsAvailable() || - !virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) - return 0; - - if (!(all_nodes = virNumaGetHostMemoryNodeset())) - goto cleanup; - - if (!(all_nodes_str = virBitmapFormat(all_nodes))) - goto cleanup; - - data = g_new0(qemuCgroupEmulatorAllNodesData, 1); - - if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_EMULATOR, 0, - false, &data->emulatorCgroup) < 0) - goto cleanup; - - if (virCgroupGetCpusetMems(data->emulatorCgroup, &data->emulatorMemMask) < 0 || - virCgroupSetCpusetMems(data->emulatorCgroup, all_nodes_str) < 0) - goto cleanup; - - *retData = g_steal_pointer(&data); - ret = 0; - - cleanup: - qemuCgroupEmulatorAllNodesDataFree(data); - - return ret; -} - - -/** - * qemuCgroupEmulatorAllNodesRestore: - * @data: data structure created by qemuCgroupEmulatorAllNodesAllow - * - * Rolls back the setting done by qemuCgroupEmulatorAllNodesAllow and frees the - * associated data. - */ -void -qemuCgroupEmulatorAllNodesRestore(qemuCgroupEmulatorAllNodesData *data) -{ - virErrorPtr err; - - if (!data) - return; - - virErrorPreserveLast(&err); - virCgroupSetCpusetMems(data->emulatorCgroup, data->emulatorMemMask); - virErrorRestore(&err); - - qemuCgroupEmulatorAllNodesDataFree(data); -} diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index cd537ebd82..f09134947f 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -56,18 +56,11 @@ int qemuSetupChardevCgroup(virDomainObj *vm, virDomainChrDef *dev); int qemuTeardownChardevCgroup(virDomainObj *vm, virDomainChrDef *dev); -int qemuConnectCgroup(virDomainObj *vm); int qemuSetupCgroup(virDomainObj *vm, size_t nnicindexes, int *nicindexes); -int qemuSetupCgroupVcpuBW(virCgroup *cgroup, - unsigned long long period, - long long quota); -int qemuSetupCgroupCpusetCpus(virCgroup *cgroup, virBitmap *cpumask); -int qemuSetupGlobalCpuCgroup(virDomainObj *vm); int qemuSetupCgroupForExtDevices(virDomainObj *vm, virQEMUDriver *driver); -int qemuRemoveCgroup(virDomainObj *vm);
typedef struct _qemuCgroupEmulatorAllNodesData qemuCgroupEmulatorAllNodesData; struct _qemuCgroupEmulatorAllNodesData { @@ -75,8 +68,4 @@ struct _qemuCgroupEmulatorAllNodesData { char *emulatorMemMask; };
-int qemuCgroupEmulatorAllNodesAllow(virCgroup *cgroup, - qemuCgroupEmulatorAllNodesData **data); -void qemuCgroupEmulatorAllNodesRestore(qemuCgroupEmulatorAllNodesData *data); - extern const char *const defaultDeviceACL[]; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a1ba74e65..1141efef4b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4419,7 +4419,7 @@ qemuDomainPinVcpuLive(virDomainObj *vm, if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, vcpu, false, &cgroup_vcpu) < 0) goto cleanup; - if (qemuSetupCgroupCpusetCpus(cgroup_vcpu, cpumap) < 0) + if (virDomainCgroupSetupCpusetCpus(cgroup_vcpu, cpumap) < 0) goto cleanup; }
@@ -4628,7 +4628,7 @@ qemuDomainPinEmulator(virDomainPtr dom, 0, false, &cgroup_emulator) < 0) goto endjob;
- if (qemuSetupCgroupCpusetCpus(cgroup_emulator, pcpumap) < 0) { + if (virDomainCgroupSetupCpusetCpus(cgroup_emulator, pcpumap) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("failed to set cpuset.cpus in cgroup" " for emulator threads")); @@ -5025,7 +5025,7 @@ qemuDomainPinIOThread(virDomainPtr dom, if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, iothread_id, false, &cgroup_iothread) < 0) goto endjob; - if (qemuSetupCgroupCpusetCpus(cgroup_iothread, pcpumap) < 0) { + if (virDomainCgroupSetupCpusetCpus(cgroup_iothread, pcpumap) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, _("failed to set cpuset.cpus in cgroup" " for iothread %d"), iothread_id); @@ -8925,7 +8925,7 @@ qemuSetGlobalBWLive(virCgroup *cgroup, unsigned long long period, if (period == 0 && quota == 0) return 0;
- if (qemuSetupCgroupVcpuBW(cgroup, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) return -1;
return 0; @@ -9120,7 +9120,7 @@ qemuSetVcpusBWLive(virDomainObj *vm, virCgroup *cgroup, false, &cgroup_vcpu) < 0) return -1;
- if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_vcpu, period, quota) < 0) return -1; }
@@ -9141,7 +9141,7 @@ qemuSetEmulatorBandwidthLive(virCgroup *cgroup, false, &cgroup_emulator) < 0) return -1;
- if (qemuSetupCgroupVcpuBW(cgroup_emulator, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_emulator, period, quota) < 0) return -1;
return 0; @@ -9168,7 +9168,7 @@ qemuSetIOThreadsBWLive(virDomainObj *vm, virCgroup *cgroup, false, &cgroup_iothread) < 0) return -1;
- if (qemuSetupCgroupVcpuBW(cgroup_iothread, period, quota) < 0) + if (virDomainCgroupSetupVcpuBW(cgroup_iothread, period, quota) < 0) return -1; }
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f36de2385a..da7143eba2 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -37,6 +37,7 @@ #include "qemu_snapshot.h" #include "qemu_virtiofs.h" #include "domain_audit.h" +#include "domain_cgroup.h" #include "netdev_bandwidth_conf.h" #include "domain_nwfilter.h" #include "virlog.h" @@ -6527,11 +6528,11 @@ qemuDomainSetVcpusLive(virQEMUDriver *driver, bool enable) { qemuDomainObjPrivate *priv = vm->privateData; - qemuCgroupEmulatorAllNodesData *emulatorCgroup = NULL; + virCgroupEmulatorAllNodesData *emulatorCgroup = NULL; ssize_t nextvcpu = -1; int ret = -1;
- if (qemuCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) + if (virDomainCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) goto cleanup;
if (enable) { @@ -6552,7 +6553,7 @@ qemuDomainSetVcpusLive(virQEMUDriver *driver, ret = 0;
cleanup: - qemuCgroupEmulatorAllNodesRestore(emulatorCgroup); + virDomainCgroupEmulatorAllNodesRestore(emulatorCgroup);
return ret; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 336f0bab2e..1673c05497 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -73,6 +73,7 @@ #include "virpidfile.h" #include "virhostcpu.h" #include "domain_audit.h" +#include "domain_cgroup.h" #include "domain_nwfilter.h" #include "domain_validate.h" #include "locking/domain_lock.h" @@ -2686,7 +2687,7 @@ qemuProcessSetupPid(virDomainObj *vm,
if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { if (use_cpumask && - qemuSetupCgroupCpusetCpus(cgroup, use_cpumask) < 0) + virDomainCgroupSetupCpusetCpus(cgroup, use_cpumask) < 0) goto cleanup;
if (mem_mask && virCgroupSetCpusetMems(cgroup, mem_mask) < 0) @@ -2695,7 +2696,7 @@ qemuProcessSetupPid(virDomainObj *vm, }
if ((period || quota) && - qemuSetupCgroupVcpuBW(cgroup, period, quota) < 0) + virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) goto cleanup;
/* Move the thread to the sub dir */ @@ -5952,7 +5953,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, { unsigned int maxvcpus = virDomainDefGetVcpusMax(vm->def); qemuDomainObjPrivate *priv = vm->privateData; - qemuCgroupEmulatorAllNodesData *emulatorCgroup = NULL; + virCgroupEmulatorAllNodesData *emulatorCgroup = NULL; virDomainVcpuDef *vcpu; qemuDomainVcpuPrivate *vcpupriv; size_t i; @@ -5980,7 +5981,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, qsort(bootHotplug, nbootHotplug, sizeof(*bootHotplug), qemuProcessVcpusSortOrder);
- if (qemuCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) + if (virDomainCgroupEmulatorAllNodesAllow(priv->cgroup, &emulatorCgroup) < 0) goto cleanup;
for (i = 0; i < nbootHotplug; i++) { @@ -6004,7 +6005,7 @@ qemuProcessSetupHotpluggableVcpus(virQEMUDriver *driver, ret = 0;
cleanup: - qemuCgroupEmulatorAllNodesRestore(emulatorCgroup); + virDomainCgroupEmulatorAllNodesRestore(emulatorCgroup); return ret; }
@@ -6994,7 +6995,7 @@ qemuProcessPrepareHost(virQEMUDriver *driver, /* Ensure no historical cgroup for this VM is lying around bogus * settings */ VIR_DEBUG("Ensuring no historical cgroup is lying around"); - qemuRemoveCgroup(vm); + virDomainCgroupRemoveCgroup(vm, priv->cgroup, priv->machineName);
if (g_mkdir_with_parents(cfg->logDir, 0777) < 0) { virReportSystemError(errno, @@ -7603,7 +7604,7 @@ qemuProcessLaunch(virConnectPtr conn, goto cleanup;
VIR_DEBUG("Setting global CPU cgroup (if required)"); - if (qemuSetupGlobalCpuCgroup(vm) < 0) + if (virDomainCgroupSetupGlobalCpuCgroup(vm, priv->cgroup, priv->autoNodeset) < 0) goto cleanup;
VIR_DEBUG("Setting vCPU tuning/settings"); @@ -8202,7 +8203,7 @@ void qemuProcessStop(virQEMUDriver *driver, }
retry: - if ((ret = qemuRemoveCgroup(vm)) < 0) { + if ((ret = virDomainCgroupRemoveCgroup(vm, priv->cgroup, priv->machineName)) < 0) { if (ret == -EBUSY && (retries++ < 5)) { g_usleep(200*1000); goto retry; @@ -8761,7 +8762,12 @@ qemuProcessReconnect(void *opaque) if (!priv->machineName) goto error;
- if (qemuConnectCgroup(obj) < 0) + if (virDomainCgroupConnectCgroup("qemu", + obj, + priv->cgroup, + cfg->cgroupControllers, + priv->driver->privileged, + priv->machineName) < 0) goto error;
if (qemuDomainPerfRestart(obj) < 0)

From: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 34 +++++ src/ch/ch_domain.h | 11 +- src/ch/ch_monitor.c | 96 ++++++++++++++ src/ch/ch_monitor.h | 54 +++++++- src/ch/ch_process.c | 308 ++++++++++++++++++++++++++++++++++++++++++-- src/ch/ch_process.h | 3 + 8 files changed, 492 insertions(+), 20 deletions(-) diff --git a/src/ch/ch_conf.c b/src/ch/ch_conf.c index 98f1e89003..be12934dcd 100644 --- a/src/ch/ch_conf.c +++ b/src/ch/ch_conf.c @@ -125,6 +125,8 @@ virCHDriverConfigNew(bool privileged) if (!(cfg = virObjectNew(virCHDriverConfigClass))) return NULL; + cfg->cgroupControllers = -1; /* Auto detect */ + if (privileged) { if (virGetUserID(CH_USER, &cfg->user) < 0) return NULL; diff --git a/src/ch/ch_conf.h b/src/ch/ch_conf.h index 8fe69c8545..1790295ede 100644 --- a/src/ch/ch_conf.h +++ b/src/ch/ch_conf.h @@ -35,11 +35,13 @@ struct _virCHDriverConfig { char *stateDir; char *logDir; - + int cgroupControllers; uid_t user; gid_t group; }; +G_DEFINE_AUTOPTR_CLEANUP_FUNC(virCHDriverConfig, virObjectUnref); + struct _virCHDriver { virMutex lock; diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c index a746d0f5fd..6f0cec8c6e 100644 --- a/src/ch/ch_domain.c +++ b/src/ch/ch_domain.c @@ -319,6 +319,40 @@ chValidateDomainDeviceDef(const virDomainDeviceDef *dev, _("Serial can only be enabled for a PTY")); return -1; } + return 0; +} + +int +virCHDomainRefreshThreadInfo(virDomainObj *vm) +{ + size_t maxvcpus = virDomainDefGetVcpusMax(vm->def); + virCHMonitorThreadInfo *info = NULL; + size_t nthreads, ncpus = 0; + size_t i; + + nthreads = virCHMonitorGetThreadInfo(virCHDomainGetMonitor(vm), + true, &info); + + for (i = 0; i < nthreads; i++) { + virCHDomainVcpuPrivate *vcpupriv; + virDomainVcpuDef *vcpu; + virCHMonitorCPUInfo *vcpuInfo; + + if (info[i].type != virCHThreadTypeVcpu) + continue; + + /* TODO: hotplug support */ + vcpuInfo = &info[i].vcpuInfo; + vcpu = virDomainDefGetVcpu(vm->def, vcpuInfo->cpuid); + vcpupriv = CH_DOMAIN_VCPU_PRIVATE(vcpu); + vcpupriv->tid = vcpuInfo->tid; + ncpus++; + } + + /* TODO: Remove the warning when hotplug is implemented.*/ + if (ncpus != maxvcpus) + VIR_WARN("Mismatch in the number of cpus, expected: %ld, actual: %ld", + maxvcpus, ncpus); return 0; } diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h index 4d0b5479b8..cb94905b94 100644 --- a/src/ch/ch_domain.h +++ b/src/ch/ch_domain.h @@ -53,11 +53,13 @@ typedef struct _virCHDomainObjPrivate virCHDomainObjPrivate; struct _virCHDomainObjPrivate { struct virCHDomainJobObj job; - virChrdevs *chrdevs; - virCHDriver *driver; - virCHMonitor *monitor; char *machineName; virBitmap *autoCpuset; + virBitmap *autoNodeset; + virCHDriver *driver; + virCHMonitor *monitor; + virCgroup *cgroup; + virChrdevs *chrdevs; }; #define CH_DOMAIN_PRIVATE(vm) \ @@ -87,7 +89,8 @@ void virCHDomainObjEndJob(virDomainObj *obj); int -virCHDomainRefreshVcpuInfo(virDomainObj *vm); +virCHDomainRefreshThreadInfo(virDomainObj *vm); + pid_t virCHDomainGetVcpuPid(virDomainObj *vm, unsigned int vcpuid); diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index a19f0c7e33..d984bf9822 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -41,6 +41,7 @@ VIR_LOG_INIT("ch.ch_monitor"); static virClass *virCHMonitorClass; static void virCHMonitorDispose(void *obj); +static void virCHMonitorThreadInfoFree(virCHMonitor * mon); static int virCHMonitorOnceInit(void) { @@ -578,6 +579,7 @@ static void virCHMonitorDispose(void *opaque) virCHMonitor *mon = opaque; VIR_DEBUG("mon=%p", mon); + virCHMonitorThreadInfoFree(mon); virObjectUnref(mon->vm); } @@ -743,6 +745,100 @@ virCHMonitorGet(virCHMonitor *mon, const char *endpoint, virJSONValue **response return ret; } +static void +virCHMonitorThreadInfoFree(virCHMonitor *mon) +{ + mon->nthreads = 0; + if (mon->threads) + VIR_FREE(mon->threads); +} + +static size_t +virCHMonitorRefreshThreadInfo(virCHMonitor *mon) +{ + virCHMonitorThreadInfo *info = NULL; + g_autofree pid_t *tids = NULL; + virDomainObj *vm = mon->vm; + size_t ntids = 0; + size_t i; + + + virCHMonitorThreadInfoFree(mon); + if (virProcessGetPids(vm->pid, &ntids, &tids) < 0) { + mon->threads = NULL; + return 0; + } + + info = g_new0(virCHMonitorThreadInfo, ntids); + for (i = 0; i < ntids; i++) { + g_autofree char *proc = NULL; + g_autofree char *data = NULL; + + proc = g_strdup_printf("/proc/%d/task/%d/comm", + (int)vm->pid, (int)tids[i]); + + if (virFileReadAll(proc, (1 << 16), &data) < 0) { + continue; + } + + VIR_DEBUG("VM PID: %d, TID %d, COMM: %s", + (int)vm->pid, (int)tids[i], data); + if (STRPREFIX(data, "vcpu")) { + int cpuid; + char *tmp; + + if (virStrToLong_i(data + 4, &tmp, 0, &cpuid) < 0) { + VIR_WARN("Index is not specified correctly"); + continue; + } + info[i].type = virCHThreadTypeVcpu; + info[i].vcpuInfo.tid = tids[i]; + info[i].vcpuInfo.online = true; + info[i].vcpuInfo.cpuid = cpuid; + VIR_DEBUG("vcpu%d -> tid: %d", cpuid, tids[i]); + } else if (STRPREFIX(data, "_disk") || STRPREFIX(data, "_net") || + STRPREFIX(data, "_rng")) { + /* Prefixes used by cloud-hypervisor for IO Threads are captured at + * https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/vmm/src/devic... */ + info[i].type = virCHThreadTypeIO; + info[i].ioInfo.tid = tids[i]; + virStrcpy(info[i].ioInfo.thrName, data, VIRCH_THREAD_NAME_LEN - 1); + } else { + info[i].type = virCHThreadTypeEmulator; + info[i].emuInfo.tid = tids[i]; + virStrcpy(info[i].emuInfo.thrName, data, VIRCH_THREAD_NAME_LEN - 1); + } + mon->nthreads++; + + } + mon->threads = info; + + return mon->nthreads; +} + +/** + * virCHMonitorGetThreadInfo: + * @mon: Pointer to the monitor + * @refresh: Refresh thread info or not + * + * Retrive thread info and store to @threads + * + * Returns count of threads on success. + */ +size_t +virCHMonitorGetThreadInfo(virCHMonitor *mon, bool refresh, + virCHMonitorThreadInfo **threads) +{ + int nthreads = 0; + + if (refresh) + nthreads = virCHMonitorRefreshThreadInfo(mon); + + *threads = mon->threads; + + return nthreads; +} + int virCHMonitorShutdownVMM(virCHMonitor *mon) { diff --git a/src/ch/ch_monitor.h b/src/ch/ch_monitor.h index f3b6978366..6646316454 100644 --- a/src/ch/ch_monitor.h +++ b/src/ch/ch_monitor.h @@ -37,6 +37,50 @@ #define URL_VM_RESUME "vm.resume" #define URL_VM_INFO "vm.info" +#define VIRCH_THREAD_NAME_LEN 16 + +typedef enum { + virCHThreadTypeEmulator, + virCHThreadTypeVcpu, + virCHThreadTypeIO, + virCHThreadTypeMax +} virCHThreadType; + +typedef struct _virCHMonitorCPUInfo virCHMonitorCPUInfo; + +struct _virCHMonitorCPUInfo { + int cpuid; + pid_t tid; + + bool online; +}; + +typedef struct _virCHMonitorEmuThreadInfo virCHMonitorEmuThreadInfo; + +struct _virCHMonitorEmuThreadInfo { + char thrName[VIRCH_THREAD_NAME_LEN]; + pid_t tid; +}; + +typedef struct _virCHMonitorIOThreadInfo virCHMonitorIOThreadInfo; + +struct _virCHMonitorIOThreadInfo { + char thrName[VIRCH_THREAD_NAME_LEN]; + pid_t tid; +}; + +typedef struct _virCHMonitorThreadInfo virCHMonitorThreadInfo; + +struct _virCHMonitorThreadInfo { + virCHThreadType type; + + union { + virCHMonitorCPUInfo vcpuInfo; + virCHMonitorEmuThreadInfo emuInfo; + virCHMonitorIOThreadInfo ioInfo; + }; +}; + typedef struct _virCHMonitor virCHMonitor; struct _virCHMonitor { @@ -49,6 +93,9 @@ struct _virCHMonitor { pid_t pid; virDomainObj *vm; + + size_t nthreads; + virCHMonitorThreadInfo *threads; }; virCHMonitor *virCHMonitorNew(virDomainObj *vm, const char *socketdir); @@ -66,12 +113,9 @@ int virCHMonitorSuspendVM(virCHMonitor *mon); int virCHMonitorResumeVM(virCHMonitor *mon); int virCHMonitorGetInfo(virCHMonitor *mon, virJSONValue **info); -typedef struct _virCHMonitorCPUInfo virCHMonitorCPUInfo; -struct _virCHMonitorCPUInfo { - pid_t tid; - bool online; -}; void virCHMonitorCPUInfoFree(virCHMonitorCPUInfo *cpus); int virCHMonitorGetCPUInfo(virCHMonitor *mon, virCHMonitorCPUInfo **vcpus, size_t maxvcpus); +size_t virCHMonitorGetThreadInfo(virCHMonitor *mon, bool refresh, + virCHMonitorThreadInfo **threads); diff --git a/src/ch/ch_process.c b/src/ch/ch_process.c index 49976d769e..1a0730a4d1 100644 --- a/src/ch/ch_process.c +++ b/src/ch/ch_process.c @@ -26,6 +26,8 @@ #include "ch_domain.h" #include "ch_monitor.h" #include "ch_process.h" +#include "domain_cgroup.h" +#include "virnuma.h" #include "viralloc.h" #include "virerror.h" #include "virjson.h" @@ -131,6 +133,254 @@ virCHProcessUpdateInfo(virDomainObj *vm) return 0; } +static int +virCHProcessGetAllCpuAffinity(virBitmap **cpumapRet) +{ + *cpumapRet = NULL; + + if (!virHostCPUHasBitmap()) + return 0; + + if (!(*cpumapRet = virHostCPUGetOnlineBitmap())) + return -1; + + return 0; +} + +#if defined(WITH_SCHED_GETAFFINITY) || defined(WITH_BSD_CPU_AFFINITY) +static int +virCHProcessInitCpuAffinity(virDomainObj *vm) +{ + g_autoptr(virBitmap) cpumapToSet = NULL; + virDomainNumatuneMemMode mem_mode; + virCHDomainObjPrivate *priv = vm->privateData; + + if (!vm->pid) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot setup CPU affinity until process is started")); + return -1; + } + + if (virDomainNumaGetNodeCount(vm->def->numa) <= 1 && + virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT) { + virBitmap *nodeset = NULL; + + if (virDomainNumatuneMaybeGetNodeset(vm->def->numa, + priv->autoNodeset, + &nodeset, -1) < 0) + return -1; + + if (virNumaNodesetToCPUset(nodeset, &cpumapToSet) < 0) + return -1; + } else if (vm->def->cputune.emulatorpin) { + if (!(cpumapToSet = virBitmapNewCopy(vm->def->cputune.emulatorpin))) + return -1; + } else { + if (virCHProcessGetAllCpuAffinity(&cpumapToSet) < 0) + return -1; + } + + if (cpumapToSet && virProcessSetAffinity(vm->pid, cpumapToSet, false) < 0) { + return -1; + } + + return 0; +} +#else /* !defined(WITH_SCHED_GETAFFINITY) && !defined(WITH_BSD_CPU_AFFINITY) */ +static int +virCHProcessInitCpuAffinity(virDomainObj *vm G_GNUC_UNUSED) +{ + return 0; +} +#endif /* !defined(WITH_SCHED_GETAFFINITY) && !defined(WITH_BSD_CPU_AFFINITY) */ + +/** + * virCHProcessSetupPid: + * + * This function sets resource properties (affinity, cgroups, + * scheduler) for any PID associated with a domain. It should be used + * to set up emulator PIDs as well as vCPU and I/O thread pids to + * ensure they are all handled the same way. + * + * Returns 0 on success, -1 on error. + */ +static int +virCHProcessSetupPid(virDomainObj *vm, + pid_t pid, + virCgroupThreadName nameval, + int id, + virBitmap *cpumask, + unsigned long long period, + long long quota, + virDomainThreadSchedParam *sched) +{ + virCHDomainObjPrivate *priv = vm->privateData; + virDomainNumatuneMemMode mem_mode; + virCgroup *cgroup = NULL; + virBitmap *use_cpumask = NULL; + virBitmap *affinity_cpumask = NULL; + g_autoptr(virBitmap) hostcpumap = NULL; + g_autofree char *mem_mask = NULL; + int ret = -1; + + if ((period || quota) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + goto cleanup; + } + + /* Infer which cpumask shall be used. */ + if (cpumask) { + use_cpumask = cpumask; + } else if (vm->def->placement_mode == VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO) { + use_cpumask = priv->autoCpuset; + } else if (vm->def->cpumask) { + use_cpumask = vm->def->cpumask; + } else { + /* we can't assume cloud-hypervisor itself is running on all pCPUs, + * so we need to explicitly set the spawned instance to all pCPUs. */ + if (virCHProcessGetAllCpuAffinity(&hostcpumap) < 0) + goto cleanup; + affinity_cpumask = hostcpumap; + } + + /* + * If CPU cgroup controller is not initialized here, then we need + * neither period nor quota settings. And if CPUSET controller is + * not initialized either, then there's nothing to do anyway. + */ + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) || + virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && + virDomainNumatuneMaybeFormatNodeset(vm->def->numa, + priv->autoNodeset, + &mem_mask, -1) < 0) + goto cleanup; + + if (virCgroupNewThread(priv->cgroup, nameval, id, true, &cgroup) < 0) + goto cleanup; + + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (use_cpumask && + virDomainCgroupSetupCpusetCpus(cgroup, use_cpumask) < 0) + goto cleanup; + + if (mem_mask && virCgroupSetCpusetMems(cgroup, mem_mask) < 0) + goto cleanup; + + } + + if ((period || quota) && + virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0) + goto cleanup; + + /* Move the thread to the sub dir */ + VIR_INFO("Adding pid %d to cgroup", pid); + if (virCgroupAddThread(cgroup, pid) < 0) + goto cleanup; + + } + + if (!affinity_cpumask) + affinity_cpumask = use_cpumask; + + /* Setup legacy affinity. */ + if (affinity_cpumask + && virProcessSetAffinity(pid, affinity_cpumask, false) < 0) + goto cleanup; + + /* Set scheduler type and priority, but not for the main thread. */ + if (sched && + nameval != VIR_CGROUP_THREAD_EMULATOR && + virProcessSetScheduler(pid, sched->policy, sched->priority) < 0) + goto cleanup; + + ret = 0; + cleanup: + if (cgroup) { + if (ret < 0) + virCgroupRemove(cgroup); + virCgroupFree(cgroup); + } + + return ret; +} + +/** + * virCHProcessSetupVcpu: + * @vm: domain object + * @vcpuid: id of VCPU to set defaults + * + * This function sets resource properties (cgroups, affinity, scheduler) for a + * vCPU. This function expects that the vCPU is online and the vCPU pids were + * correctly detected at the point when it's called. + * + * Returns 0 on success, -1 on error. + */ +int +virCHProcessSetupVcpu(virDomainObj *vm, unsigned int vcpuid) +{ + pid_t vcpupid = virCHDomainGetVcpuPid(vm, vcpuid); + virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, vcpuid); + + return virCHProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, + vcpuid, vcpu->cpumask, + vm->def->cputune.period, + vm->def->cputune.quota, &vcpu->sched); +} + +static int +virCHProcessSetupVcpus(virDomainObj *vm) +{ + virDomainVcpuDef *vcpu; + unsigned int maxvcpus = virDomainDefGetVcpusMax(vm->def); + size_t i; + + if ((vm->def->cputune.period || vm->def->cputune.quota) && + !virCgroupHasController(((virCHDomainObjPrivate *) vm->privateData)-> + cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + return -1; + } + + if (!virCHDomainHasVcpuPids(vm)) { + /* If any CPU has custom affinity that differs from the + * VM default affinity, we must reject it */ + for (i = 0; i < maxvcpus; i++) { + vcpu = virDomainDefGetVcpu(vm->def, i); + + if (!vcpu->online) + continue; + + if (vcpu->cpumask && + !virBitmapEqual(vm->def->cpumask, vcpu->cpumask)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("cpu affinity is not supported")); + return -1; + } + } + + return 0; + } + + for (i = 0; i < maxvcpus; i++) { + vcpu = virDomainDefGetVcpu(vm->def, i); + + if (!vcpu->online) + continue; + + if (virCHProcessSetupVcpu(vm, i) < 0) + return -1; + } + + return 0; +} + /** * virCHProcessStart: * @driver: pointer to driver structure @@ -141,12 +391,13 @@ virCHProcessUpdateInfo(virDomainObj *vm) * * Returns 0 on success or -1 in case of error */ -int virCHProcessStart(virCHDriver *driver, - virDomainObj *vm, - virDomainRunningReason reason) +int +virCHProcessStart(virCHDriver *driver, + virDomainObj *vm, virDomainRunningReason reason) { int ret = -1; virCHDomainObjPrivate *priv = vm->privateData; + g_autoptr(virCHDriverConfig) cfg = virCHDriverGetConfig(priv->driver); g_autofree int *nicindexes = NULL; size_t nnicindexes = 0; @@ -166,18 +417,41 @@ int virCHProcessStart(virCHDriver *driver, } } + vm->pid = priv->monitor->pid; + vm->def->id = vm->pid; + priv->machineName = virCHDomainGetMachineName(vm); + + if (virDomainCgroupSetupCgroup("ch", vm, + nnicindexes, nicindexes, + priv->cgroup, + cfg->cgroupControllers, + 0, /*maxThreadsPerProc*/ + priv->driver->privileged, + priv->machineName) < 0) + goto cleanup; + + if (virCHProcessInitCpuAffinity(vm) < 0) + goto cleanup; + if (virCHMonitorBootVM(priv->monitor) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to boot guest VM")); goto cleanup; } - priv->machineName = virCHDomainGetMachineName(vm); - vm->pid = priv->monitor->pid; - vm->def->id = vm->pid; + virCHDomainRefreshThreadInfo(vm); - virCHProcessUpdateInfo(vm); + VIR_DEBUG("Setting global CPU cgroup (if required)"); + if (virDomainCgroupSetupGlobalCpuCgroup(vm, + priv->cgroup, + priv->autoNodeset) < 0) + goto cleanup; + + VIR_DEBUG("Setting vCPU tuning/settings"); + if (virCHProcessSetupVcpus(vm) < 0) + goto cleanup; + virCHProcessUpdateInfo(vm); virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason); return 0; @@ -189,10 +463,12 @@ int virCHProcessStart(virCHDriver *driver, return ret; } -int virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED, - virDomainObj *vm, - virDomainShutoffReason reason) +int +virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED, + virDomainObj *vm, virDomainShutoffReason reason) { + int ret; + int retries = 0; virCHDomainObjPrivate *priv = vm->privateData; VIR_DEBUG("Stopping VM name=%s pid=%d reason=%d", @@ -203,6 +479,18 @@ int virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED, priv->monitor = NULL; } + retry: + if ((ret = virDomainCgroupRemoveCgroup(vm, + priv->cgroup, + priv->machineName)) < 0) { + if (ret == -EBUSY && (retries++ < 5)) { + g_usleep(200*1000); + goto retry; + } + VIR_WARN("Failed to remove cgroup for %s", + vm->def->name); + } + vm->pid = -1; vm->def->id = -1; g_clear_pointer(&priv->machineName, g_free); diff --git a/src/ch/ch_process.h b/src/ch/ch_process.h index abc4915979..800e3f4e23 100644 --- a/src/ch/ch_process.h +++ b/src/ch/ch_process.h @@ -29,3 +29,6 @@ int virCHProcessStart(virCHDriver *driver, int virCHProcessStop(virCHDriver *driver, virDomainObj *vm, virDomainShutoffReason reason); + +int virCHProcessSetupVcpu(virDomainObj *vm, + unsigned int vcpuid); -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
From: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 34 +++++ src/ch/ch_domain.h | 11 +- src/ch/ch_monitor.c | 96 ++++++++++++++ src/ch/ch_monitor.h | 54 +++++++- src/ch/ch_process.c | 308 ++++++++++++++++++++++++++++++++++++++++++-- src/ch/ch_process.h | 3 + 8 files changed, 492 insertions(+), 20 deletions(-)
diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c index a746d0f5fd..6f0cec8c6e 100644 --- a/src/ch/ch_domain.c +++ b/src/ch/ch_domain.c @@ -319,6 +319,40 @@ chValidateDomainDeviceDef(const virDomainDeviceDef *dev, _("Serial can only be enabled for a PTY")); return -1; } + return 0; +} + +int +virCHDomainRefreshThreadInfo(virDomainObj *vm) +{ + size_t maxvcpus = virDomainDefGetVcpusMax(vm->def); + virCHMonitorThreadInfo *info = NULL; + size_t nthreads, ncpus = 0;
We like one variable per line more, because it allows smaller diffs should this area of code be ever changed.
+ size_t i; + + nthreads = virCHMonitorGetThreadInfo(virCHDomainGetMonitor(vm), + true, &info); + + for (i = 0; i < nthreads; i++) { + virCHDomainVcpuPrivate *vcpupriv; + virDomainVcpuDef *vcpu; + virCHMonitorCPUInfo *vcpuInfo; + + if (info[i].type != virCHThreadTypeVcpu) + continue; + + /* TODO: hotplug support */ + vcpuInfo = &info[i].vcpuInfo; + vcpu = virDomainDefGetVcpu(vm->def, vcpuInfo->cpuid); + vcpupriv = CH_DOMAIN_VCPU_PRIVATE(vcpu); + vcpupriv->tid = vcpuInfo->tid; + ncpus++; + } + + /* TODO: Remove the warning when hotplug is implemented.*/ + if (ncpus != maxvcpus) + VIR_WARN("Mismatch in the number of cpus, expected: %ld, actual: %ld", + maxvcpus, ncpus);
return 0; } diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h index 4d0b5479b8..cb94905b94 100644 --- a/src/ch/ch_domain.h +++ b/src/ch/ch_domain.h @@ -53,11 +53,13 @@ typedef struct _virCHDomainObjPrivate virCHDomainObjPrivate; struct _virCHDomainObjPrivate { struct virCHDomainJobObj job;
- virChrdevs *chrdevs; - virCHDriver *driver; - virCHMonitor *monitor; char *machineName; virBitmap *autoCpuset; + virBitmap *autoNodeset; + virCHDriver *driver; + virCHMonitor *monitor; + virCgroup *cgroup; + virChrdevs *chrdevs;
Looks like you can't make up your mind about the order. I remember seeing the order of the members being changed over and over. Any reason for that?
};
#define CH_DOMAIN_PRIVATE(vm) \ @@ -87,7 +89,8 @@ void virCHDomainObjEndJob(virDomainObj *obj);
int -virCHDomainRefreshVcpuInfo(virDomainObj *vm); +virCHDomainRefreshThreadInfo(virDomainObj *vm); + pid_t virCHDomainGetVcpuPid(virDomainObj *vm, unsigned int vcpuid); diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index a19f0c7e33..d984bf9822 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -41,6 +41,7 @@ VIR_LOG_INIT("ch.ch_monitor");
static virClass *virCHMonitorClass; static void virCHMonitorDispose(void *obj); +static void virCHMonitorThreadInfoFree(virCHMonitor * mon);
static int virCHMonitorOnceInit(void) { @@ -578,6 +579,7 @@ static void virCHMonitorDispose(void *opaque) virCHMonitor *mon = opaque;
VIR_DEBUG("mon=%p", mon); + virCHMonitorThreadInfoFree(mon); virObjectUnref(mon->vm); }
@@ -743,6 +745,100 @@ virCHMonitorGet(virCHMonitor *mon, const char *endpoint, virJSONValue **response return ret; }
+static void +virCHMonitorThreadInfoFree(virCHMonitor *mon) +{ + mon->nthreads = 0; + if (mon->threads) + VIR_FREE(mon->threads);
Since VIR_FREE() is really just g_clear_pointer() there's no point in checking the pointer. In fact, glib will do that for us.
+} + +static size_t +virCHMonitorRefreshThreadInfo(virCHMonitor *mon) +{ + virCHMonitorThreadInfo *info = NULL; + g_autofree pid_t *tids = NULL; + virDomainObj *vm = mon->vm; + size_t ntids = 0; + size_t i; + + + virCHMonitorThreadInfoFree(mon); + if (virProcessGetPids(vm->pid, &ntids, &tids) < 0) { + mon->threads = NULL;
No need to set to NULL. It was done by virCHMonitorThreadInfoFree() above.
+ return 0; + } +
diff --git a/src/ch/ch_process.c b/src/ch/ch_process.c index 49976d769e..1a0730a4d1 100644 --- a/src/ch/ch_process.c +++ b/src/ch/ch_process.c
+/** + * virCHProcessSetupPid: + * + * This function sets resource properties (affinity, cgroups, + * scheduler) for any PID associated with a domain. It should be used + * to set up emulator PIDs as well as vCPU and I/O thread pids to + * ensure they are all handled the same way. + * + * Returns 0 on success, -1 on error. + */ +static int +virCHProcessSetupPid(virDomainObj *vm, + pid_t pid, + virCgroupThreadName nameval, + int id, + virBitmap *cpumask, + unsigned long long period, + long long quota, + virDomainThreadSchedParam *sched) +{ + virCHDomainObjPrivate *priv = vm->privateData; + virDomainNumatuneMemMode mem_mode; + virCgroup *cgroup = NULL;
g_autoptr(virCgroup)
+ virBitmap *use_cpumask = NULL; + virBitmap *affinity_cpumask = NULL; + g_autoptr(virBitmap) hostcpumap = NULL; + g_autofree char *mem_mask = NULL; + int ret = -1;
Michal

From: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_domain.c | 30 +++++++++ src/ch/ch_domain.h | 7 ++- src/ch/ch_driver.c | 145 ++++++++++++++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 2 +- 4 files changed, 181 insertions(+), 3 deletions(-) diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c index 6f0cec8c6e..6fde7122f7 100644 --- a/src/ch/ch_domain.c +++ b/src/ch/ch_domain.c @@ -20,6 +20,7 @@ #include <config.h> +#include "datatypes.h" #include "ch_domain.h" #include "domain_driver.h" #include "viralloc.h" @@ -416,3 +417,32 @@ virCHDomainGetMachineName(virDomainObj *vm) return ret; } + +/** + * virCHDomainObjFromDomain: + * @domain: Domain pointer that has to be looked up + * + * This function looks up @domain and returns the appropriate virDomainObjPtr + * that has to be released by calling virDomainObjEndAPI(). + * + * Returns the domain object with incremented reference counter which is locked + * on success, NULL otherwise. + */ +virDomainObj * +virCHDomainObjFromDomain(virDomainPtr domain) +{ + virDomainObj *vm; + virCHDriver *driver = domain->conn->privateData; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + + vm = virDomainObjListFindByUUID(driver->domains, domain->uuid); + if (!vm) { + virUUIDFormat(domain->uuid, uuidstr); + virReportError(VIR_ERR_NO_DOMAIN, + _("no domain with matching uuid '%s' (%s)"), + uuidstr, domain->name); + return NULL; + } + + return vm; +} diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h index cb94905b94..60a761ac84 100644 --- a/src/ch/ch_domain.h +++ b/src/ch/ch_domain.h @@ -97,5 +97,8 @@ virCHDomainGetVcpuPid(virDomainObj *vm, bool virCHDomainHasVcpuPids(virDomainObj *vm); -char * -virCHDomainGetMachineName(virDomainObj *vm); +char +*virCHDomainGetMachineName(virDomainObj *vm); + +virDomainObj +*virCHDomainObjFromDomain(virDomain *domain); diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index 53e0872207..956189e83f 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -25,6 +25,7 @@ #include "ch_driver.h" #include "ch_monitor.h" #include "ch_process.h" +#include "domain_cgroup.h" #include "datatypes.h" #include "driver.h" #include "viraccessapicheck.h" @@ -1129,6 +1130,148 @@ chDomainGetVcpus(virDomainPtr dom, return ret; } +static int +chDomainPinVcpuLive(virDomainObj *vm, + virDomainDef *def, + int vcpu, + virCHDriver *driver, + virCHDriverConfig *cfg, + virBitmap *cpumap) +{ + g_autoptr(virBitmap) tmpmap = NULL; + g_autoptr(virCgroup) cgroup_vcpu = NULL; + virDomainVcpuDef *vcpuinfo; + virCHDomainObjPrivate *priv = vm->privateData; + + g_autofree char *str = NULL; + + if (!virCHDomainHasVcpuPids(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, + "%s", _("cpu affinity is not supported")); + return -1; + } + + if (!(vcpuinfo = virDomainDefGetVcpu(def, vcpu))) { + virReportError(VIR_ERR_INVALID_ARG, + _("vcpu %d is out of range of live cpu count %d"), + vcpu, virDomainDefGetVcpusMax(def)); + return -1; + } + + if (!(tmpmap = virBitmapNewCopy(cpumap))) + return -1; + + if (!(str = virBitmapFormat(cpumap))) + return -1; + + if (vcpuinfo->online) { + /* Configure the corresponding cpuset cgroup before set affinity. */ + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, vcpu, + false, &cgroup_vcpu) < 0) + return -1; + if (virDomainCgroupSetupCpusetCpus(cgroup_vcpu, cpumap) < 0) + return -1; + } + + if (virProcessSetAffinity(virCHDomainGetVcpuPid(vm, vcpu), cpumap, false) < 0) + return -1; + } + + virBitmapFree(vcpuinfo->cpumask); + vcpuinfo->cpumask = tmpmap; + tmpmap = NULL; + + if (virDomainObjSave(vm, driver->xmlopt, cfg->stateDir) < 0) + return -1; + + return 0; +} + + +static int +chDomainPinVcpuFlags(virDomainPtr dom, + unsigned int vcpu, + unsigned char *cpumap, + int maplen, + unsigned int flags) +{ + virCHDriver *driver = dom->conn->privateData; + virDomainObj *vm; + virDomainDef *def; + virDomainDef *persistentDef; + int ret = -1; + virBitmap *pcpumap = NULL; + virDomainVcpuDef *vcpuinfo = NULL; + g_autoptr(virCHDriverConfig) cfg = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + cfg = virCHDriverGetConfig(driver); + + if (!(vm = virCHDomainObjFromDomain(dom))) + goto cleanup; + + if (virDomainPinVcpuFlagsEnsureACL(dom->conn, vm->def, flags) < 0) + goto cleanup; + + if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0) + goto cleanup; + + if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0) + goto endjob; + + if (persistentDef && + !(vcpuinfo = virDomainDefGetVcpu(persistentDef, vcpu))) { + virReportError(VIR_ERR_INVALID_ARG, + _("vcpu %d is out of range of persistent cpu count %d"), + vcpu, virDomainDefGetVcpus(persistentDef)); + goto endjob; + } + + if (!(pcpumap = virBitmapNewData(cpumap, maplen))) + goto endjob; + + if (virBitmapIsAllClear(pcpumap)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Empty cpu list for pinning")); + goto endjob; + } + + if (def && + chDomainPinVcpuLive(vm, def, vcpu, driver, cfg, pcpumap) < 0) + goto endjob; + + if (persistentDef) { + virBitmapFree(vcpuinfo->cpumask); + vcpuinfo->cpumask = pcpumap; + pcpumap = NULL; + + goto endjob; + } + + ret = 0; + + endjob: + virCHDomainObjEndJob(vm); + + cleanup: + virDomainObjEndAPI(&vm); + virBitmapFree(pcpumap); + return ret; +} + +static int +chDomainPinVcpu(virDomainPtr dom, + unsigned int vcpu, + unsigned char *cpumap, + int maplen) +{ + return chDomainPinVcpuFlags(dom, vcpu, cpumap, maplen, + VIR_DOMAIN_AFFECT_LIVE); +} + /* Function Tables */ static virHypervisorDriver chHypervisorDriver = { .name = "CH", @@ -1169,6 +1312,8 @@ static virHypervisorDriver chHypervisorDriver = { .domainGetVcpusFlags = chDomainGetVcpusFlags, /* 8.0.0 */ .domainGetMaxVcpus = chDomainGetMaxVcpus, /* 8.0.0 */ .domainGetVcpuPinInfo = chDomainGetVcpuPinInfo, /* 8.0.0 */ + .domainPinVcpu = chDomainPinVcpu, /* 8.1.0 */ + .domainPinVcpuFlags = chDomainPinVcpuFlags, /* 8.1.0 */ .nodeGetCPUMap = chNodeGetCPUMap, /* 8.0.0 */ }; diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index d984bf9822..6c1d889a82 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -41,7 +41,7 @@ VIR_LOG_INIT("ch.ch_monitor"); static virClass *virCHMonitorClass; static void virCHMonitorDispose(void *obj); -static void virCHMonitorThreadInfoFree(virCHMonitor * mon); +static void virCHMonitorThreadInfoFree(virCHMonitor *mon); static int virCHMonitorOnceInit(void) { -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
From: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_domain.c | 30 +++++++++ src/ch/ch_domain.h | 7 ++- src/ch/ch_driver.c | 145 ++++++++++++++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 2 +- 4 files changed, 181 insertions(+), 3 deletions(-)
diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c index 6f0cec8c6e..6fde7122f7 100644 --- a/src/ch/ch_domain.c +++ b/src/ch/ch_domain.c @@ -20,6 +20,7 @@
#include <config.h>
+#include "datatypes.h" #include "ch_domain.h" #include "domain_driver.h" #include "viralloc.h" @@ -416,3 +417,32 @@ virCHDomainGetMachineName(virDomainObj *vm)
return ret; } + +/** + * virCHDomainObjFromDomain: + * @domain: Domain pointer that has to be looked up + * + * This function looks up @domain and returns the appropriate virDomainObjPtr + * that has to be released by calling virDomainObjEndAPI(). + * + * Returns the domain object with incremented reference counter which is locked + * on success, NULL otherwise. + */ +virDomainObj * +virCHDomainObjFromDomain(virDomainPtr domain) +{ + virDomainObj *vm; + virCHDriver *driver = domain->conn->privateData; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + + vm = virDomainObjListFindByUUID(driver->domains, domain->uuid); + if (!vm) { + virUUIDFormat(domain->uuid, uuidstr); + virReportError(VIR_ERR_NO_DOMAIN, + _("no domain with matching uuid '%s' (%s)"), + uuidstr, domain->name); + return NULL; + } + + return vm; +}
I'm not against moving this function, but: 1) it needs to be done in a separate commit 2) don't forget to remove the function from its original place (ch_driver.c)
diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h index cb94905b94..60a761ac84 100644 --- a/src/ch/ch_domain.h +++ b/src/ch/ch_domain.h @@ -97,5 +97,8 @@ virCHDomainGetVcpuPid(virDomainObj *vm, bool virCHDomainHasVcpuPids(virDomainObj *vm);
-char * -virCHDomainGetMachineName(virDomainObj *vm); +char +*virCHDomainGetMachineName(virDomainObj *vm); + +virDomainObj +*virCHDomainObjFromDomain(virDomain *domain); diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index 53e0872207..956189e83f 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -25,6 +25,7 @@ #include "ch_driver.h" #include "ch_monitor.h" #include "ch_process.h" +#include "domain_cgroup.h" #include "datatypes.h" #include "driver.h" #include "viraccessapicheck.h" @@ -1129,6 +1130,148 @@ chDomainGetVcpus(virDomainPtr dom, return ret; }
+static int +chDomainPinVcpuLive(virDomainObj *vm, + virDomainDef *def, + int vcpu, + virCHDriver *driver, + virCHDriverConfig *cfg, + virBitmap *cpumap) +{ + g_autoptr(virBitmap) tmpmap = NULL; + g_autoptr(virCgroup) cgroup_vcpu = NULL; + virDomainVcpuDef *vcpuinfo; + virCHDomainObjPrivate *priv = vm->privateData; + + g_autofree char *str = NULL; + + if (!virCHDomainHasVcpuPids(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, + "%s", _("cpu affinity is not supported")); + return -1; + } + + if (!(vcpuinfo = virDomainDefGetVcpu(def, vcpu))) { + virReportError(VIR_ERR_INVALID_ARG, + _("vcpu %d is out of range of live cpu count %d"), + vcpu, virDomainDefGetVcpusMax(def)); + return -1; + } + + if (!(tmpmap = virBitmapNewCopy(cpumap))) + return -1; + + if (!(str = virBitmapFormat(cpumap))) + return -1; + + if (vcpuinfo->online) { + /* Configure the corresponding cpuset cgroup before set affinity. */ + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, vcpu, + false, &cgroup_vcpu) < 0) + return -1; + if (virDomainCgroupSetupCpusetCpus(cgroup_vcpu, cpumap) < 0) + return -1; + } + + if (virProcessSetAffinity(virCHDomainGetVcpuPid(vm, vcpu), cpumap, false) < 0) + return -1; + } + + virBitmapFree(vcpuinfo->cpumask); + vcpuinfo->cpumask = tmpmap; + tmpmap = NULL;
g_steal_pointer() in other words.
+ + if (virDomainObjSave(vm, driver->xmlopt, cfg->stateDir) < 0) + return -1; + + return 0; +} + + +static int +chDomainPinVcpuFlags(virDomainPtr dom, + unsigned int vcpu, + unsigned char *cpumap, + int maplen, + unsigned int flags) +{ + virCHDriver *driver = dom->conn->privateData; + virDomainObj *vm; + virDomainDef *def; + virDomainDef *persistentDef; + int ret = -1; + virBitmap *pcpumap = NULL;
g_autoptr(virBitmap)
+ virDomainVcpuDef *vcpuinfo = NULL; + g_autoptr(virCHDriverConfig) cfg = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + cfg = virCHDriverGetConfig(driver); + + if (!(vm = virCHDomainObjFromDomain(dom))) + goto cleanup; + + if (virDomainPinVcpuFlagsEnsureACL(dom->conn, vm->def, flags) < 0) + goto cleanup; + + if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0) + goto cleanup; + + if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0) + goto endjob; + + if (persistentDef && + !(vcpuinfo = virDomainDefGetVcpu(persistentDef, vcpu))) { + virReportError(VIR_ERR_INVALID_ARG, + _("vcpu %d is out of range of persistent cpu count %d"), + vcpu, virDomainDefGetVcpus(persistentDef)); + goto endjob; + } + + if (!(pcpumap = virBitmapNewData(cpumap, maplen))) + goto endjob; + + if (virBitmapIsAllClear(pcpumap)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Empty cpu list for pinning")); + goto endjob; + } + + if (def && + chDomainPinVcpuLive(vm, def, vcpu, driver, cfg, pcpumap) < 0) + goto endjob; + + if (persistentDef) { + virBitmapFree(vcpuinfo->cpumask); + vcpuinfo->cpumask = pcpumap; + pcpumap = NULL;
g_steal_pointer()
+ + goto endjob; + } + + ret = 0; + + endjob: + virCHDomainObjEndJob(vm); + + cleanup: + virDomainObjEndAPI(&vm); + virBitmapFree(pcpumap); + return ret; +} + +static int +chDomainPinVcpu(virDomainPtr dom, + unsigned int vcpu, + unsigned char *cpumap, + int maplen) +{ + return chDomainPinVcpuFlags(dom, vcpu, cpumap, maplen, + VIR_DOMAIN_AFFECT_LIVE);
misaligned
+} + /* Function Tables */ static virHypervisorDriver chHypervisorDriver = { .name = "CH", @@ -1169,6 +1312,8 @@ static virHypervisorDriver chHypervisorDriver = { .domainGetVcpusFlags = chDomainGetVcpusFlags, /* 8.0.0 */ .domainGetMaxVcpus = chDomainGetMaxVcpus, /* 8.0.0 */ .domainGetVcpuPinInfo = chDomainGetVcpuPinInfo, /* 8.0.0 */ + .domainPinVcpu = chDomainPinVcpu, /* 8.1.0 */ + .domainPinVcpuFlags = chDomainPinVcpuFlags, /* 8.1.0 */ .nodeGetCPUMap = chNodeGetCPUMap, /* 8.0.0 */ };
diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index d984bf9822..6c1d889a82 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -41,7 +41,7 @@ VIR_LOG_INIT("ch.ch_monitor");
static virClass *virCHMonitorClass; static void virCHMonitorDispose(void *obj); -static void virCHMonitorThreadInfoFree(virCHMonitor * mon); +static void virCHMonitorThreadInfoFree(virCHMonitor *mon);
This does not belong here. If anything, it should have been done in the previous patch that introduced this function. Michal

From: Vineeth Pillai <viremana@linux.microsoft.com> Enable support of VIR_DRV_FEATURE_TYPED_PARAM_STRING to enable numatune Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_driver.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index 956189e83f..32ae15f940 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -943,6 +943,36 @@ static int chStateInitialize(bool privileged, return ret; } +/* Which features are supported by this driver? */ +static int +chConnectSupportsFeature(virConnectPtr conn, int feature) +{ + if (virConnectSupportsFeatureEnsureACL(conn) < 0) + return -1; + + switch ((virDrvFeature) feature) { + case VIR_DRV_FEATURE_TYPED_PARAM_STRING: + return 1; + case VIR_DRV_FEATURE_MIGRATION_V2: + case VIR_DRV_FEATURE_MIGRATION_V3: + case VIR_DRV_FEATURE_MIGRATION_P2P: + case VIR_DRV_FEATURE_MIGRATE_CHANGE_PROTECTION: + case VIR_DRV_FEATURE_FD_PASSING: + case VIR_DRV_FEATURE_XML_MIGRATABLE: + case VIR_DRV_FEATURE_MIGRATION_OFFLINE: + case VIR_DRV_FEATURE_MIGRATION_PARAMS: + case VIR_DRV_FEATURE_MIGRATION_DIRECT: + case VIR_DRV_FEATURE_MIGRATION_V1: + case VIR_DRV_FEATURE_PROGRAM_KEEPALIVE: + case VIR_DRV_FEATURE_REMOTE: + case VIR_DRV_FEATURE_REMOTE_CLOSE_CALLBACK: + case VIR_DRV_FEATURE_REMOTE_EVENT_CALLBACK: + case VIR_DRV_FEATURE_NETWORK_UPDATE_HAS_CORRECT_ORDER: + default: + return 0; + } +} + static int chDomainGetVcpusFlags(virDomainPtr dom, unsigned int flags) @@ -1285,6 +1315,7 @@ static virHypervisorDriver chHypervisorDriver = { .connectListAllDomains = chConnectListAllDomains, /* 7.5.0 */ .connectListDomains = chConnectListDomains, /* 7.5.0 */ .connectGetCapabilities = chConnectGetCapabilities, /* 7.5.0 */ + .connectSupportsFeature = chConnectSupportsFeature, /* 8.1.0 */ .domainCreateXML = chDomainCreateXML, /* 7.5.0 */ .domainCreate = chDomainCreate, /* 7.5.0 */ .domainCreateWithFlags = chDomainCreateWithFlags, /* 7.5.0 */ -- 2.27.0

From: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_driver.c | 260 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 260 insertions(+) diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index 32ae15f940..d257c025ef 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -43,6 +43,7 @@ #include "viruri.h" #include "virutil.h" #include "viruuid.h" +#include "virnuma.h" #define VIR_FROM_THIS VIR_FROM_CH @@ -1302,6 +1303,263 @@ chDomainPinVcpu(virDomainPtr dom, VIR_DOMAIN_AFFECT_LIVE); } +#define CH_NB_NUMA_PARAM 2 + +static int +chDomainGetNumaParameters(virDomainPtr dom, + virTypedParameterPtr params, + int *nparams, + unsigned int flags) +{ + size_t i; + virDomainObj *vm = NULL; + virDomainNumatuneMemMode tmpmode = VIR_DOMAIN_NUMATUNE_MEM_STRICT; + virCHDomainObjPrivate *priv; + g_autofree char *nodeset = NULL; + int ret = -1; + virDomainDef *def = NULL; + bool live = false; + virBitmap *autoNodeset = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG | + VIR_TYPED_PARAM_STRING_OKAY, -1); + + if (!(vm = virCHDomainObjFromDomain(dom))) + return -1; + priv = vm->privateData; + + if (virDomainGetNumaParametersEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (!(def = virDomainObjGetOneDefState(vm, flags, &live))) + goto cleanup; + + if (live) + autoNodeset = priv->autoNodeset; + + if ((*nparams) == 0) { + *nparams = CH_NB_NUMA_PARAM; + ret = 0; + goto cleanup; + } + + for (i = 0; i < CH_NB_NUMA_PARAM && i < *nparams; i++) { + virMemoryParameterPtr param = ¶ms[i]; + + switch (i) { + case 0: /* fill numa mode here */ + ignore_value(virDomainNumatuneGetMode(def->numa, -1, &tmpmode)); + + if (virTypedParameterAssign(param, VIR_DOMAIN_NUMA_MODE, + VIR_TYPED_PARAM_INT, tmpmode) < 0) + goto cleanup; + + break; + + case 1: /* fill numa nodeset here */ + nodeset = virDomainNumatuneFormatNodeset(def->numa, autoNodeset, -1); + + if (!nodeset || + virTypedParameterAssign(param, VIR_DOMAIN_NUMA_NODESET, + VIR_TYPED_PARAM_STRING, nodeset) < 0) + goto cleanup; + + nodeset = NULL; + break; + + /* coverity[dead_error_begin] */ + default: + break; + /* should not hit here */ + } + } + + if (*nparams > CH_NB_NUMA_PARAM) + *nparams = CH_NB_NUMA_PARAM; + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int +chDomainSetNumaParamsLive(virDomainObj *vm, + virBitmap *nodeset) +{ + g_autoptr(virCgroup) cgroup_temp = NULL; + virCHDomainObjPrivate *priv = vm->privateData; + g_autofree char *nodeset_str = NULL; + virDomainNumatuneMemMode mode; + size_t i = 0; + + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mode) == 0 && + mode != VIR_DOMAIN_NUMATUNE_MEM_STRICT) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("change of nodeset for running domain requires strict numa mode")); + return -1; + } + + if (!virNumaNodesetIsAvailable(nodeset)) + return -1; + + /* Ensure the cpuset string is formatted before passing to cgroup */ + if (!(nodeset_str = virBitmapFormat(nodeset))) + return -1; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_EMULATOR, 0, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + + + for (i = 0; i < virDomainDefGetVcpusMax(vm->def); i++) { + virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, i); + + if (!vcpu->online) + continue; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, i, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + + return 0; + } + + for (i = 0; i < vm->def->niothreadids; i++) { + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, + vm->def->iothreadids[i]->iothread_id, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + } + + /* set nodeset for root cgroup */ + if (virCgroupSetCpusetMems(priv->cgroup, nodeset_str) < 0) + return -1; + + return 0; +} + +static int +chDomainSetNumaParameters(virDomainPtr dom, + virTypedParameterPtr params, + int nparams, + unsigned int flags) +{ + virCHDriver *driver = dom->conn->privateData; + size_t i; + virDomainDef *def; + virDomainDef *persistentDef; + virDomainObj *vm = NULL; + int ret = -1; + g_autoptr(virCHDriverConfig) cfg = NULL; + virCHDomainObjPrivate *priv; + virBitmap *nodeset = NULL; + virDomainNumatuneMemMode config_mode; + int mode = -1; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_NUMA_MODE, + VIR_TYPED_PARAM_INT, + VIR_DOMAIN_NUMA_NODESET, + VIR_TYPED_PARAM_STRING, + NULL) < 0) + return -1; + + if (!(vm = virCHDomainObjFromDomain(dom))) + return -1; + + priv = vm->privateData; + cfg = virCHDriverGetConfig(driver); + + if (virDomainSetNumaParametersEnsureACL(dom->conn, vm->def, flags) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + + if (STREQ(param->field, VIR_DOMAIN_NUMA_MODE)) { + mode = param->value.i; + + if (mode < 0 || mode >= VIR_DOMAIN_NUMATUNE_MEM_LAST) { + virReportError(VIR_ERR_INVALID_ARG, + _("unsupported numatune mode: '%d'"), mode); + goto cleanup; + } + + } else if (STREQ(param->field, VIR_DOMAIN_NUMA_NODESET)) { + if (virBitmapParse(param->value.s, &nodeset, + VIR_DOMAIN_CPUMASK_LEN) < 0) + goto cleanup; + + if (virBitmapIsAllClear(nodeset)) { + virReportError(VIR_ERR_OPERATION_INVALID, + _("Invalid nodeset of 'numatune': %s"), + param->value.s); + goto cleanup; + } + } + } + + if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0) + goto cleanup; + + if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0) + goto endjob; + + if (def) { + if (!driver->privileged) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("NUMA tuning is not available in session mode")); + goto endjob; + } + + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("cgroup cpuset controller is not mounted")); + goto endjob; + } + + if (mode != -1 && + virDomainNumatuneGetMode(def->numa, -1, &config_mode) == 0 && + config_mode != mode) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("can't change numatune mode for running domain")); + goto endjob; + } + + if (nodeset && + chDomainSetNumaParamsLive(vm, nodeset) < 0) + goto endjob; + + if (virDomainNumatuneSet(def->numa, + def->placement_mode == + VIR_DOMAIN_CPU_PLACEMENT_MODE_STATIC, + -1, mode, nodeset) < 0) + goto endjob; + + if (virDomainObjSave(vm, driver->xmlopt, cfg->stateDir) < 0) + goto endjob; + } + + ret = 0; + + endjob: + virCHDomainObjEndJob(vm); + + cleanup: + virBitmapFree(nodeset); + virDomainObjEndAPI(&vm); + return ret; +} + /* Function Tables */ static virHypervisorDriver chHypervisorDriver = { .name = "CH", @@ -1346,6 +1604,8 @@ static virHypervisorDriver chHypervisorDriver = { .domainPinVcpu = chDomainPinVcpu, /* 8.1.0 */ .domainPinVcpuFlags = chDomainPinVcpuFlags, /* 8.1.0 */ .nodeGetCPUMap = chNodeGetCPUMap, /* 8.0.0 */ + .domainSetNumaParameters = chDomainSetNumaParameters, /* 8.1.0 */ + .domainGetNumaParameters = chDomainGetNumaParameters, /* 8.1.0 */ }; static virConnectDriver chConnectDriver = { -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
From: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_driver.c | 260 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 260 insertions(+)
diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index 32ae15f940..d257c025ef 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -43,6 +43,7 @@ #include "viruri.h" #include "virutil.h" #include "viruuid.h" +#include "virnuma.h"
#define VIR_FROM_THIS VIR_FROM_CH
@@ -1302,6 +1303,263 @@ chDomainPinVcpu(virDomainPtr dom, VIR_DOMAIN_AFFECT_LIVE); }
+#define CH_NB_NUMA_PARAM 2 + +static int +chDomainGetNumaParameters(virDomainPtr dom, + virTypedParameterPtr params, + int *nparams, + unsigned int flags) +{ + size_t i; + virDomainObj *vm = NULL; + virDomainNumatuneMemMode tmpmode = VIR_DOMAIN_NUMATUNE_MEM_STRICT; + virCHDomainObjPrivate *priv; + g_autofree char *nodeset = NULL; + int ret = -1; + virDomainDef *def = NULL; + bool live = false; + virBitmap *autoNodeset = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG | + VIR_TYPED_PARAM_STRING_OKAY, -1); + + if (!(vm = virCHDomainObjFromDomain(dom))) + return -1; + priv = vm->privateData; + + if (virDomainGetNumaParametersEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (!(def = virDomainObjGetOneDefState(vm, flags, &live))) + goto cleanup; + + if (live) + autoNodeset = priv->autoNodeset; + + if ((*nparams) == 0) { + *nparams = CH_NB_NUMA_PARAM; + ret = 0; + goto cleanup; + } + + for (i = 0; i < CH_NB_NUMA_PARAM && i < *nparams; i++) { + virMemoryParameterPtr param = ¶ms[i]; + + switch (i) { + case 0: /* fill numa mode here */ + ignore_value(virDomainNumatuneGetMode(def->numa, -1, &tmpmode)); + + if (virTypedParameterAssign(param, VIR_DOMAIN_NUMA_MODE, + VIR_TYPED_PARAM_INT, tmpmode) < 0) + goto cleanup; + + break; + + case 1: /* fill numa nodeset here */ + nodeset = virDomainNumatuneFormatNodeset(def->numa, autoNodeset, -1); + + if (!nodeset || + virTypedParameterAssign(param, VIR_DOMAIN_NUMA_NODESET, + VIR_TYPED_PARAM_STRING, nodeset) < 0) + goto cleanup; + + nodeset = NULL; + break; + + /* coverity[dead_error_begin] */
I don't think this is correct. Does coverity really complain?
+ default: + break; + /* should not hit here */ + } + } + + if (*nparams > CH_NB_NUMA_PARAM) + *nparams = CH_NB_NUMA_PARAM; + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int +chDomainSetNumaParamsLive(virDomainObj *vm, + virBitmap *nodeset) +{ + g_autoptr(virCgroup) cgroup_temp = NULL; + virCHDomainObjPrivate *priv = vm->privateData; + g_autofree char *nodeset_str = NULL; + virDomainNumatuneMemMode mode; + size_t i = 0; + + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mode) == 0 && + mode != VIR_DOMAIN_NUMATUNE_MEM_STRICT) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("change of nodeset for running domain requires strict numa mode")); + return -1; + } + + if (!virNumaNodesetIsAvailable(nodeset)) + return -1; + + /* Ensure the cpuset string is formatted before passing to cgroup */ + if (!(nodeset_str = virBitmapFormat(nodeset))) + return -1; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_EMULATOR, 0, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + + + for (i = 0; i < virDomainDefGetVcpusMax(vm->def); i++) { + virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, i); + + if (!vcpu->online) + continue; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_VCPU, i, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + + return 0; + } +
If anything, coverity should complain about the block below. Because of this return 0 ^^^ the block below is effectivelly a dead code.
+ for (i = 0; i < vm->def->niothreadids; i++) { + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_IOTHREAD, + vm->def->iothreadids[i]->iothread_id, + false, &cgroup_temp) < 0 || + virCgroupSetCpusetMems(cgroup_temp, nodeset_str) < 0) + return -1; + } + + /* set nodeset for root cgroup */ + if (virCgroupSetCpusetMems(priv->cgroup, nodeset_str) < 0) + return -1; + + return 0; +} + +static int +chDomainSetNumaParameters(virDomainPtr dom, + virTypedParameterPtr params, + int nparams, + unsigned int flags) +{ + virCHDriver *driver = dom->conn->privateData; + size_t i; + virDomainDef *def; + virDomainDef *persistentDef; + virDomainObj *vm = NULL; + int ret = -1; + g_autoptr(virCHDriverConfig) cfg = NULL; + virCHDomainObjPrivate *priv; + virBitmap *nodeset = NULL; + virDomainNumatuneMemMode config_mode; + int mode = -1; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_NUMA_MODE, + VIR_TYPED_PARAM_INT, + VIR_DOMAIN_NUMA_NODESET, + VIR_TYPED_PARAM_STRING, + NULL) < 0) + return -1; + + if (!(vm = virCHDomainObjFromDomain(dom))) + return -1; + + priv = vm->privateData; + cfg = virCHDriverGetConfig(driver); + + if (virDomainSetNumaParametersEnsureACL(dom->conn, vm->def, flags) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + + if (STREQ(param->field, VIR_DOMAIN_NUMA_MODE)) { + mode = param->value.i; + + if (mode < 0 || mode >= VIR_DOMAIN_NUMATUNE_MEM_LAST) { + virReportError(VIR_ERR_INVALID_ARG, + _("unsupported numatune mode: '%d'"), mode); + goto cleanup; + } + + } else if (STREQ(param->field, VIR_DOMAIN_NUMA_NODESET)) { + if (virBitmapParse(param->value.s, &nodeset, + VIR_DOMAIN_CPUMASK_LEN) < 0) + goto cleanup; + + if (virBitmapIsAllClear(nodeset)) { + virReportError(VIR_ERR_OPERATION_INVALID, + _("Invalid nodeset of 'numatune': %s"), + param->value.s); + goto cleanup; + } + } + } + + if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0) + goto cleanup; + + if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0)
This @persistentDef is never used. I know CH driver doesn't store XMLs on the disk yet, but it does implement virDomainDefineXML() API (and holds the domain definitions in memory). Firstly, we should really address that, because with socket activation it's going to drive users mad. I mean, if a client connects, defines a domain, disconnects, waits for 2 minutes the daemon dies and domain is gone. This is all because of the --timeout=120 argument we are passing to daemon(s) (both monolithic daemon and driver daemons). What's the reason we are not storing the XML? Secondly, leaving --timeout= asaide, since virDomainDefineXML() is defined an user can define a domain and then call APIs to change it. For instance, it could call virDomainSetNumaParameters() to change the domain definition.
+ goto endjob; + + if (def) { + if (!driver->privileged) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("NUMA tuning is not available in session mode")); + goto endjob; + } + + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("cgroup cpuset controller is not mounted")); + goto endjob; + } + + if (mode != -1 && + virDomainNumatuneGetMode(def->numa, -1, &config_mode) == 0 && + config_mode != mode) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("can't change numatune mode for running domain")); + goto endjob; + } + + if (nodeset && + chDomainSetNumaParamsLive(vm, nodeset) < 0) + goto endjob; + + if (virDomainNumatuneSet(def->numa, + def->placement_mode == + VIR_DOMAIN_CPU_PLACEMENT_MODE_STATIC, + -1, mode, nodeset) < 0) + goto endjob;
Meanwhile, in QEMU driver we've disabled changing of NUMA nodeset for static mode, because we can't guarantee that QEMU will move all of its memory to the new nodeset. How does CH behave in this case? I'm going to merge these patches, but we should figure out whether CH behaves the same (which I suspect it does), and deny the operation. Can you please cook a patch in that case? Michal

using virCHProcessSetupPid Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_monitor.c | 60 +++++++++++++++++++++++++++++++++++ src/ch/ch_monitor.h | 2 ++ src/ch/ch_process.c | 77 ++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 138 insertions(+), 1 deletion(-) diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index 6c1d889a82..2a7cffb696 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -930,3 +930,63 @@ virCHMonitorGetInfo(virCHMonitor *mon, virJSONValue **info) { return virCHMonitorGet(mon, URL_VM_INFO, info); } + +/** + * virCHMonitorGetIOThreads: + * @mon: Pointer to the monitor + * @iothreads: Location to return array of IOThreadInfo data + * + * Retrieve the list of iothreads defined/running for the machine + * + * Returns count of IOThreadInfo structures on success + * -1 on error. + */ +int virCHMonitorGetIOThreads(virCHMonitor *mon, + virDomainIOThreadInfo ***iothreads) +{ + size_t nthreads = 0, niothreads = 0; + int thd_index; + virDomainIOThreadInfo **iothreadinfolist = NULL, *iothreadinfo = NULL; + + *iothreads = NULL; + nthreads = virCHMonitorRefreshThreadInfo(mon); + + iothreadinfolist = g_new0(virDomainIOThreadInfo*, nthreads); + + for (thd_index = 0; thd_index < nthreads; thd_index++) { + virBitmap *map = NULL; + if (mon->threads[thd_index].type == virCHThreadTypeIO) { + iothreadinfo = g_new0(virDomainIOThreadInfo, 1); + + iothreadinfo->iothread_id = mon->threads[thd_index].ioInfo.tid; + + if (!(map = virProcessGetAffinity(iothreadinfo->iothread_id))) + goto cleanup; + + if (virBitmapToData(map, &(iothreadinfo->cpumap), + &(iothreadinfo->cpumaplen)) < 0) { + virBitmapFree(map); + goto cleanup; + } + virBitmapFree(map); + /* Append to iothreadinfolist */ + iothreadinfolist[niothreads] = iothreadinfo; + niothreads++; + } + } + VIR_DELETE_ELEMENT_INPLACE(iothreadinfolist, + niothreads, nthreads); + *iothreads = iothreadinfolist; + VIR_DEBUG("niothreads = %ld", niothreads); + return niothreads; + + cleanup: + if (iothreadinfolist) { + for (thd_index = 0; thd_index < niothreads; thd_index++) + VIR_FREE(iothreadinfolist[thd_index]); + VIR_FREE(iothreadinfolist); + } + if (iothreadinfo) + VIR_FREE(iothreadinfo); + return -1; +} diff --git a/src/ch/ch_monitor.h b/src/ch/ch_monitor.h index 6646316454..ffc80e8910 100644 --- a/src/ch/ch_monitor.h +++ b/src/ch/ch_monitor.h @@ -119,3 +119,5 @@ int virCHMonitorGetCPUInfo(virCHMonitor *mon, size_t maxvcpus); size_t virCHMonitorGetThreadInfo(virCHMonitor *mon, bool refresh, virCHMonitorThreadInfo **threads); +int virCHMonitorGetIOThreads(virCHMonitor *mon, + virDomainIOThreadInfo ***iothreads); diff --git a/src/ch/ch_process.c b/src/ch/ch_process.c index 1a0730a4d1..2b81c9f757 100644 --- a/src/ch/ch_process.c +++ b/src/ch/ch_process.c @@ -41,7 +41,6 @@ VIR_LOG_INIT("ch.ch_process"); #define START_VM_POSTFIX ": starting up vm\n" - static virCHMonitor * virCHProcessConnectMonitor(virCHDriver *driver, virDomainObj *vm) @@ -310,6 +309,74 @@ virCHProcessSetupPid(virDomainObj *vm, return ret; } +static int +virCHProcessSetupIOThread(virDomainObj *vm, + virDomainIOThreadInfo *iothread) +{ + virCHDomainObjPrivate *priv = vm->privateData; + return virCHProcessSetupPid(vm, iothread->iothread_id, + VIR_CGROUP_THREAD_IOTHREAD, + iothread->iothread_id, + priv->autoCpuset, /* This should be updated when CLH supports accepting + iothread settings from input domain definition */ + vm->def->cputune.iothread_period, + vm->def->cputune.iothread_quota, + NULL); /* CLH doesn't allow choosing a scheduler for iothreads.*/ +} + +static int +virCHProcessSetupIOThreads(virDomainObj *vm) +{ + virCHDomainObjPrivate *priv = vm->privateData; + virDomainIOThreadInfo **iothreads = NULL; + size_t i; + size_t niothreads; + + niothreads = virCHMonitorGetIOThreads(priv->monitor, &iothreads); + for (i = 0; i < niothreads; i++) { + VIR_DEBUG("IOThread index = %ld , tid = %d", i, iothreads[i]->iothread_id); + if (virCHProcessSetupIOThread(vm, iothreads[i]) < 0) + return -1; + } + return 0; +} + + +static int +virCHProcessSetupEmulatorThread(virDomainObj *vm, + virCHMonitorEmuThreadInfo emuthread) +{ + return virCHProcessSetupPid(vm, emuthread.tid, VIR_CGROUP_THREAD_EMULATOR, + 0, vm->def->cputune.emulatorpin, + vm->def->cputune.emulator_period, + vm->def->cputune.emulator_quota, + vm->def->cputune.emulatorsched); +} + +static int +virCHProcessSetupEmulatorThreads(virDomainObj *vm) +{ + int thd_index = 0; + virCHDomainObjPrivate *priv = vm->privateData; + /* + * Cloud-hypervisor start 4 Emulator threads by default: + * vmm + * cloud-hypervisor + * http-server + * signal_handler + */ + for (thd_index = 0; thd_index < priv->monitor->nthreads; thd_index++) { + if (priv->monitor->threads[thd_index].type == virCHThreadTypeEmulator) { + VIR_DEBUG("Setup tid = %d (%s) Emulator thread", priv->monitor->threads[thd_index].emuInfo.tid, + priv->monitor->threads[thd_index].emuInfo.thrName); + + if (virCHProcessSetupEmulatorThread(vm, priv->monitor->threads[thd_index].emuInfo) < 0) + return -1; + } + } + return 0; +} + /** * virCHProcessSetupVcpu: * @vm: domain object @@ -441,6 +508,14 @@ virCHProcessStart(virCHDriver *driver, virCHDomainRefreshThreadInfo(vm); + VIR_DEBUG("Setting emulator tuning/settings"); + if (virCHProcessSetupEmulatorThreads(vm) < 0) + goto cleanup; + + VIR_DEBUG("Setting iothread tuning/settings"); + if (virCHProcessSetupIOThreads(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting global CPU cgroup (if required)"); if (virDomainCgroupSetupGlobalCpuCgroup(vm, priv->cgroup, -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
using virCHProcessSetupPid
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_monitor.c | 60 +++++++++++++++++++++++++++++++++++ src/ch/ch_monitor.h | 2 ++ src/ch/ch_process.c | 77 ++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 138 insertions(+), 1 deletion(-)
diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c index 6c1d889a82..2a7cffb696 100644 --- a/src/ch/ch_monitor.c +++ b/src/ch/ch_monitor.c @@ -930,3 +930,63 @@ virCHMonitorGetInfo(virCHMonitor *mon, virJSONValue **info) { return virCHMonitorGet(mon, URL_VM_INFO, info); } + +/** + * virCHMonitorGetIOThreads: + * @mon: Pointer to the monitor + * @iothreads: Location to return array of IOThreadInfo data + * + * Retrieve the list of iothreads defined/running for the machine + * + * Returns count of IOThreadInfo structures on success + * -1 on error. + */ +int virCHMonitorGetIOThreads(virCHMonitor *mon, + virDomainIOThreadInfo ***iothreads) +{ + size_t nthreads = 0, niothreads = 0; + int thd_index; + virDomainIOThreadInfo **iothreadinfolist = NULL, *iothreadinfo = NULL; + + *iothreads = NULL; + nthreads = virCHMonitorRefreshThreadInfo(mon); + + iothreadinfolist = g_new0(virDomainIOThreadInfo*, nthreads); + + for (thd_index = 0; thd_index < nthreads; thd_index++) { + virBitmap *map = NULL; + if (mon->threads[thd_index].type == virCHThreadTypeIO) { + iothreadinfo = g_new0(virDomainIOThreadInfo, 1); + + iothreadinfo->iothread_id = mon->threads[thd_index].ioInfo.tid; + + if (!(map = virProcessGetAffinity(iothreadinfo->iothread_id))) + goto cleanup; + + if (virBitmapToData(map, &(iothreadinfo->cpumap), + &(iothreadinfo->cpumaplen)) < 0) { + virBitmapFree(map); + goto cleanup; + } + virBitmapFree(map); + /* Append to iothreadinfolist */ + iothreadinfolist[niothreads] = iothreadinfo; + niothreads++; + } + } + VIR_DELETE_ELEMENT_INPLACE(iothreadinfolist, + niothreads, nthreads);
I'm sorry, but what are you trying to say here? It looks like you're trying to turn @iothreadinfolist into a null terminated array. Well, for that all you need to to pass nthreads + 1 into g_new0() above.
+ *iothreads = iothreadinfolist; + VIR_DEBUG("niothreads = %ld", niothreads); + return niothreads; + + cleanup:
This needs to be named 'error' because the control gets here only in error cases. The label is not used in successful path.
+ if (iothreadinfolist) { + for (thd_index = 0; thd_index < niothreads; thd_index++) + VIR_FREE(iothreadinfolist[thd_index]);
Life's not that simple. The argument is type of virDomainIOThreadInfo* and as such might have ->cpumap member allocated. You need to call virDomainIOThreadInfoFree() instead of plain VIR_FREE().
+ VIR_FREE(iothreadinfolist);
This one is okay, because this is just an array of pointers.
+ } + if (iothreadinfo) + VIR_FREE(iothreadinfo);
And again, no need for the if() and plain VIR_FREE() is not enough.
+ return -1; +}
Michal

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_driver.c | 154 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 154 insertions(+) diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index d257c025ef..d60ff468f0 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -1303,6 +1303,158 @@ chDomainPinVcpu(virDomainPtr dom, VIR_DOMAIN_AFFECT_LIVE); } + + +static int +chDomainGetEmulatorPinInfo(virDomainPtr dom, + unsigned char *cpumaps, + int maplen, + unsigned int flags) +{ + virDomainObj *vm = NULL; + virDomainDef *def; + virCHDomainObjPrivate *priv; + bool live; + int ret = -1; + virBitmap *cpumask = NULL; + g_autoptr(virBitmap) bitmap = NULL; + virBitmap *autoCpuset = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + if (!(vm = chDomObjFromDomain(dom))) + goto cleanup; + + if (virDomainGetEmulatorPinInfoEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (!(def = virDomainObjGetOneDefState(vm, flags, &live))) + goto cleanup; + + if (live) { + priv = vm->privateData; + autoCpuset = priv->autoCpuset; + } + if (def->cputune.emulatorpin) { + cpumask = def->cputune.emulatorpin; + } else if (def->cpumask) { + cpumask = def->cpumask; + } else if (vm->def->placement_mode == VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO && + autoCpuset) { + cpumask = autoCpuset; + } else { + if (!(bitmap = virHostCPUGetAvailableCPUsBitmap())) + goto cleanup; + cpumask = bitmap; + } + + virBitmapToDataBuf(cpumask, cpumaps, maplen); + + ret = 1; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int +chDomainPinEmulator(virDomainPtr dom, + unsigned char *cpumap, + int maplen, + unsigned int flags) +{ + virCHDriver *driver = dom->conn->privateData; + virDomainObj *vm; + virCgroup *cgroup_emulator = NULL; + virDomainDef *def; + virDomainDef *persistentDef; + int ret = -1; + virCHDomainObjPrivate *priv; + virBitmap *pcpumap = NULL; + g_autoptr(virCHDriverConfig) cfg = NULL; + g_autofree char *str = NULL; + virTypedParameterPtr eventParams = NULL; + int eventNparams = 0; + int eventMaxparams = 0; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + cfg = virCHDriverGetConfig(driver); + + if (!(vm = chDomObjFromDomain(dom))) + goto cleanup; + + if (virDomainPinEmulatorEnsureACL(dom->conn, vm->def, flags) < 0) + goto cleanup; + + if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0) + goto cleanup; + + if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0) + goto endjob; + + priv = vm->privateData; + + if (!(pcpumap = virBitmapNewData(cpumap, maplen))) + goto endjob; + + if (virBitmapIsAllClear(pcpumap)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Empty cpu list for pinning")); + goto endjob; + } + + if (def) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_EMULATOR, + 0, false, &cgroup_emulator) < 0) + goto endjob; + + if (virDomainCgroupSetupCpusetCpus(cgroup_emulator, pcpumap) < 0) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("failed to set cpuset.cpus in cgroup" + " for emulator threads")); + goto endjob; + } + } + + if (virProcessSetAffinity(vm->pid, pcpumap, false) < 0) + goto endjob; + + virBitmapFree(def->cputune.emulatorpin); + def->cputune.emulatorpin = NULL; + + if (!(def->cputune.emulatorpin = virBitmapNewCopy(pcpumap))) + goto endjob; + + if (virDomainObjSave(vm, driver->xmlopt, cfg->stateDir) < 0) + goto endjob; + + str = virBitmapFormat(pcpumap); + if (virTypedParamsAddString(&eventParams, &eventNparams, + &eventMaxparams, + VIR_DOMAIN_TUNABLE_CPU_EMULATORPIN, + str) < 0) + goto endjob; + + } + + + ret = 0; + + endjob: + virCHDomainObjEndJob(vm); + + cleanup: + if (cgroup_emulator) + virCgroupFree(cgroup_emulator); + virBitmapFree(pcpumap); + virDomainObjEndAPI(&vm); + return ret; +} + #define CH_NB_NUMA_PARAM 2 static int @@ -1603,6 +1755,8 @@ static virHypervisorDriver chHypervisorDriver = { .domainGetVcpuPinInfo = chDomainGetVcpuPinInfo, /* 8.0.0 */ .domainPinVcpu = chDomainPinVcpu, /* 8.1.0 */ .domainPinVcpuFlags = chDomainPinVcpuFlags, /* 8.1.0 */ + .domainPinEmulator = chDomainPinEmulator, /* 8.1.0 */ + .domainGetEmulatorPinInfo = chDomainGetEmulatorPinInfo, /* 8.1.0 */ .nodeGetCPUMap = chNodeGetCPUMap, /* 8.0.0 */ .domainSetNumaParameters = chDomainSetNumaParameters, /* 8.1.0 */ .domainGetNumaParameters = chDomainGetNumaParameters, /* 8.1.0 */ -- 2.27.0

On 1/25/22 17:19, Praveen K Paladugu wrote:
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> --- src/ch/ch_driver.c | 154 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 154 insertions(+)
diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c index d257c025ef..d60ff468f0 100644 --- a/src/ch/ch_driver.c +++ b/src/ch/ch_driver.c @@ -1303,6 +1303,158 @@ chDomainPinVcpu(virDomainPtr dom, VIR_DOMAIN_AFFECT_LIVE); }
+ + +static int +chDomainGetEmulatorPinInfo(virDomainPtr dom, + unsigned char *cpumaps, + int maplen, + unsigned int flags) +{ + virDomainObj *vm = NULL; + virDomainDef *def; + virCHDomainObjPrivate *priv; + bool live; + int ret = -1; + virBitmap *cpumask = NULL; + g_autoptr(virBitmap) bitmap = NULL; + virBitmap *autoCpuset = NULL; + + virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | + VIR_DOMAIN_AFFECT_CONFIG, -1); + + if (!(vm = chDomObjFromDomain(dom)))
This needs to be virCHDomainObjFromDomain(). And in the other function too.
+ goto cleanup; + + if (virDomainGetEmulatorPinInfoEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (!(def = virDomainObjGetOneDefState(vm, flags, &live))) + goto cleanup; + + if (live) { + priv = vm->privateData; + autoCpuset = priv->autoCpuset; + } + if (def->cputune.emulatorpin) { + cpumask = def->cputune.emulatorpin; + } else if (def->cpumask) { + cpumask = def->cpumask; + } else if (vm->def->placement_mode == VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO && + autoCpuset) { + cpumask = autoCpuset; + } else { + if (!(bitmap = virHostCPUGetAvailableCPUsBitmap())) + goto cleanup; + cpumask = bitmap; + } + + virBitmapToDataBuf(cpumask, cpumaps, maplen); + + ret = 1; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} +
Michal

Ping.. If this patch set is ready to be merged, I'd like to get started on next set. Thank you, Praveen K Paladugu On 1/25/2022 10:19 AM, Praveen K Paladugu wrote:
This patchset adds support for cgroup management of ch threads. This version correctly manages cgroups for vcpu and emulator threads created by ch. cgroup management for iothreads is not yet supported.
Along with cgroup management, this patchset also enables support for pinning vcpu and emulator threads to selected host cpus.
v5: * bumped the verion of callbacks in ch driver to 8.1.0
v4: * addressed all open comments in v3 * dropped all the merged commits
v3: * addrressed all the formatting comments in v2 patch set * dropped indentation patches are they do not adhere to libvirt coding style * fixed build issue in qemu driver that was introduced in v2
Praveen K Paladugu (3): qemu,hypervisor: refactor some cgroup mgmt methods ch_process: Setup emulator and iothread settings ch_driver: emulator threadinfo & pinning callbacks
Vineeth Pillai (4): ch: methods for cgroup mgmt in ch driver ch_driver,ch_domain: vcpupin callback in ch driver ch_driver: enable typed param string for numatune ch_driver: add numatune callbacks for CH driver
src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 64 ++++ src/ch/ch_domain.h | 18 +- src/ch/ch_driver.c | 590 +++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 156 +++++++++ src/ch/ch_monitor.h | 56 +++- src/ch/ch_process.c | 385 ++++++++++++++++++++- src/ch/ch_process.h | 3 + src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 17 files changed, 1835 insertions(+), 455 deletions(-)
-- Regards, Praveen K Paladugu

On 1/25/22 17:19, Praveen K Paladugu wrote:
This patchset adds support for cgroup management of ch threads. This version correctly manages cgroups for vcpu and emulator threads created by ch. cgroup management for iothreads is not yet supported.
Along with cgroup management, this patchset also enables support for pinning vcpu and emulator threads to selected host cpus.
And also does a lot of formatting changes back and forth. I'm not fond of that really. If you want to clean up the formatting please do so in a separate patch(set).
src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 64 ++++ src/ch/ch_domain.h | 18 +- src/ch/ch_driver.c | 590 +++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 156 +++++++++ src/ch/ch_monitor.h | 56 +++- src/ch/ch_process.c | 385 ++++++++++++++++++++- src/ch/ch_process.h | 3 + src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 17 files changed, 1835 insertions(+), 455 deletions(-)
Nevertheless, I'm fixing all the issues I've raised and merging. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal

On 1/28/2022 10:05 AM, Michal Prívozník wrote:
On 1/25/22 17:19, Praveen K Paladugu wrote:
This patchset adds support for cgroup management of ch threads. This version correctly manages cgroups for vcpu and emulator threads created by ch. cgroup management for iothreads is not yet supported.
Along with cgroup management, this patchset also enables support for pinning vcpu and emulator threads to selected host cpus.
And also does a lot of formatting changes back and forth. I'm not fond of that really. If you want to clean up the formatting please do so in a separate patch(set).
Michal, My apologies for the churn related to formatting. I ran "GNU intend" these patches assuming that would correctly handle all the formatting. That caused a ton of formatting issues, I tried to revert most of it. Seems like I missed a few instances. I will pay attention to the formatting in the follow up submissions. I took note of Persistent Def and handling of NUMA Nodeset in cloud-hypervisor. The original author for this patch is no longer with Microsoft. I will take some time to figure this out and submit patches are necessary. Appreciate your support with merging this patch set.
src/ch/ch_conf.c | 2 + src/ch/ch_conf.h | 4 +- src/ch/ch_domain.c | 64 ++++ src/ch/ch_domain.h | 18 +- src/ch/ch_driver.c | 590 +++++++++++++++++++++++++++++++++ src/ch/ch_monitor.c | 156 +++++++++ src/ch/ch_monitor.h | 56 +++- src/ch/ch_process.c | 385 ++++++++++++++++++++- src/ch/ch_process.h | 3 + src/hypervisor/domain_cgroup.c | 457 ++++++++++++++++++++++++- src/hypervisor/domain_cgroup.h | 72 ++++ src/libvirt_private.syms | 14 +- src/qemu/qemu_cgroup.c | 413 +---------------------- src/qemu/qemu_cgroup.h | 11 - src/qemu/qemu_driver.c | 14 +- src/qemu/qemu_hotplug.c | 7 +- src/qemu/qemu_process.c | 24 +- 17 files changed, 1835 insertions(+), 455 deletions(-)
Nevertheless, I'm fixing all the issues I've raised and merging.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal
-- Regards, Praveen K Paladugu

On 2/1/22 18:22, Praveen K Paladugu wrote:
On 1/28/2022 10:05 AM, Michal Prívozník wrote:
On 1/25/22 17:19, Praveen K Paladugu wrote:
This patchset adds support for cgroup management of ch threads. This version correctly manages cgroups for vcpu and emulator threads created by ch. cgroup management for iothreads is not yet supported.
Along with cgroup management, this patchset also enables support for pinning vcpu and emulator threads to selected host cpus.
And also does a lot of formatting changes back and forth. I'm not fond of that really. If you want to clean up the formatting please do so in a separate patch(set).
Michal,
My apologies for the churn related to formatting. I ran "GNU intend" these patches assuming that would correctly handle all the formatting. That caused a ton of formatting issues, I tried to revert most of it.
Seems like I missed a few instances. I will pay attention to the formatting in the follow up submissions.
Usually it helps to have your favorite editor set up so that it formats the code as you write it. This is more or less what I have in my ~/.vimrc: https://libvirt.org/coding-style.html#code-indentation
I took note of Persistent Def and handling of NUMA Nodeset in cloud-hypervisor. The original author for this patch is no longer with Microsoft. I will take some time to figure this out and submit patches are necessary.
Alright, thank you. Michal
participants (2)
-
Michal Prívozník
-
Praveen K Paladugu