[libvirt] [PATCH v2 0/2] add blkio.weight_device support

This series adds support for blkio.weight_device. changes from v1: - update remote_protocol-structs to make `make check` pass - fix some memleaks - compared the sizes of remote_typed_param before and after patch 1 using pdwtags, doesn't change - libvirtd(before patch) returns an error message error: Unable to decode message payload if connect to it and send a remote_typed_param(after patch) via virsh(after patch) Hu Tao (2): Add VIR_TYPED_PARAM_STRING add interface for blkio.weight_device daemon/remote.c | 20 +++++ include/libvirt/libvirt.h.in | 13 +++- src/conf/domain_conf.c | 142 ++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 15 ++++ src/libvirt_private.syms | 1 + src/qemu/qemu_cgroup.c | 22 ++++++ src/qemu/qemu_driver.c | 170 +++++++++++++++++++++++++++++++++++++++++- src/remote/remote_driver.c | 15 ++++ src/remote/remote_protocol.x | 2 + src/remote_protocol-structs | 2 + src/util/cgroup.c | 33 ++++++++ src/util/cgroup.h | 3 + tools/virsh.c | 31 ++++++++ tools/virsh.pod | 5 +- 14 files changed, 467 insertions(+), 7 deletions(-) -- 1.7.3.1

--- daemon/remote.c | 15 +++++++++++++++ include/libvirt/libvirt.h.in | 4 +++- src/remote/remote_driver.c | 15 +++++++++++++++ src/remote/remote_protocol.x | 2 ++ src/remote_protocol-structs | 2 ++ 5 files changed, 37 insertions(+), 1 deletions(-) diff --git a/daemon/remote.c b/daemon/remote.c index 38bbb10..a9d0daa 100644 --- a/daemon/remote.c +++ b/daemon/remote.c @@ -613,6 +613,13 @@ remoteSerializeTypedParameters(virTypedParameterPtr params, case VIR_TYPED_PARAM_BOOLEAN: val[i].value.remote_typed_param_value_u.b = params[i].value.b; break; + case VIR_TYPED_PARAM_STRING: + val[i].value.remote_typed_param_value_u.s = strdup(params[i].value.s); + if (val[i].value.remote_typed_param_value_u.s == NULL) { + virReportOOMError(); + goto cleanup; + } + break; default: virNetError(VIR_ERR_RPC, _("unknown parameter type: %d"), params[i].type); @@ -691,6 +698,14 @@ remoteDeserializeTypedParameters(remote_typed_param *args_params_val, params[i].value.b = args_params_val[i].value.remote_typed_param_value_u.b; break; + case VIR_TYPED_PARAM_STRING: + params[i].value.s = + strdup(args_params_val[i].value.remote_typed_param_value_u.s); + if (params[i].value.s == NULL) { + virReportOOMError(); + goto cleanup; + } + break; default: virNetError(VIR_ERR_INTERNAL_ERROR, _("unknown parameter type: %d"), params[i].type); diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 5fa489e..e57241c 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -488,7 +488,8 @@ typedef enum { VIR_TYPED_PARAM_LLONG = 3, /* long long case */ VIR_TYPED_PARAM_ULLONG = 4, /* unsigned long long case */ VIR_TYPED_PARAM_DOUBLE = 5, /* double case */ - VIR_TYPED_PARAM_BOOLEAN = 6 /* boolean(character) case */ + VIR_TYPED_PARAM_BOOLEAN = 6, /* boolean(character) case */ + VIR_TYPED_PARAM_STRING = 7 /* string case */ } virTypedParameterType; /** @@ -519,6 +520,7 @@ struct _virTypedParameter { unsigned long long int ul; /* type is ULLONG */ double d; /* type is DOUBLE */ char b; /* type is BOOLEAN */ + char *s; /* type is STRING */ } value; /* parameter value */ }; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 9d34b7e..f4cdc2e 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -1276,6 +1276,13 @@ remoteSerializeTypedParameters(virTypedParameterPtr params, case VIR_TYPED_PARAM_BOOLEAN: val[i].value.remote_typed_param_value_u.b = params[i].value.b; break; + case VIR_TYPED_PARAM_STRING: + val[i].value.remote_typed_param_value_u.s = strdup(params[i].value.s); + if (val[i].value.remote_typed_param_value_u.s == NULL) { + virReportOOMError(); + goto cleanup; + } + break; default: remoteError(VIR_ERR_RPC, _("unknown parameter type: %d"), params[i].type); @@ -1347,6 +1354,14 @@ remoteDeserializeTypedParameters(remote_typed_param *ret_params_val, params[i].value.b = ret_params_val[i].value.remote_typed_param_value_u.b; break; + case VIR_TYPED_PARAM_STRING: + params[i].value.s = + strdup(ret_params_val[i].value.remote_typed_param_value_u.s); + if (params[i].value.s == NULL) { + virReportOOMError(); + goto cleanup; + } + break; default: remoteError(VIR_ERR_RPC, _("unknown parameter type: %d"), params[i].type); diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 4ec1c57..93e6374 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -317,6 +317,8 @@ union remote_typed_param_value switch (int type) { double d; case VIR_TYPED_PARAM_BOOLEAN: int b; + case VIR_TYPED_PARAM_STRING: + remote_nonnull_string s; }; struct remote_typed_param { diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 27178da..3c88258 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -6,6 +6,7 @@ enum { VIR_TYPED_PARAM_ULLONG = 4, VIR_TYPED_PARAM_DOUBLE = 5, VIR_TYPED_PARAM_BOOLEAN = 6, + VIR_TYPED_PARAM_STRING = 7, }; struct remote_nonnull_domain { remote_nonnull_string name; @@ -78,6 +79,7 @@ struct remote_typed_param_value { uint64_t ul; double d; int b; + remote_nonnull_string s; } remote_typed_param_value_u; }; struct remote_typed_param { -- 1.7.3.1

On 09/08/2011 12:59 AM, Hu Tao wrote:
--- daemon/remote.c | 15 +++++++++++++++
Needs a commit comment, documenting compat issues (newer client talking to older server; older client talking to newer server). See my email on v1 about how I think we also need the addition of a new flag, and that no hypervisor driver will send a string unless the new flag is present to say strings are safe.
/** @@ -519,6 +520,7 @@ struct _virTypedParameter { unsigned long long int ul; /* type is ULLONG */ double d; /* type is DOUBLE */ char b; /* type is BOOLEAN */ + char *s; /* type is STRING */
Can s legitimately be NULL, or is it guaranteed to be non-NULL?
} value; /* parameter value */ };
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 9d34b7e..f4cdc2e 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -1276,6 +1276,13 @@ remoteSerializeTypedParameters(virTypedParameterPtr params, case VIR_TYPED_PARAM_BOOLEAN: val[i].value.remote_typed_param_value_u.b = params[i].value.b; break; + case VIR_TYPED_PARAM_STRING: + val[i].value.remote_typed_param_value_u.s = strdup(params[i].value.s);
If s can be NULL, this will crash. If s cannot be NULL, then where to we validate it? -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

This patch adds a parameter --weight-device to virsh command blkiotune for setting/getting blkio.weight_device. --- daemon/remote.c | 5 + include/libvirt/libvirt.h.in | 9 ++ src/conf/domain_conf.c | 142 ++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 15 ++++ src/libvirt_private.syms | 1 + src/qemu/qemu_cgroup.c | 22 ++++++ src/qemu/qemu_driver.c | 170 +++++++++++++++++++++++++++++++++++++++++- src/util/cgroup.c | 33 ++++++++ src/util/cgroup.h | 3 + tools/virsh.c | 31 ++++++++ tools/virsh.pod | 5 +- 11 files changed, 430 insertions(+), 6 deletions(-) diff --git a/daemon/remote.c b/daemon/remote.c index a9d0daa..ec91526 100644 --- a/daemon/remote.c +++ b/daemon/remote.c @@ -1503,6 +1503,7 @@ remoteDispatchDomainGetBlkioParameters(virNetServerPtr server ATTRIBUTE_UNUSED, int nparams = args->nparams; unsigned int flags; int rv = -1; + int i; struct daemonClientPrivate *priv = virNetServerClientGetPrivateData(client); @@ -1547,6 +1548,10 @@ success: cleanup: if (rv < 0) virNetMessageSaveError(rerr); + for (i = 0; i < nparams; i++) { + if (params[i].type == VIR_TYPED_PARAM_STRING) + VIR_FREE(params[i].value.s); + } VIR_FREE(params); if (dom) virDomainFree(dom); diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index e57241c..c65d8f7 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1137,6 +1137,15 @@ char * virDomainGetSchedulerType(virDomainPtr domain, #define VIR_DOMAIN_BLKIO_WEIGHT "weight" +/** + * VIR_DOMAIN_BLKIO_WEIGHT_DEVICE: + * + * Macro for the blkio tunable weight_device: it represents the + * per device weight. + */ + +#define VIR_DOMAIN_BLKIO_WEIGHT_DEVICE "weight_device" + /* Set Blkio tunables for the domain*/ int virDomainSetBlkioParameters(virDomainPtr domain, virTypedParameterPtr params, diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 74f8d6a..d10e30c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -565,6 +565,108 @@ VIR_ENUM_IMPL(virDomainNumatuneMemMode, VIR_DOMAIN_NUMATUNE_MEM_LAST, #define VIR_DOMAIN_XML_WRITE_FLAGS VIR_DOMAIN_XML_SECURE #define VIR_DOMAIN_XML_READ_FLAGS VIR_DOMAIN_XML_INACTIVE +/** + * virBlkioWeightDeviceToStr: + * + * This function returns a string representing device weights that is + * suitable for writing to /cgroup/blkio/blkio.weight_device, given + * a list of weight devices. + */ +int virBlkioWeightDeviceToStr(virBlkioWeightDevicePtr weightdevices, + int ndevices, + char **result) +{ + int len = 0; + int ret = -1; + int i, j; + char **weight_devices; + char *str; + + if (VIR_ALLOC_N(weight_devices, ndevices) < 0) { + goto fail_nomem1; + } + for (i = 0; i < ndevices; i++) { + int tmp; + tmp = virAsprintf(&weight_devices[i], "%d:%d %d", + weightdevices[i].major, + weightdevices[i].minor, + weightdevices[i].weight); + if (tmp < 0) { + goto fail_nomem2; + } + len += tmp + 1; /* 1 for '\n' and the trailing '\0' */ + } + + if (VIR_ALLOC_N(str, len) < 0) { + goto fail_nomem2; + } + for (i = 0; i < ndevices; i++) { + strcat(str, weight_devices[i]); + strcat(str, "\n"); + } + str[len-1] = '\0'; + + *result = str; + + ret = 0; + +fail_nomem2: + for (j = 0; j < i; j++) + VIR_FREE(weight_devices[i]); + VIR_FREE(weight_devices); +fail_nomem1: + if (ret != 0) + virReportOOMError(); + return ret; +} + +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */ +static int virDomainBlkioWeightDeviceParseXML(xmlNodePtr root, + virBlkioWeightDevicePtr dw) +{ + char *c; + xmlNodePtr node; + + if (!dw) + return -1; + + node = root->children; + while (node) { + if (node->type == XML_ELEMENT_NODE) { + if (xmlStrEqual(node->name, BAD_CAST "major")) { + c = (char *)xmlNodeGetContent(node); + dw->major = atoi(c); + VIR_FREE(c); + } else if (xmlStrEqual(node->name, BAD_CAST "minor")) { + c = (char *)xmlNodeGetContent(node); + dw->minor = atoi(c); + VIR_FREE(c); + } else if (xmlStrEqual(node->name, BAD_CAST "weight")) { + c = (char *)xmlNodeGetContent(node); + dw->weight = atoi(c); + VIR_FREE(c); + } + } + node = node->next; + } + + return 0; +} + + + static void virDomainObjListDataFree(void *payload, const void *name ATTRIBUTE_UNUSED) { @@ -1225,6 +1327,8 @@ void virDomainDefFree(virDomainDefPtr def) VIR_FREE(def->emulator); VIR_FREE(def->description); + VIR_FREE(def->blkio.weight_devices); + virDomainWatchdogDefFree(def->watchdog); virDomainMemballoonDefFree(def->memballoon); @@ -6459,6 +6563,20 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps, &def->blkio.weight) < 0) def->blkio.weight = 0; + n = virXPathNodeSet("./blkiotune/device", ctxt, &nodes); + if (n > 0) { + if (VIR_ALLOC_N(def->blkio.weight_devices, n) < 0) { + virReportOOMError(); + goto error; + } + + for (i = 0; i < n; i++) { + virDomainBlkioWeightDeviceParseXML(nodes[i], &def->blkio.weight_devices[i]); + } + def->blkio.ndevices = n; + VIR_FREE(nodes); + } + /* Extract other memory tunables */ if (virXPathULong("string(./memtune/hard_limit)", ctxt, &def->mem.hard_limit) < 0) @@ -10474,10 +10592,28 @@ virDomainDefFormatInternal(virDomainDefPtr def, def->mem.cur_balloon); /* add blkiotune only if there are any */ - if (def->blkio.weight) { + if (def->blkio.weight || def->blkio.weight_devices) { virBufferAsprintf(buf, " <blkiotune>\n"); - virBufferAsprintf(buf, " <weight>%u</weight>\n", - def->blkio.weight); + + if (def->blkio.weight) + virBufferAsprintf(buf, " <weight>%u</weight>\n", + def->blkio.weight); + + if (def->blkio.weight_devices) { + int i; + + for (i = 0; i < def->blkio.ndevices; i++) { + virBufferAsprintf(buf, " <device>\n"); + virBufferAsprintf(buf, " <major>%d</major>\n", + def->blkio.weight_devices[i].major); + virBufferAsprintf(buf, " <minor>%d</minor>\n", + def->blkio.weight_devices[i].minor); + virBufferAsprintf(buf, " <weight>%d</weight>\n", + def->blkio.weight_devices[i].weight); + virBufferAsprintf(buf, " </device>\n"); + } + } + virBufferAsprintf(buf, " </blkiotune>\n"); } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 371f270..267eb95 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1276,6 +1276,19 @@ struct _virDomainNumatuneDef { /* Future NUMA tuning related stuff should go here. */ }; +typedef struct _virBlkioWeightDevice virBlkioWeightDevice; +typedef virBlkioWeightDevice *virBlkioWeightDevicePtr; +struct _virBlkioWeightDevice { + int major; + int minor; + int weight; +}; + +int virBlkioWeightDeviceToStr(virBlkioWeightDevicePtr deviceWeights, + int ndevices, + char **result); + + /* * Guest VM main configuration * @@ -1293,6 +1306,8 @@ struct _virDomainDef { struct { unsigned int weight; + int ndevices; + virBlkioWeightDevicePtr weight_devices; } blkio; struct { diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 0780f2d..597cd97 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -227,6 +227,7 @@ virDomainAuditVcpu; # domain_conf.h +virBlkioWeightDeviceToStr; virDiskNameToBusDeviceIndex; virDiskNameToIndex; virDomainActualNetDefFree; diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 2a10bd2..eb06871 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -312,6 +312,28 @@ int qemuSetupCgroup(struct qemud_driver *driver, } } + if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + for (i = 0; i < vm->def->blkio.ndevices; i++) { + char *tmp = NULL; + + virAsprintf(&tmp, "%d:%d %d", + vm->def->blkio.weight_devices[i].major, + vm->def->blkio.weight_devices[i].minor, + vm->def->blkio.weight_devices[i].weight); + if (tmp) { + rc = virCgroupSetBlkioWeightDevice(cgroup, tmp); + VIR_FREE(tmp); + tmp = NULL; + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to set io device weight for domain %s"), + vm->def->name); + goto cleanup; + } + } + } + } + if (vm->def->mem.hard_limit != 0 || vm->def->mem.soft_limit != 0 || vm->def->mem.swap_hard_limit != 0) { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1925ba5..a9e9ad5 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -44,6 +44,7 @@ #include <sys/ioctl.h> #include <sys/un.h> #include <byteswap.h> +#include <ctype.h> #include "qemu_driver.h" @@ -112,7 +113,7 @@ # define KVM_CAP_NR_VCPUS 9 /* returns max vcpus per vm */ #endif -#define QEMU_NB_BLKIO_PARAM 1 +#define QEMU_NB_BLKIO_PARAM 2 static void processWatchdogEvent(void *data, void *opaque); @@ -6044,6 +6045,86 @@ cleanup: return ret; } +/* weightDeviceStr in the form of major:minor,weight;nmajor:minor,weight + * for example, 8:0,300;8:16,800;8:32,1000 + */ +static int parseBlkioWeightDeviceStr(const char *weightDeviceStr, virBlkioWeightDevicePtr *dw, int *size) +{ + const char *temp; + int nDevice = 0; + int i; + virBlkioWeightDevicePtr result = NULL; + + if (!dw) + return -1; + if (*dw) + return -1; + + temp = weightDeviceStr; + while (temp) { + temp = strchr(temp, ';'); + if (temp) { + temp++; + if (*temp == '\0') + break; + nDevice++; + } + } + nDevice++; + + if (VIR_ALLOC_N(result, nDevice) < 0) { + virReportOOMError(); + return -1; + } + + i = 0; + temp = weightDeviceStr; + while (temp && i < nDevice) { + const char *p = temp; + while (isdigit(*p) && ++p); + if (!p || *p != ':') + goto fail; + + result[i].major = atoi(temp); + + temp = p + 1; + if (!temp) + goto fail; + + p = temp; + while (isdigit(*p) && ++p); + if (!p || *p != ',') + goto fail; + + result[i].minor = atoi(temp); + + temp = p + 1; + if (!temp) + goto fail; + + p = temp; + while (isdigit(*p) && ++p); + if (!p || (*p != ';' && *p != '\0')) + goto fail; + + result[i].weight = atoi(temp); + + i++; + if (*p == '\0') + break; + temp = p + 1; + } + + *dw = result; + *size = i; + + return 0; + +fail: + VIR_FREE(result); + return -1; +} + static int qemuDomainSetBlkioParameters(virDomainPtr dom, virTypedParameterPtr params, int nparams, @@ -6110,10 +6191,10 @@ static int qemuDomainSetBlkioParameters(virDomainPtr dom, ret = 0; if (flags & VIR_DOMAIN_AFFECT_LIVE) { for (i = 0; i < nparams; i++) { + int rc; virTypedParameterPtr param = ¶ms[i]; if (STREQ(param->field, VIR_DOMAIN_BLKIO_WEIGHT)) { - int rc; if (param->type != VIR_TYPED_PARAM_UINT) { qemuReportError(VIR_ERR_INVALID_ARG, "%s", _("invalid type for blkio weight tunable, expected a 'unsigned int'")); @@ -6134,6 +6215,44 @@ static int qemuDomainSetBlkioParameters(virDomainPtr dom, _("unable to set blkio weight tunable")); ret = -1; } + } else if(STREQ(param->field, VIR_DOMAIN_BLKIO_WEIGHT_DEVICE)) { + int ndevices; + virBlkioWeightDevicePtr tmp = NULL; + if (param->type != VIR_TYPED_PARAM_STRING) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("invalid type for blkio weight_device tunable, expected a 'char *'")); + ret = -1; + continue; + } + + if (parseBlkioWeightDeviceStr(params[i].value.s, + &tmp, + &ndevices) < 0) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("invalid format for blkio weight_device")); + ret = -1; + continue; + } + for (i = 0; i < ndevices; i++) { + char *weight_device = NULL; + + virAsprintf(&weight_device, "%d:%d %d", + tmp[i].major, + tmp[i].minor, + tmp[i].weight); + if (weight_device) { + rc = virCgroupSetBlkioWeightDevice(group, weight_device); + VIR_FREE(weight_device); + weight_device = NULL; + if (rc != 0) { + virReportSystemError(-rc, "%s", + _("unable to set blkio weight_device tunable")); + ret = -1; + break; + } + } + } + VIR_FREE(tmp); } else { qemuReportError(VIR_ERR_INVALID_ARG, _("Parameter `%s' not supported"), param->field); @@ -6163,6 +6282,21 @@ static int qemuDomainSetBlkioParameters(virDomainPtr dom, } persistentDef->blkio.weight = params[i].value.ui; + } else if (STREQ(param->field, VIR_DOMAIN_BLKIO_WEIGHT_DEVICE)) { + virBlkioWeightDevicePtr tmp = NULL; + int ndevices; + if (parseBlkioWeightDeviceStr(params[i].value.s, + &tmp, + &ndevices) < 0) { + qemuReportError(VIR_ERR_INVALID_ARG, + _("invalid device weight format: %s"), + params[i].value.s); + ret = -1; + continue; + } + VIR_FREE(persistentDef->blkio.weight_devices); + persistentDef->blkio.weight_devices = tmp; + persistentDef->blkio.ndevices = ndevices; } else { qemuReportError(VIR_ERR_INVALID_ARG, _("Parameter `%s' not supported"), param->field); @@ -6262,6 +6396,7 @@ static int qemuDomainGetBlkioParameters(virDomainPtr dom, if (flags & VIR_DOMAIN_AFFECT_LIVE) { for (i = 0; i < *nparams; i++) { + char *weight_device; virTypedParameterPtr param = ¶ms[i]; val = 0; param->value.ui = 0; @@ -6282,6 +6417,21 @@ static int qemuDomainGetBlkioParameters(virDomainPtr dom, } param->value.ui = val; break; + case 1: /* blkio.weight_device */ + param->type = VIR_TYPED_PARAM_STRING; + rc = virCgroupGetBlkioWeightDevice(group, &weight_device); + if (rc != 0) { + virReportSystemError(-rc, "%s", + _("unable to get blkio weight_device")); + goto cleanup; + } + if (virStrcpyStatic(param->field, VIR_DOMAIN_BLKIO_WEIGHT_DEVICE) == NULL) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Field blkio weight_device too long for destination")); + goto cleanup; + } + param->value.s = weight_device; + break; default: break; @@ -6305,6 +6455,22 @@ static int qemuDomainGetBlkioParameters(virDomainPtr dom, param->value.ui = persistentDef->blkio.weight; break; + case 1: /* blkio.weight_device */ + { + if (virBlkioWeightDeviceToStr(persistentDef->blkio.weight_devices, + persistentDef->blkio.ndevices, + ¶m->value.s) < 0) + goto cleanup; + param->type = VIR_TYPED_PARAM_STRING; + if (virStrcpyStatic(param->field, VIR_DOMAIN_BLKIO_WEIGHT_DEVICE) == NULL) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Field blkio weight_device too long for destination")); + goto cleanup; + } + + break; + } + default: break; /* should not hit here */ diff --git a/src/util/cgroup.c b/src/util/cgroup.c index c8d1f33..f6e5a08 100644 --- a/src/util/cgroup.c +++ b/src/util/cgroup.c @@ -982,6 +982,39 @@ int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight) } /** + * virCgroupSetBlkioWeightDevice: + * + * @group: The cgroup to change io weight device for + * @weight_device: The Weight device for this cgroup + * + * Returns: 0 on success + */ +int virCgroupSetBlkioWeightDevice(virCgroupPtr group, + const char *weight_device) +{ + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "blkio.weight_device", + weight_device); +} + +/** + * virCgroupGetBlkioWeightDevice: + * + * @group: The cgroup to get weight_device for + * @weight_device: returned weight_device string + * + * Returns: 0 on success + */ +int virCgroupGetBlkioWeightDevice(virCgroupPtr group, + char **weight_device) +{ + return virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "blkio.weight_device", weight_device); +} + +/** * virCgroupSetMemory: * * @group: The cgroup to change memory for diff --git a/src/util/cgroup.h b/src/util/cgroup.h index d190bb3..87e196b 100644 --- a/src/util/cgroup.h +++ b/src/util/cgroup.h @@ -55,6 +55,9 @@ int virCgroupAddTask(virCgroupPtr group, pid_t pid); int virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight); int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight); +int virCgroupSetBlkioWeightDevice(virCgroupPtr group, const char *weight_device); +int virCgroupGetBlkioWeightDevice(virCgroupPtr group, char **weight_device); + int virCgroupSetMemory(virCgroupPtr group, unsigned long long kb); int virCgroupGetMemoryUsage(virCgroupPtr group, unsigned long *kb); diff --git a/tools/virsh.c b/tools/virsh.c index 0f00463..8301efc 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -4557,6 +4557,8 @@ static const vshCmdOptDef opts_blkiotune[] = { {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, {"weight", VSH_OT_INT, VSH_OFLAG_NONE, N_("IO Weight in range [100, 1000]")}, + {"weight-device", VSH_OT_STRING, VSH_OFLAG_NONE, + N_("per device IO Weight, in the form of major:minor,weight")}, {"config", VSH_OT_BOOL, 0, N_("affect next boot")}, {"live", VSH_OT_BOOL, 0, N_("affect running domain")}, {"current", VSH_OT_BOOL, 0, N_("affect current domain")}, @@ -4567,6 +4569,7 @@ static bool cmdBlkiotune(vshControl * ctl, const vshCmd * cmd) { virDomainPtr dom; + const char *weight_device = NULL; int weight = 0; int nparams = 0; int rv = 0; @@ -4611,6 +4614,16 @@ cmdBlkiotune(vshControl * ctl, const vshCmd * cmd) } } + rv = vshCommandOptString(cmd, "weight-device", &weight_device); + if (rv < 0) { + vshError(ctl, "%s", + _("Unable to parse string parameter")); + goto cleanup; + } + if (rv > 0) { + nparams++; + } + if (nparams == 0) { /* get the number of blkio parameters */ if (virDomainGetBlkioParameters(dom, NULL, &nparams, flags) != 0) { @@ -4658,6 +4671,10 @@ cmdBlkiotune(vshControl * ctl, const vshCmd * cmd) vshPrint(ctl, "%-15s: %d\n", params[i].field, params[i].value.b); break; + case VIR_TYPED_PARAM_STRING: + vshPrint(ctl, "%-15s: %s\n", params[i].field, + params[i].value.s); + break; default: vshPrint(ctl, "unimplemented blkio parameter type\n"); } @@ -4678,6 +4695,20 @@ cmdBlkiotune(vshControl * ctl, const vshCmd * cmd) sizeof(temp->field)); weight = 0; } + + if (weight_device) { + int len = strlen(weight_device) + 1; + temp->type = VIR_TYPED_PARAM_STRING; + temp->value.s = vshMalloc(ctl, len); + if (!temp->value.s) { + goto cleanup; + } + strncpy(temp->value.s, weight_device, strlen(weight_device)); + temp->value.s[len-1] = '\0'; + weight_device = NULL; + strncpy(temp->field, VIR_DOMAIN_BLKIO_WEIGHT_DEVICE, + sizeof(temp->field)); + } } if (virDomainSetBlkioParameters(dom, params, nparams, flags) != 0) vshError(ctl, "%s", _("Unable to change blkio parameters")); diff --git a/tools/virsh.pod b/tools/virsh.pod index d826997..1f83237 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -976,12 +976,15 @@ value are kilobytes (i.e. blocks of 1024 bytes). =back -=item B<blkiotune> I<domain-id> [I<--weight> B<weight>] [[I<--config>] +=item B<blkiotune> I<domain-id> [I<--weight> B<weight>] +[I<--weight-device> B<weight-device>] [[I<--config>] [I<--live>] | [I<--current>]] Display or set the blkio parameters. QEMU/KVM supports I<--weight>. I<--weight> is in range [100, 1000]. +B<weight-device> is in the format of major:minor,weight;major:minor,weight. + If I<--live> is specified, affect a running guest. If I<--config> is specified, affect the next boot of a persistent guest. If I<--current> is specified, affect the current guest state. -- 1.7.3.1

On Thu, Sep 08, 2011 at 02:59:56PM +0800, Hu Tao wrote:
This patch adds a parameter --weight-device to virsh command blkiotune for setting/getting blkio.weight_device. --- daemon/remote.c | 5 + include/libvirt/libvirt.h.in | 9 ++ src/conf/domain_conf.c | 142 ++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 15 ++++ src/libvirt_private.syms | 1 + src/qemu/qemu_cgroup.c | 22 ++++++ src/qemu/qemu_driver.c | 170 +++++++++++++++++++++++++++++++++++++++++- src/util/cgroup.c | 33 ++++++++ src/util/cgroup.h | 3 + tools/virsh.c | 31 ++++++++ tools/virsh.pod | 5 +- 11 files changed, 430 insertions(+), 6 deletions(-)
diff --git a/daemon/remote.c b/daemon/remote.c index a9d0daa..ec91526 100644 --- a/daemon/remote.c +++ b/daemon/remote.c @@ -1503,6 +1503,7 @@ remoteDispatchDomainGetBlkioParameters(virNetServerPtr server ATTRIBUTE_UNUSED, int nparams = args->nparams; unsigned int flags; int rv = -1; + int i; struct daemonClientPrivate *priv = virNetServerClientGetPrivateData(client);
@@ -1547,6 +1548,10 @@ success: cleanup: if (rv < 0) virNetMessageSaveError(rerr); + for (i = 0; i < nparams; i++) { + if (params[i].type == VIR_TYPED_PARAM_STRING) + VIR_FREE(params[i].value.s); + } VIR_FREE(params); if (dom) virDomainFree(dom); diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index e57241c..c65d8f7 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1137,6 +1137,15 @@ char * virDomainGetSchedulerType(virDomainPtr domain,
#define VIR_DOMAIN_BLKIO_WEIGHT "weight"
+/** + * VIR_DOMAIN_BLKIO_WEIGHT_DEVICE: + * + * Macro for the blkio tunable weight_device: it represents the + * per device weight. + */ + +#define VIR_DOMAIN_BLKIO_WEIGHT_DEVICE "weight_device" + /* Set Blkio tunables for the domain*/ int virDomainSetBlkioParameters(virDomainPtr domain, virTypedParameterPtr params, diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 74f8d6a..d10e30c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -565,6 +565,108 @@ VIR_ENUM_IMPL(virDomainNumatuneMemMode, VIR_DOMAIN_NUMATUNE_MEM_LAST, #define VIR_DOMAIN_XML_WRITE_FLAGS VIR_DOMAIN_XML_SECURE #define VIR_DOMAIN_XML_READ_FLAGS VIR_DOMAIN_XML_INACTIVE
+/** + * virBlkioWeightDeviceToStr: + * + * This function returns a string representing device weights that is + * suitable for writing to /cgroup/blkio/blkio.weight_device, given + * a list of weight devices. + */ +int virBlkioWeightDeviceToStr(virBlkioWeightDevicePtr weightdevices, + int ndevices, + char **result) +{ + int len = 0; + int ret = -1; + int i, j; + char **weight_devices; + char *str; + + if (VIR_ALLOC_N(weight_devices, ndevices) < 0) { + goto fail_nomem1; + } + for (i = 0; i < ndevices; i++) { + int tmp; + tmp = virAsprintf(&weight_devices[i], "%d:%d %d", + weightdevices[i].major, + weightdevices[i].minor, + weightdevices[i].weight); + if (tmp < 0) { + goto fail_nomem2; + } + len += tmp + 1; /* 1 for '\n' and the trailing '\0' */ + } + + if (VIR_ALLOC_N(str, len) < 0) { + goto fail_nomem2; + } + for (i = 0; i < ndevices; i++) { + strcat(str, weight_devices[i]); + strcat(str, "\n"); + } + str[len-1] = '\0'; + + *result = str; + + ret = 0; + +fail_nomem2: + for (j = 0; j < i; j++) + VIR_FREE(weight_devices[i]); + VIR_FREE(weight_devices); +fail_nomem1: + if (ret != 0) + virReportOOMError(); + return ret; +}
I think this method would become alot simpler if you switch over to use virBufferPtr for all the string concatenation and formatting.
+ +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */
I'm not really seeing the benefit in using major, minor in the XML for this. The <disk> element is using the /dev/hda1 path for the host device, so I'd expect the same path to be usable for the block I/O tuning. How does the scope work here, does major,minor have to refer to a block device, or can it refer to a partition ? If we have multiple <device> elements, each giving a different partition on the same device can we set different weight for each partition ? Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

+/** + * virBlkioWeightDeviceToStr: + * + * This function returns a string representing device weights that is + * suitable for writing to /cgroup/blkio/blkio.weight_device, given + * a list of weight devices. + */ +int virBlkioWeightDeviceToStr(virBlkioWeightDevicePtr weightdevices, + int ndevices, + char **result) +{ + int len = 0; + int ret = -1; + int i, j; + char **weight_devices; + char *str; + + if (VIR_ALLOC_N(weight_devices, ndevices) < 0) { + goto fail_nomem1; + } + for (i = 0; i < ndevices; i++) { + int tmp; + tmp = virAsprintf(&weight_devices[i], "%d:%d %d", + weightdevices[i].major, + weightdevices[i].minor, + weightdevices[i].weight); + if (tmp < 0) { + goto fail_nomem2; + } + len += tmp + 1; /* 1 for '\n' and the trailing '\0' */ + } + + if (VIR_ALLOC_N(str, len) < 0) { + goto fail_nomem2; + } + for (i = 0; i < ndevices; i++) { + strcat(str, weight_devices[i]); + strcat(str, "\n"); + } + str[len-1] = '\0'; + + *result = str; + + ret = 0; + +fail_nomem2: + for (j = 0; j < i; j++) + VIR_FREE(weight_devices[i]); + VIR_FREE(weight_devices); +fail_nomem1: + if (ret != 0) + virReportOOMError(); + return ret; +}
I think this method would become alot simpler if you switch over to use virBufferPtr for all the string concatenation and formatting.
Thanks, I'll look into this.
+ +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */
I'm not really seeing the benefit in using major, minor in the XML for this. The <disk> element is using the /dev/hda1 path for the host device, so I'd expect the same path to be usable for the block I/O tuning.
How does the scope work here, does major,minor have to refer to a block device, or can it refer to a partition ? If we have multiple <device> elements, each giving a different partition on the same device can we set different weight for each partition ?
blkio doesn't support io control on a partition, so I add major, minor to refer to a block device as a whole, rather than utilizing the existing <disk> element. -- Thanks, Hu Tao

On Thu, Sep 15, 2011 at 03:31:48PM +0800, Hu Tao wrote:
+ +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */
I'm not really seeing the benefit in using major, minor in the XML for this. The <disk> element is using the /dev/hda1 path for the host device, so I'd expect the same path to be usable for the block I/O tuning.
How does the scope work here, does major,minor have to refer to a block device, or can it refer to a partition ? If we have multiple <device> elements, each giving a different partition on the same device can we set different weight for each partition ?
blkio doesn't support io control on a partition, so I add major, minor to refer to a block device as a whole, rather than utilizing the existing <disk> element.
Hmm, well I think we should still just use the '/dev/hda' block device path. The major/minor numbers cannot be assumed to be stable across different hosts, so would break migration. Whereas you can use the /dev/disk/by-{id,path} symlinks for block paths and thus be migration safe. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Sep 15, 2011 at 08:53:35AM +0100, Daniel P. Berrange wrote:
On Thu, Sep 15, 2011 at 03:31:48PM +0800, Hu Tao wrote:
+ +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */
I'm not really seeing the benefit in using major, minor in the XML for this. The <disk> element is using the /dev/hda1 path for the host device, so I'd expect the same path to be usable for the block I/O tuning.
How does the scope work here, does major,minor have to refer to a block device, or can it refer to a partition ? If we have multiple <device> elements, each giving a different partition on the same device can we set different weight for each partition ?
blkio doesn't support io control on a partition, so I add major, minor to refer to a block device as a whole, rather than utilizing the existing <disk> element.
Hmm, well I think we should still just use the '/dev/hda' block device path. The major/minor numbers cannot be assumed to be stable across different hosts, so would break migration. Whereas you can use the /dev/disk/by-{id,path} symlinks for block paths and thus be migration safe.
Good idea. Thanks for the suggestion. To summarize, the XML describing weight_device like this will be acceptable: <device> <id>by-id</id> (or path here) <weight>weight</weight> </device> Should the <device> be changed into <disk> to remind user that the id is in fact a disk id? -- Thanks, Hu Tao

On Thu, Sep 15, 2011 at 05:13:53PM +0800, Hu Tao wrote:
On Thu, Sep 15, 2011 at 08:53:35AM +0100, Daniel P. Berrange wrote:
On Thu, Sep 15, 2011 at 03:31:48PM +0800, Hu Tao wrote:
+ +/** + * virDomainBlkioWeightDeviceParseXML + * + * this function parses a XML node: + * + * <device> + * <major>major</major> + * <minor>minor</minor> + * <weight>weight</weight> + * </device> + * + * and fills a virBlkioWeightDevice struct. + */
I'm not really seeing the benefit in using major, minor in the XML for this. The <disk> element is using the /dev/hda1 path for the host device, so I'd expect the same path to be usable for the block I/O tuning.
How does the scope work here, does major,minor have to refer to a block device, or can it refer to a partition ? If we have multiple <device> elements, each giving a different partition on the same device can we set different weight for each partition ?
blkio doesn't support io control on a partition, so I add major, minor to refer to a block device as a whole, rather than utilizing the existing <disk> element.
Hmm, well I think we should still just use the '/dev/hda' block device path. The major/minor numbers cannot be assumed to be stable across different hosts, so would break migration. Whereas you can use the /dev/disk/by-{id,path} symlinks for block paths and thus be migration safe.
Good idea. Thanks for the suggestion. To summarize, the XML describing weight_device like this will be acceptable:
<device> <id>by-id</id> (or path here) <weight>weight</weight> </device>
Should the <device> be changed into <disk> to remind user that the id is in fact a disk id?
IMHO it should be <device> <path>/fully/qulaified/device/path</path> <weight>weight</weight> </device> Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
participants (3)
-
Daniel P. Berrange
-
Eric Blake
-
Hu Tao