[libvirt PATCH v4 0/5] add support for Fibre Channel VMID

changes in v4: - documentation fixes - allow only printable characters to be used in appid - properly check inode in qemuSetupCgroupAppid changes in v3: - rename the XML attribute to appid as technically what user provides is not VMID, that is created by kernel changes in v2: - refactor of resource parsing and formatting code - use <fibrechannel vmid=''/> element - use stat() directly - report only single system error Pavel Hrdina (5): vircgroup: introduce virCgroupGetInode function conf: refactor virDomainResourceDefParse conf: refactor virDomainResourceDefFormat conf: introduce support for Fibre Channel VMID qemu: implement support for Fibre Channel VMID docs/formatdomain.rst | 21 ++++++++++ docs/schemas/domaincommon.rng | 22 ++++++++-- src/conf/domain_conf.c | 42 +++++++++++-------- src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 34 +++++++++++++++ src/libvirt_private.syms | 1 + src/qemu/qemu_cgroup.c | 30 +++++++++++++ src/util/vircgroup.c | 31 ++++++++++++++ src/util/vircgroup.h | 2 + .../fibrechannel-appid.xml | 21 ++++++++++ tests/genericxml2xmltest.c | 2 + 11 files changed, 186 insertions(+), 21 deletions(-) create mode 100644 tests/genericxml2xmlindata/fibrechannel-appid.xml -- 2.31.1

For new feature Fibre Channel VMID we will need to get inode of the VM root cgroup as it is used in the new kernel API together with VMID. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 1 + src/util/vircgroup.c | 31 +++++++++++++++++++++++++++++++ src/util/vircgroup.h | 2 ++ 3 files changed, 34 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 51a400ba59..fcb02c21b1 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1922,6 +1922,7 @@ virCgroupGetCpuShares; virCgroupGetDevicePermsString; virCgroupGetDomainTotalCpuStats; virCgroupGetFreezerState; +virCgroupGetInode; virCgroupGetMemoryHardLimit; virCgroupGetMemorySoftLimit; virCgroupGetMemoryStat; diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1b3b28342e..4c9445340e 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -3973,3 +3973,34 @@ virCgroupGetCpuPeriodQuota(virCgroup *cgroup, unsigned long long *period, return 0; } + + +/** + * virCgroupGetInode: + * + * @cgroup: the cgroup to get inode for + * + * Get the @cgroup inode and return its value to the caller. + * + * Returns inode on success, -1 on error with error message reported. + */ +int +virCgroupGetInode(virCgroup *cgroup) +{ + struct stat st; + int controller = virCgroupGetAnyController(cgroup); + g_autofree char *path = NULL; + + if (controller < 0) + return -1; + + if (virCgroupPathOfController(cgroup, controller, "", &path) < 0) + return -1; + + if (stat(path, &st) < 0) { + virReportSystemError(errno, _("failed to get stat for '%s'"), path); + return -1; + } + + return st.st_ino; +} diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 8002cac570..690f09465c 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -283,3 +283,5 @@ int virCgroupSetOwner(virCgroup *cgroup, int virCgroupHasEmptyTasks(virCgroup *cgroup, int controller); bool virCgroupControllerAvailable(int controller); + +int virCgroupGetInode(virCgroup *cgroup); -- 2.31.1

There is no need to error out for empty <partition></partition> element as we can just simply ignore it. This allows to simplify the function and prepare it for new sub-elements of <resource>. It makes the <partition> element optional so we need to reflect the change in schema as well. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- docs/schemas/domaincommon.rng | 8 +++++--- src/conf/domain_conf.c | 18 +++++++----------- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 2442078969..9b669d9de5 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1172,9 +1172,11 @@ <define name="respartition"> <element name="resource"> - <element name="partition"> - <ref name="absFilePath"/> - </element> + <optional> + <element name="partition"> + <ref name="absFilePath"/> + </element> + </optional> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e9bdbbfd74..7dff6c8beb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17284,23 +17284,19 @@ virDomainResourceDefParse(xmlNodePtr node, { VIR_XPATH_NODE_AUTORESTORE(ctxt) virDomainResourceDef *def = NULL; + char *partition = NULL; ctxt->node = node; + partition = virXPathString("string(./partition)", ctxt); + + if (!partition) + return NULL; + def = g_new0(virDomainResourceDef, 1); - - /* Find out what type of virtualization to use */ - if (!(def->partition = virXPathString("string(./partition)", ctxt))) { - virReportError(VIR_ERR_INTERNAL_ERROR, - "%s", _("missing resource partition attribute")); - goto error; - } + def->partition = partition; return def; - - error: - virDomainResourceDefFree(def); - return NULL; } -- 2.31.1

Prepare the function for additional sub-elements where all of the sub-elements are optional. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/domain_conf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 7dff6c8beb..571650bfd3 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -26761,11 +26761,15 @@ static void virDomainResourceDefFormat(virBuffer *buf, virDomainResourceDef *def) { - virBufferAddLit(buf, "<resource>\n"); - virBufferAdjustIndent(buf, 2); - virBufferEscapeString(buf, "<partition>%s</partition>\n", def->partition); - virBufferAdjustIndent(buf, -2); - virBufferAddLit(buf, "</resource>\n"); + g_auto(virBuffer) childBuf = VIR_BUFFER_INIT_CHILD(buf); + + if (!def) + return; + + if (def->partition) + virBufferEscapeString(&childBuf, "<partition>%s</partition>\n", def->partition); + + virXMLFormatElement(buf, "resource", NULL, &childBuf); } @@ -27918,8 +27922,7 @@ virDomainDefFormatInternalSetRootName(virDomainDef *def, if (virDomainNumatuneFormatXML(buf, def->numa) < 0) return -1; - if (def->resource) - virDomainResourceDefFormat(buf, def->resource); + virDomainResourceDefFormat(buf, def->resource); for (i = 0; i < def->nsysinfo; i++) { if (virSysinfoFormat(buf, def->sysinfo[i]) < 0) -- 2.31.1

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- docs/formatdomain.rst | 21 ++++++++++++ docs/schemas/domaincommon.rng | 14 ++++++++ src/conf/domain_conf.c | 9 ++++- src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 34 +++++++++++++++++++ .../fibrechannel-appid.xml | 21 ++++++++++++ tests/genericxml2xmltest.c | 2 ++ 7 files changed, 101 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/fibrechannel-appid.xml diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 61ccd8895a..512e6abd27 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1221,6 +1221,27 @@ Resource partitions are currently supported by the QEMU and LXC drivers, which map partition paths to cgroups directories, in all mounted controllers. :since:`Since 1.0.5` +Fibre Channel VMID +------------------- + +The FC SAN can provide various QoS levels and access control depending on the +VMID. It can also collect telemetry data at per-VM level which can be used +to enhance the IO performance of the VM. This can be configured by using +the ``appid`` attribute of ``fibrechannel`` element. The attribute contains +single string (max 128 bytes) and it is used by kernel to create VMID. + +:: + + ... + <resource> + <fibrechannel appid='userProvidedID'/> + </resource> + ... + +Using this feature requires Fibre Channel capable HW, kernel compiled with +option ``CONFIG_BLK_CGROUP_FC_APPID`` and ``nvme_fc`` kernel module loaded. +:since:`Since 7.7.0` + :anchor:`<a id="elementsCPU"/>` CPU model and topology diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 9b669d9de5..11fa24f398 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -576,6 +576,17 @@ </element> </define> + <define name="fibrechannel"> + <element name="fibrechannel"> + <attribute name="appid"> + <data type="string"> + <!-- All printable characters --> + <param name="pattern">[ -~]{1,128}</param> + </data> + </attribute> + </element> + </define> + <!-- The Identifiers can be: - an optional id attribute with a number on the domain element @@ -1177,6 +1188,9 @@ <ref name="absFilePath"/> </element> </optional> + <optional> + <ref name="fibrechannel"/> + </optional> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 571650bfd3..926f831073 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3446,6 +3446,7 @@ virDomainResourceDefFree(virDomainResourceDef *resource) return; g_free(resource->partition); + g_free(resource->appid); g_free(resource); } @@ -17285,16 +17286,19 @@ virDomainResourceDefParse(xmlNodePtr node, VIR_XPATH_NODE_AUTORESTORE(ctxt) virDomainResourceDef *def = NULL; char *partition = NULL; + char *appid = NULL; ctxt->node = node; partition = virXPathString("string(./partition)", ctxt); + appid = virXPathString("string(./fibrechannel/@appid)", ctxt); - if (!partition) + if (!partition && !appid) return NULL; def = g_new0(virDomainResourceDef, 1); def->partition = partition; + def->appid = appid; return def; } @@ -26769,6 +26773,9 @@ virDomainResourceDefFormat(virBuffer *buf, if (def->partition) virBufferEscapeString(&childBuf, "<partition>%s</partition>\n", def->partition); + if (def->appid) + virBufferEscapeString(&childBuf, "<fibrechannel appid='%s'/>\n", def->appid); + virXMLFormatElement(buf, "resource", NULL, &childBuf); } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 748b1d5f30..c7e6df7981 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2538,6 +2538,7 @@ void virBlkioDeviceArrayClear(virBlkioDevice *deviceWeights, struct _virDomainResourceDef { char *partition; + char *appid; }; struct _virDomainHugePage { diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index a9e4519b1a..d656716ea9 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -55,6 +55,37 @@ virDomainDefBootValidate(const virDomainDef *def) } +#define APPID_LEN_MIN 1 +#define APPID_LEN_MAX 128 + +static int +virDomainDefResourceValidate(const virDomainDef *def) +{ + if (!def->resource) + return 0; + + if (def->resource->appid) { + int len; + + if (!virStringIsPrintable(def->resource->appid)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Fibre Channel 'appid' is not printable string")); + return -1; + } + + len = strlen(def->resource->appid); + if (len < APPID_LEN_MIN || len > APPID_LEN_MAX) { + virReportError(VIR_ERR_INVALID_ARG, + _("Fibre Channel 'appid' string length must be between [%d, %d]"), + APPID_LEN_MIN, APPID_LEN_MAX); + return -1; + } + } + + return 0; +} + + static int virDomainDefVideoValidate(const virDomainDef *def) { @@ -1538,6 +1569,9 @@ static int virDomainDefValidateInternal(const virDomainDef *def, virDomainXMLOption *xmlopt) { + if (virDomainDefResourceValidate(def) < 0) + return -1; + if (virDomainDefDuplicateDiskInfoValidate(def) < 0) return -1; diff --git a/tests/genericxml2xmlindata/fibrechannel-appid.xml b/tests/genericxml2xmlindata/fibrechannel-appid.xml new file mode 100644 index 0000000000..ad7df4d4ac --- /dev/null +++ b/tests/genericxml2xmlindata/fibrechannel-appid.xml @@ -0,0 +1,21 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>4</vcpu> + <resource> + <fibrechannel appid='someapp:c7a5fdbd-edaf-9455-926a-d65c16db1809'/> + </resource> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + </devices> +</domain> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index 54622ea831..39fd6d403a 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -225,6 +225,8 @@ mymain(void) DO_TEST_DIFFERENT("cputune"); DO_TEST("device-backenddomain"); + DO_TEST("fibrechannel-appid"); + #define DO_TEST_BACKUP_FULL(name, intrnl) \ do { \ const struct testCompareBackupXMLData data = { .testname = name, \ -- 2.31.1

On Tue, Aug 17, 2021 at 12:38:09PM +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- docs/formatdomain.rst | 21 ++++++++++++ docs/schemas/domaincommon.rng | 14 ++++++++ src/conf/domain_conf.c | 9 ++++- src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 34 +++++++++++++++++++ .../fibrechannel-appid.xml | 21 ++++++++++++ tests/genericxml2xmltest.c | 2 ++ 7 files changed, 101 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/fibrechannel-appid.xml
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index a9e4519b1a..d656716ea9 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -55,6 +55,37 @@ virDomainDefBootValidate(const virDomainDef *def) }
+#define APPID_LEN_MIN 1 +#define APPID_LEN_MAX 128 + +static int +virDomainDefResourceValidate(const virDomainDef *def) +{ + if (!def->resource) + return 0; + + if (def->resource->appid) { + int len; + + if (!virStringIsPrintable(def->resource->appid)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Fibre Channel 'appid' is not printable string"));
"not a printable string" Reviewed-by: Martin Kletzander <mkletzan@redhat.com>

Based on kernel commit messages the interface is /sys/class/fc/fc_udev_device/appid_store where we need to write the following string "$INODE:$APPID". $INODE is the VM root cgroup inode in hexadecimal and $APPID is user provided string that will be attached to each FC frame for the VM within the cgroup identified by inode and has limit 128 bytes. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_cgroup.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index f2d99abcfa..2269a8655f 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -904,6 +904,33 @@ qemuSetupCpuCgroup(virDomainObj *vm) } +static int +qemuSetupCgroupAppid(virDomainObj *vm) +{ + qemuDomainObjPrivate *priv = vm->privateData; + int inode = virCgroupGetInode(priv->cgroup); + const char *path = "/sys/class/fc/fc_udev_device/appid_store"; + g_autofree char *appid = NULL; + virDomainResourceDef *resource = vm->def->resource; + + if (!resource || !resource->appid) + return 0; + + if (inode < 0) + return -1; + + appid = g_strdup_printf("%X:%s", inode, resource->appid); + + if (virFileWriteStr(path, appid, 0) < 0) { + virReportSystemError(errno, + _("Unable to write '%s' to '%s'"), appid, path); + return -1; + } + + return 0; +} + + static int qemuInitCgroup(virDomainObj *vm, size_t nnicindexes, @@ -1096,6 +1123,9 @@ qemuSetupCgroup(virDomainObj *vm, if (qemuSetupCpusetCgroup(vm) < 0) return -1; + if (qemuSetupCgroupAppid(vm) < 0) + return -1; + return 0; } -- 2.31.1

On Tue, Aug 17, 2021 at 12:38:10PM +0200, Pavel Hrdina wrote:
Based on kernel commit messages the interface is
/sys/class/fc/fc_udev_device/appid_store
where we need to write the following string "$INODE:$APPID".
$INODE is the VM root cgroup inode in hexadecimal and $APPID is user provided string that will be attached to each FC frame for the VM within the cgroup identified by inode and has limit 128 bytes.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_cgroup.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index f2d99abcfa..2269a8655f 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -904,6 +904,33 @@ qemuSetupCpuCgroup(virDomainObj *vm) }
+static int +qemuSetupCgroupAppid(virDomainObj *vm) +{ + qemuDomainObjPrivate *priv = vm->privateData; + int inode = virCgroupGetInode(priv->cgroup);
If this fails, ...
+ const char *path = "/sys/class/fc/fc_udev_device/appid_store"; + g_autofree char *appid = NULL; + virDomainResourceDef *resource = vm->def->resource; + + if (!resource || !resource->appid) + return 0;
..., but you return 0 here, then you still have an error in the logs. If you initialize it to -1 and only get the inode ID here, then Reviewed-by: Martin Kletzander <mkletzan@redhat.com>

On Tue, Aug 17, 2021 at 01:21:55PM +0200, Martin Kletzander wrote:
On Tue, Aug 17, 2021 at 12:38:10PM +0200, Pavel Hrdina wrote:
Based on kernel commit messages the interface is
/sys/class/fc/fc_udev_device/appid_store
where we need to write the following string "$INODE:$APPID".
$INODE is the VM root cgroup inode in hexadecimal and $APPID is user provided string that will be attached to each FC frame for the VM within the cgroup identified by inode and has limit 128 bytes.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_cgroup.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index f2d99abcfa..2269a8655f 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -904,6 +904,33 @@ qemuSetupCpuCgroup(virDomainObj *vm) }
+static int +qemuSetupCgroupAppid(virDomainObj *vm) +{ + qemuDomainObjPrivate *priv = vm->privateData; + int inode = virCgroupGetInode(priv->cgroup);
If this fails, ...
+ const char *path = "/sys/class/fc/fc_udev_device/appid_store"; + g_autofree char *appid = NULL; + virDomainResourceDef *resource = vm->def->resource; + + if (!resource || !resource->appid) + return 0;
..., but you return 0 here, then you still have an error in the logs.
If you initialize it to -1 and only get the inode ID here, then
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Right, I'll fix that and also the error message in patch 4 and push it. Thanks for review! Pavel
participants (2)
-
Martin Kletzander
-
Pavel Hrdina