[libvirt] [PATCH v5 00/11] Basic implementation of persistent reservations

v5 of: https://www.redhat.com/archives/libvir-list/2018-April/msg00736.html diff to v4: - John's review worked in. Partially. Michal Privoznik (11): virstoragefile: Introduce virStoragePRDef qemuDomainDiskChangeSupported: Deny changing reservations qemu: Introduce pr-manager-helper capability qemu: Generate pr cmd line at startup qemu_ns: Allow /dev/mapper/control for PR qemu_cgroup: Allow /dev/mapper/control for PR qemu: Introduce pr_helper to qemu.conf qemu: Start PR daemon on domain startup qemu_hotplug: Hotplug of reservations qemu_hotplug: Hotunplug of reservations qemu: Detect pr-manager-helper capability docs/formatdomain.html.in | 23 +- docs/schemas/domaincommon.rng | 34 +-- docs/schemas/storagecommon.rng | 50 +++++ m4/virt-driver-qemu.m4 | 5 + src/conf/domain_conf.c | 38 ++++ src/libvirt_private.syms | 6 + src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 + src/qemu/qemu_alias.c | 18 ++ src/qemu/qemu_alias.h | 4 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_cgroup.c | 33 ++- src/qemu/qemu_command.c | 134 ++++++++++++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_conf.c | 7 +- src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 66 ++++++ src/qemu/qemu_domain.h | 5 + src/qemu/qemu_hotplug.c | 149 ++++++++++++- src/qemu/qemu_process.c | 232 +++++++++++++++++++++ src/qemu/qemu_process.h | 6 + src/qemu/test_libvirtd_qemu.aug.in | 1 + src/util/virdevmapper.c | 8 +- src/util/virstoragefile.c | 164 +++++++++++++++ src/util/virstoragefile.h | 18 ++ tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + .../disk-virtio-scsi-reservations.args | 38 ++++ .../disk-virtio-scsi-reservations.xml | 49 +++++ tests/qemuxml2argvtest.c | 4 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 2 + 36 files changed, 1075 insertions(+), 39 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml -- 2.16.1

This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this: <reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations> If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here: https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 23 +++- docs/schemas/domaincommon.rng | 34 +----- docs/schemas/storagecommon.rng | 50 ++++++++ src/conf/domain_conf.c | 38 ++++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 131 +++++++++++++++++++++ src/util/virstoragefile.h | 14 +++ .../disk-virtio-scsi-reservations.xml | 49 ++++++++ .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 2 + 10 files changed, 314 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index ada0df227f..6ed2fd349d 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2565,7 +2565,10 @@ </disk> <disk type='block' device='lun'> <driver name='qemu' type='raw'/> - <source dev='/dev/sda'/> + <source dev='/dev/sda'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper' mode='client'/> + </reservations> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='3' unit='0'/> </disk> @@ -2928,6 +2931,24 @@ <a href="formatstorageencryption.html">Storage Encryption</a> page for more information. </dd> + <dt><code>reservations</code></dt> + <dd><span class="since">Since libvirt 4.3.0</span>, the + <code>reservations</code> can be a sub-element of the + <code>source</code> element for storage sources (QEMU driver only). + If present (and enabled) it enables persistent reservations for SCSI + based disks. The element has one mandatory attribute + <code>enabled</code> with accepted values <code>yes</code> and + <code>no</code>. If the feature is enabled, then there's another + mandatory attribute <code>managed</code> (accepted values are the + same as for <code>enabled</code>) that enables or disables libvirt + spawning a helper process. When the PR is unmanaged, then hypervisor + acts as a client and path to server socket must be provided in child + element <code>source</code>, which currently accepts only the + following attributes: <code>type</code> with one value + <code>unix</code>, <code>path</code> with path the socket, and + finally <code>mode</code> which accepts either <code>server</code> + or <code>client</code> and specifies the role of hypervisor. + </dd> </dl> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 4cab55f05d..93084887fb 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1530,6 +1530,9 @@ <optional> <ref name="encryption"/> </optional> + <optional> + <ref name="reservations"/> + </optional> <zeroOrMore> <ref name='devSeclabel'/> </zeroOrMore> @@ -2434,18 +2437,6 @@ </attribute> </optional> </define> - <define name="reconnect"> - <element name="reconnect"> - <attribute name="enabled"> - <ref name="virYesNo"/> - </attribute> - <optional> - <attribute name="timeout"> - <ref name="unsignedInt"/> - </attribute> - </optional> - </element> - </define> <!-- An interface description can either be of type bridge in which case @@ -2494,24 +2485,7 @@ <value>vhostuser</value> </attribute> <interleave> - <element name="source"> - <attribute name="type"> - <value>unix</value> - </attribute> - <attribute name="path"> - <ref name="absFilePath"/> - </attribute> - <attribute name="mode"> - <choice> - <value>server</value> - <value>client</value> - </choice> - </attribute> - <optional> - <ref name="reconnect"/> - </optional> - <empty/> - </element> + <ref name="unixSocketSource"/> <ref name="interface-options"/> </interleave> </group> diff --git a/docs/schemas/storagecommon.rng b/docs/schemas/storagecommon.rng index edee1b0845..eed0b33347 100644 --- a/docs/schemas/storagecommon.rng +++ b/docs/schemas/storagecommon.rng @@ -39,6 +39,56 @@ </element> </define> + <define name="reconnect"> + <element name="reconnect"> + <attribute name="enabled"> + <ref name="virYesNo"/> + </attribute> + <optional> + <attribute name="timeout"> + <ref name="unsignedInt"/> + </attribute> + </optional> + </element> + </define> + + <define name='unixSocketSource'> + <element name="source"> + <attribute name="type"> + <value>unix</value> + </attribute> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + <attribute name="mode"> + <choice> + <value>server</value> + <value>client</value> + </choice> + </attribute> + <optional> + <ref name="reconnect"/> + </optional> + <empty/> + </element> + </define> + + <define name='reservations'> + <element name='reservations'> + <attribute name='enabled'> + <ref name='virYesNo'/> + </attribute> + <optional> + <attribute name='managed'> + <ref name='virYesNo'/> + </attribute> + </optional> + <optional> + <ref name='unixSocketSource'/> + </optional> + </element> + </define> + <define name='secret'> <element name='secret'> <attribute name='type'> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 65b429460a..c454fb1b9b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5205,6 +5205,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } } + if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun devices")); + return -1; + } + /* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8608,6 +8615,31 @@ virDomainDiskSourcePrivateDataParse(xmlNodePtr node, } +static int +virDomainDiskSourcePRParse(xmlNodePtr node, + xmlXPathContextPtr ctxt, + virStoragePRDefPtr *pr) +{ + xmlNodePtr saveNode = ctxt->node; + int ret = -1; + + ctxt->node = node; + + if (!(ctxt->node = virXPathNode("./reservations", ctxt))) { + ret = 0; + goto cleanup; + } + + if (!(*pr = virStoragePRDefParseXML(ctxt))) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = saveNode; + return ret; +} + + static int virDomainStorageSourceParse(xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -8654,6 +8686,9 @@ virDomainStorageSourceParse(xmlNodePtr node, !(src->encryption = virStorageEncryptionParseNode(tmp, ctxt))) goto cleanup; + if (virDomainDiskSourcePRParse(node, ctxt, &src->pr) < 0) + goto cleanup; + if (virSecurityDeviceLabelDefParseXML(&src->seclabels, &src->nseclabels, ctxt, flags) < 0) goto cleanup; @@ -22928,6 +22963,9 @@ virDomainStorageSourceFormat(virBufferPtr attrBuf, virStorageEncryptionFormat(childBuf, src->encryption) < 0) return -1; + if (src->pr) + virStoragePRDefFormat(childBuf, src->pr); + return 0; } diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d2728749fb..f81cb5df93 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2797,6 +2797,9 @@ virStorageNetHostDefFree; virStorageNetHostTransportTypeFromString; virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; +virStoragePRDefFormat; +virStoragePRDefFree; +virStoragePRDefParseXML; virStorageSourceBackingStoreClear; virStorageSourceClear; virStorageSourceCopy; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 531540ac91..42ef24e9d5 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1892,6 +1892,136 @@ virStorageAuthDefFormat(virBufferPtr buf, } +void +virStoragePRDefFree(virStoragePRDefPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + +virStoragePRDefPtr +virStoragePRDefParseXML(xmlXPathContextPtr ctxt) +{ + virStoragePRDefPtr prd, ret = NULL; + char *enabled = NULL; + char *managed = NULL; + char *type = NULL; + char *path = NULL; + char *mode = NULL; + + if (VIR_ALLOC(prd) < 0) + return NULL; + + if (!(enabled = virXPathString("string(./@enabled)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @enabled attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->enabled = virTristateBoolTypeFromString(enabled)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'enabled': %s"), enabled); + goto cleanup; + } + + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + if (!(managed = virXPathString("string(./@managed)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @managed attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->managed = virTristateBoolTypeFromString(managed)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'managed': %s"), managed); + goto cleanup; + } + + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + type = virXPathString("string(./source[1]/@type)", ctxt); + path = virXPathString("string(./source[1]/@path)", ctxt); + mode = virXPathString("string(./source[1]/@mode)", ctxt); + + if (!type) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection type for <reservations/>")); + goto cleanup; + } + + if (!path) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing path for <reservations/>")); + goto cleanup; + } + + if (!mode) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection mode for <reservations/>")); + goto cleanup; + } + + if (STRNEQ(type, "unix")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported connection type for <reservations/>: %s"), + type); + goto cleanup; + } + + if (STRNEQ(mode, "client")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported connection mode for <reservations/>: %s"), + mode); + goto cleanup; + } + + VIR_STEAL_PTR(prd->path, path); + } + } + + ret = prd; + prd = NULL; + + cleanup: + VIR_FREE(mode); + VIR_FREE(path); + VIR_FREE(type); + VIR_FREE(managed); + VIR_FREE(enabled); + virStoragePRDefFree(prd); + return ret; +} + + +void +virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd) +{ + virBufferAsprintf(buf, "<reservations enabled='%s'", + virTristateBoolTypeToString(prd->enabled)); + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + virBufferAsprintf(buf, " managed='%s'", + virTristateBoolTypeToString(prd->managed)); + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferAddLit(buf, "<source type='unix'"); + virBufferEscapeString(buf, " path='%s'", prd->path); + virBufferAddLit(buf, " mode='client'/>\n"); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</reservations>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } else { + virBufferAddLit(buf, "/>\n"); + } +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) @@ -2264,6 +2394,7 @@ virStorageSourceClear(virStorageSourcePtr def) virBitmapFree(def->features); VIR_FREE(def->compat); virStorageEncryptionFree(def->encryption); + virStoragePRDefFree(def->pr); virStorageSourceSeclabelsClear(def); virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index d129e81978..69e7e20e17 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -216,6 +216,14 @@ struct _virStorageAuthDef { virSecretLookupTypeDef seclookupdef; }; +typedef struct _virStoragePRDef virStoragePRDef; +typedef virStoragePRDef *virStoragePRDefPtr; +struct _virStoragePRDef { + int enabled; /* enum virTristateBool */ + int managed; /* enum virTristateBool */ + char *path; +}; + typedef struct _virStorageDriverData virStorageDriverData; typedef virStorageDriverData *virStorageDriverDataPtr; @@ -243,6 +251,7 @@ struct _virStorageSource { bool authInherited; virStorageEncryptionPtr encryption; bool encryptionInherited; + virStoragePRDefPtr pr; virObjectPtr privateData; @@ -369,6 +378,11 @@ virStorageAuthDefPtr virStorageAuthDefParse(xmlNodePtr node, xmlXPathContextPtr ctxt); void virStorageAuthDefFormat(virBufferPtr buf, virStorageAuthDefPtr authdef); +void virStoragePRDefFree(virStoragePRDefPtr prd); +virStoragePRDefPtr virStoragePRDefParseXML(xmlXPathContextPtr ctxt); +void virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd); + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml new file mode 100644 index 0000000000..036c6e3c25 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1,49 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='yes'/> + </source> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> + </reservations> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='1'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml new file mode 120000 index 0000000000..dde52fd1db --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 9e77b9fb13..871f01edd7 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -387,6 +387,8 @@ mymain(void) QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-num_queues", QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-cmd_per_lun", QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-max_sectors", -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this:
<reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations>
If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here:
https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Even though this is present there's still an adjustment or two that needs to be made to formatdomain.html.in
--- docs/formatdomain.html.in | 23 +++- docs/schemas/domaincommon.rng | 34 +----- docs/schemas/storagecommon.rng | 50 ++++++++ src/conf/domain_conf.c | 38 ++++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 131 +++++++++++++++++++++ src/util/virstoragefile.h | 14 +++ .../disk-virtio-scsi-reservations.xml | 49 ++++++++ .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 2 + 10 files changed, 314 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index ada0df227f..6ed2fd349d 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2565,7 +2565,10 @@ </disk> <disk type='block' device='lun'> <driver name='qemu' type='raw'/> - <source dev='/dev/sda'/> + <source dev='/dev/sda'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper' mode='client'/> + </reservations> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='3' unit='0'/> </disk> @@ -2928,6 +2931,24 @@ <a href="formatstorageencryption.html">Storage Encryption</a> page for more information. </dd> + <dt><code>reservations</code></dt> + <dd><span class="since">Since libvirt 4.3.0</span>, the
4.4.0
+ <code>reservations</code> can be a sub-element of the> + <code>source</code> element for storage sources (QEM driver only).
QEMU
+ If present (and enabled) it enables persistent reservations for SCSI + based disks. The element has one mandatory attribute + <code>enabled</code> with accepted values <code>yes</code> and + <code>no</code>. If the feature is enabled, then there's another + mandatory attribute <code>managed</code> (accepted values are the + same as for <code>enabled</code>) that enables or disables libvirt + spawning a helper process. When the PR is unmanaged, then hypervisor + acts as a client and path to server socket must be provided in child + element <code>source</code>, which currently accepts only the + following attributes: <code>type</code> with one value + <code>unix</code>, <code>path</code> with path the socket, and + finally <code>mode</code> which accepts either <code>server</code> + or <code>client</code> and specifies the role of hypervisor.
I know I've pointed this out before - mode only accepts client. Yes the grammar supports both, but the code supports only one and that's all we should list.
+ </dd>
Can we squeeze in a (somewhat facetiously :-)): It's recommended to allow libvirt manage the persistent reservations. Or Please don't use unmanaged mode, it's going to be really painful for all involved ;-) John
</dl>
<p>

Couple of reasons for that: a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 19 +++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 30 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index f81cb5df93..27c5bb5bce 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2799,6 +2799,7 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEqual; virStoragePRDefParseXML; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 326c939c85..0dc5a88e79 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7836,6 +7836,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true); + if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations"); + return false; + } + #undef CHECK_EQ return true; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 42ef24e9d5..d0050eca1f 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2022,6 +2022,25 @@ virStoragePRDefFormat(virBufferPtr buf, } +bool +virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b) +{ + if (!a && !b) + return true; + + if (!a || !b) + return false; + + if (a->enabled != b->enabled || + a->managed != b->managed || + STRNEQ_NULLABLE(a->path, b->path)) + return false; + + return true; +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 69e7e20e17..09e9ae73f1 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -382,6 +382,8 @@ void virStoragePRDefFree(virStoragePRDefPtr prd); virStoragePRDefPtr virStoragePRDefParseXML(xmlXPathContextPtr ctxt); void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); +bool virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, -- 2.16.1

The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 833c75514c..d67b859d75 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -473,6 +473,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, /* 290 */ "query-cpus-fast", "disk-write-cache", + "pr-manager-helper", ); diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index f08cfc2611..bb74ad05d4 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -457,6 +457,7 @@ typedef enum { /* 290 */ QEMU_CAPS_QUERY_CPUS_FAST, /* query-cpus-fast command */ QEMU_CAPS_DISK_WRITE_CACHE, /* qemu block frontends support write-cache param */ + QEMU_CAPS_PR_MANAGER_HELPER, /* -object pr-manager-helper */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; -- 2.16.1

For command line we need two things: 1) -object pr-manager-helper,id=$alias,path=$socketPath 2) -drive file.pr-manager=$alias In -object pr-manager-helper we tell qemu which socket to connect to, then in -drive file-pr-manager we just reference the object the drive in question should use. For managed PR helper the alias is always "pr-helper0" and socket path "${vm->priv->libDir}/pr-helper0.sock". It was decided in reviews to previous versions of this patch that it should not allocate memory in order to simplify code. This promise is not fulfilled though. For instance, qemuBuildPRManagerInfoProps() is an offender. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 18 +++ src/qemu/qemu_alias.h | 4 + src/qemu/qemu_command.c | 134 +++++++++++++++++++++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 22 ++++ src/qemu/qemu_domain.h | 3 + src/qemu/qemu_process.h | 1 + src/util/virstoragefile.c | 14 +++ src/util/virstoragefile.h | 2 + .../disk-virtio-scsi-reservations.args | 38 ++++++ tests/qemuxml2argvtest.c | 4 + 12 files changed, 247 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 27c5bb5bce..108b44f6f4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2799,7 +2799,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseXML; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index 95d1e0370a..9a3a92c82d 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -773,3 +773,21 @@ qemuAliasChardevFromDevAlias(const char *devAlias) return ret; } + + +const char * +qemuDomainGetManagedPRAlias(void) +{ + return "pr-helper0"; +} + + +char * +qemuDomainGetUnmanagedPRAlias(const virDomainDiskDef *disk) +{ + char *ret; + + ignore_value(virAsprintf(&ret, "pr-helper-%s", disk->info.alias)); + + return ret; +} diff --git a/src/qemu/qemu_alias.h b/src/qemu/qemu_alias.h index 8c744138ce..8e391c3f0c 100644 --- a/src/qemu/qemu_alias.h +++ b/src/qemu/qemu_alias.h @@ -92,4 +92,8 @@ char *qemuAliasTLSObjFromSrcAlias(const char *srcAlias) char *qemuAliasChardevFromDevAlias(const char *devAlias) ATTRIBUTE_NONNULL(1); +const char * qemuDomainGetManagedPRAlias(void); + +char *qemuDomainGetUnmanagedPRAlias(const virDomainDiskDef *disk); + #endif /* __QEMU_ALIAS_H__*/ diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b666f3715f..8b1b5934b1 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1462,6 +1462,28 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) } +static int +qemuBuildDriveSourcePR(virBufferPtr buf, + virDomainDiskDefPtr disk) +{ + char *alias = NULL; + const char *defaultAlias = NULL; + + if (!virStoragePRDefIsEnabled(disk->src->pr)) + return 0; + + if (virStoragePRDefIsManaged(disk->src->pr)) + defaultAlias = qemuDomainGetManagedPRAlias(); + else if (!(alias = qemuDomainGetUnmanagedPRAlias(disk))) + return -1; + + + virBufferAsprintf(buf, ",file.pr-manager=%s", alias ? alias : defaultAlias); + VIR_FREE(alias); + return 0; +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1539,6 +1561,9 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + if (qemuBuildDriveSourcePR(buf, disk) < 0) + goto cleanup; } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9665,6 +9690,112 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, } +/** + * qemuBuildPRManagerInfoProps: + * @prd: disk PR runtime info + * @propsret: JSON properties to return + * + * Build the JSON properties for the pr-manager object. + * + * Returns: 0 on success (@propsret is NULL if no properties are needed), + * -1 on failure (with error message set). + */ +int +qemuBuildPRManagerInfoProps(virDomainObjPtr vm, + const virDomainDiskDef *disk, + virJSONValuePtr *propsret, + char **aliasret) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + char *socketPath = NULL; + char *alias = NULL; + int ret = -1; + + *propsret = NULL; + *aliasret = NULL; + + if (!virStoragePRDefIsEnabled(disk->src->pr)) + return 0; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations not supported with this QEMU binary")); + return ret; + } + + if (!(socketPath = qemuDomainGetPRSocketPath(vm, disk->src->pr))) + return ret; + + if (virStoragePRDefIsManaged(disk->src->pr)) { + if (VIR_STRDUP(alias, qemuDomainGetManagedPRAlias()) < 0) + goto cleanup; + } else { + if (!(alias = qemuDomainGetUnmanagedPRAlias(disk))) + goto cleanup; + } + + if (virJSONValueObjectCreate(propsret, + "s:path", socketPath, + NULL) < 0) + goto cleanup; + + VIR_STEAL_PTR(*aliasret, alias); + ret = 0; + cleanup: + VIR_FREE(alias); + VIR_FREE(socketPath); + return ret; +} + + +static int +qemuBuildMasterPRCommandLine(virDomainObjPtr vm, + virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + bool managedAdded = false; + virJSONValuePtr props = NULL; + char *alias = NULL; + char *tmp = NULL; + int ret = -1; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + + if (virStoragePRDefIsManaged(disk->src->pr)) { + if (managedAdded) + continue; + + managedAdded = true; + } + + if (qemuBuildPRManagerInfoProps(vm, disk, &props, &alias) < 0) + goto cleanup; + + if (!props) + continue; + + if (!(tmp = virQEMUBuildObjectCommandlineFromJSON("pr-manager-helper", + alias, + props))) + goto cleanup; + VIR_FREE(alias); + virJSONValueFree(props); + props = NULL; + + virCommandAddArgList(cmd, "-object", tmp, NULL); + VIR_FREE(tmp); + } + + ret = 0; + cleanup: + VIR_FREE(alias); + virJSONValueFree(props); + return ret; +} + + /** * qemuBuildCommandLineValidate: * @@ -9833,6 +9964,9 @@ qemuBuildCommandLine(virQEMUDriverPtr driver, if (qemuBuildMasterKeyCommandLine(cmd, priv) < 0) goto error; + if (qemuBuildMasterPRCommandLine(vm, cmd, def) < 0) + goto error; + if (enableFips) virCommandAddArg(cmd, "-enable-fips"); diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 31c9da673c..da1fe679fe 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -54,6 +54,11 @@ virCommandPtr qemuBuildCommandLine(virQEMUDriverPtr driver, size_t *nnicindexes, int **nicindexes); +/* Generate the object properties for pr-manager */ +int qemuBuildPRManagerInfoProps(virDomainObjPtr vm, + const virDomainDiskDef *disk, + virJSONValuePtr *propsret, + char **alias); /* Generate the object properties for a secret */ int qemuBuildSecretInfoProps(qemuDomainSecretInfoPtr secinfo, diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0dc5a88e79..53827d5396 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11983,3 +11983,25 @@ qemuProcessEventFree(struct qemuProcessEvent *event) } VIR_FREE(event); } + + +char * +qemuDomainGetPRSocketPath(virDomainObjPtr vm, + virStoragePRDefPtr pr) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + const char *defaultAlias = NULL; + char *ret = NULL; + + if (!virStoragePRDefIsEnabled(pr)) + return NULL; + + if (virStoragePRDefIsManaged(pr)) { + defaultAlias = qemuDomainGetManagedPRAlias(); + ignore_value(virAsprintf(&ret, "%s/%s.sock", priv->libDir, defaultAlias)); + } else { + ignore_value(VIR_STRDUP(ret, pr->path)); + } + + return ret; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 2c27dfb9f3..612d234113 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -1000,4 +1000,7 @@ qemuDomainDiskCachemodeFlags(int cachemode, bool *direct, bool *noflush); +char * qemuDomainGetPRSocketPath(virDomainObjPtr vm, + virStoragePRDefPtr pr); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 9dd5c97642..3bf66f71b7 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -24,6 +24,7 @@ # include "qemu_conf.h" # include "qemu_domain.h" +# include "virstoragefile.h" int qemuProcessPrepareMonitorChr(virDomainChrSourceDefPtr monConfig, const char *domainDir); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index d0050eca1f..e22dd727fa 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2041,6 +2041,20 @@ virStoragePRDefIsEqual(virStoragePRDefPtr a, } +bool +virStoragePRDefIsEnabled(virStoragePRDefPtr prd) +{ + return prd && prd->enabled == VIR_TRISTATE_BOOL_YES; +} + + +bool +virStoragePRDefIsManaged(virStoragePRDefPtr prd) +{ + return prd && prd->managed == VIR_TRISTATE_BOOL_YES; +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 09e9ae73f1..059214a517 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -384,6 +384,8 @@ void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); bool virStoragePRDefIsEqual(virStoragePRDefPtr a, virStoragePRDefPtr b); +bool virStoragePRDefIsEnabled(virStoragePRDefPtr prd); +bool virStoragePRDefIsManaged(virStoragePRDefPtr prd); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args new file mode 100644 index 0000000000..dce3fc4105 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args @@ -0,0 +1,38 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper0,\ +path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock \ +-object pr-manager-helper,id=pr-helper-scsi0-0-0-1,\ +path=/path/to/qemu-pr-helper.sock \ +-machine pc,accel=tcg,usb=off,dump-guest-core=off \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper0,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-drive file=/dev/HostVG/QEMUGuest2,file.pr-manager=pr-helper-scsi0-0-0-1,\ +format=raw,if=none,id=drive-scsi0-0-0-1 \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=1,\ +drive=drive-scsi0-0-0-1,id=scsi0-0-0-1 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 74d930ebe2..ad17a5c333 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -2814,6 +2814,10 @@ mymain(void) QEMU_CAPS_PIIX_DISABLE_S3, QEMU_CAPS_PIIX_DISABLE_S4, QEMU_CAPS_ICH9_USB_EHCI1); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + /* Test disks with format probing enabled for legacy reasons. * New tests should not go in this section. */ driver.config->allowDiskFormatProbing = true; -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
For command line we need two things:
1) -object pr-manager-helper,id=$alias,path=$socketPath 2) -drive file.pr-manager=$alias
In -object pr-manager-helper we tell qemu which socket to connect to, then in -drive file-pr-manager we just reference the object the drive in question should use.
For managed PR helper the alias is always "pr-helper0" and socket path "${vm->priv->libDir}/pr-helper0.sock".
It was decided in reviews to previous versions of this patch that it should not allocate memory in order to simplify code. This promise is not fulfilled though. For instance, qemuBuildPRManagerInfoProps() is an offender.
Last paragraph is irrelevant and partially incorrect trivia for above the ---. I would assume everyone has received review comments that they may not agree with, but that's what they are review comments. Voicing frustration in a commit message is probably not a good thing. Good thing I work at home with usually no one listening 0-).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 18 +++ src/qemu/qemu_alias.h | 4 + src/qemu/qemu_command.c | 134 +++++++++++++++++++++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 22 ++++ src/qemu/qemu_domain.h | 3 + src/qemu/qemu_process.h | 1 + src/util/virstoragefile.c | 14 +++ src/util/virstoragefile.h | 2 + .../disk-virtio-scsi-reservations.args | 38 ++++++ tests/qemuxml2argvtest.c | 4 + 12 files changed, 247 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 27c5bb5bce..108b44f6f4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2799,7 +2799,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseXML; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index 95d1e0370a..9a3a92c82d 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -773,3 +773,21 @@ qemuAliasChardevFromDevAlias(const char *devAlias)
return ret; } + + +const char * +qemuDomainGetManagedPRAlias(void) +{ + return "pr-helper0";
This works, IDC but you could have gone with # define QEMU_DOMAIN_MANAGED_PR_ALIAS "pr-helper0" in src/qemu/qemu_alias.h too and used it that way. Ironically later on, we have: #define QEMU_PR_HELPER "/usr/bin/qemu-pr-helper" and use VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0) So why not use the same construct for this alias?
+} + + +char * +qemuDomainGetUnmanagedPRAlias(const virDomainDiskDef *disk) +{ + char *ret; + + ignore_value(virAsprintf(&ret, "pr-helper-%s", disk->info.alias)); + + return ret; +}
[...]
diff --git a/src/qemu/qemu_alias.h b/src/qemu/qemu_alias.h index 8c744138ce..8e391c3f0c 100644 --- a/src/qemu/qemu_alias.h +++ b/src/qemu/qemu_alias.h @@ -92,4 +92,8 @@ char *qemuAliasTLSObjFromSrcAlias(const char *srcAlias) char *qemuAliasChardevFromDevAlias(const char *devAlias) ATTRIBUTE_NONNULL(1);
+const char * qemuDomainGetManagedPRAlias(void);
s/* q/*q/ or as I suggest: # define QEMU_DOMAIN_MANAGED_PR_ALIAS "pr-helper0"
+ +char *qemuDomainGetUnmanagedPRAlias(const virDomainDiskDef *disk); + #endif /* __QEMU_ALIAS_H__*/
[...] Since the decision was made in other reviews to wait until the last patch to turn on the capability in the caps*.xml files, the current .args usage and test config using the DO_TEST macro will have to suffice; however, when we get to that last patch this will need to change in order to follow what is being requested of other upstream posts now too. With no changes to alias, a reluctant: Reviewed-by: John Ferlan <jferlan@redhat.com> a less reluctant one mean change to use the # define construct - I've even attached a diff that worked for me. John

On 04/29/2018 02:16 PM, John Ferlan wrote:
On 04/23/2018 09:14 AM, Michal Privoznik wrote:
For command line we need two things:
1) -object pr-manager-helper,id=$alias,path=$socketPath 2) -drive file.pr-manager=$alias
In -object pr-manager-helper we tell qemu which socket to connect to, then in -drive file-pr-manager we just reference the object the drive in question should use.
For managed PR helper the alias is always "pr-helper0" and socket path "${vm->priv->libDir}/pr-helper0.sock".
It was decided in reviews to previous versions of this patch that it should not allocate memory in order to simplify code. This promise is not fulfilled though. For instance, qemuBuildPRManagerInfoProps() is an offender.
Last paragraph is irrelevant and partially incorrect trivia for above the ---. I would assume everyone has received review comments that they may not agree with, but that's what they are review comments. Voicing frustration in a commit message is probably not a good thing. Good thing I work at home with usually no one listening 0-).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 18 +++ src/qemu/qemu_alias.h | 4 + src/qemu/qemu_command.c | 134 +++++++++++++++++++++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 22 ++++ src/qemu/qemu_domain.h | 3 + src/qemu/qemu_process.h | 1 + src/util/virstoragefile.c | 14 +++ src/util/virstoragefile.h | 2 + .../disk-virtio-scsi-reservations.args | 38 ++++++ tests/qemuxml2argvtest.c | 4 + 12 files changed, 247 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 27c5bb5bce..108b44f6f4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2799,7 +2799,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseXML; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index 95d1e0370a..9a3a92c82d 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -773,3 +773,21 @@ qemuAliasChardevFromDevAlias(const char *devAlias)
return ret; } + + +const char * +qemuDomainGetManagedPRAlias(void) +{ + return "pr-helper0";
This works, IDC but you could have gone with
# define QEMU_DOMAIN_MANAGED_PR_ALIAS "pr-helper0"
in src/qemu/qemu_alias.h too and used it that way.
I rather not. Everything else uses a function to get/set aliases. In order to stay consistent I'd rather have this as a function.
Ironically later on, we have:
#define QEMU_PR_HELPER "/usr/bin/qemu-pr-helper"
Do we? Not in the code, but in config.h produced by configure. Well, that's standardized way of defining values in configure. I'm not sure that logic applies to the rest of our code.
and use
VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0)
So why not use the same construct for this alias?
Because QEMU_PR_HELPER is detected at compile time (for the default, can be overridden in qemu.conf later). There is no other way to do it. We can't have configure generating a function like this: qemuGetPrHelperPath() { return "/usr/bin/qemu-pr-helper"; } But we can have our own code do that, esp. if other functions from the family (qemu alias handling functions that is) are functions indeed instead of some #define-s.
+} + + +char * +qemuDomainGetUnmanagedPRAlias(const virDomainDiskDef *disk) +{ + char *ret; + + ignore_value(virAsprintf(&ret, "pr-helper-%s", disk->info.alias)); + + return ret; +}
[...]
diff --git a/src/qemu/qemu_alias.h b/src/qemu/qemu_alias.h index 8c744138ce..8e391c3f0c 100644 --- a/src/qemu/qemu_alias.h +++ b/src/qemu/qemu_alias.h @@ -92,4 +92,8 @@ char *qemuAliasTLSObjFromSrcAlias(const char *srcAlias) char *qemuAliasChardevFromDevAlias(const char *devAlias) ATTRIBUTE_NONNULL(1);
+const char * qemuDomainGetManagedPRAlias(void);
s/* q/*q/
Oh, right. Michal

If qemu-pr-helper is compiled with multipath support the first thing it does is open /dev/mapper/control. Since we're going to be running it inside qemu namespace we need to create it there. Unfortunately, we don't know if it was compiled with or without multipath so we have to create it anyway. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 53827d5396..5ccdeb8e3a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -109,6 +109,7 @@ VIR_ENUM_IMPL(qemuDomainNamespace, QEMU_DOMAIN_NS_LAST, #define PROC_MOUNTS "/proc/mounts" #define DEVPREFIX "/dev/" #define DEV_VFIO "/dev/vfio/vfio" +#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control" struct _qemuDomainLogContext { @@ -10207,6 +10208,11 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg ATTRIBUTE_UNUSED, goto cleanup; } + /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(disk->src->pr) && + qemuDomainCreateDevice(DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) + goto cleanup; + ret = 0; cleanup: VIR_FREE(dst); @@ -11218,6 +11224,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStorageSourcePtr next; const char **paths = NULL; size_t npaths = 0; + char *dmPath = NULL; int ret = -1; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) @@ -11234,11 +11241,18 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, goto cleanup; } + /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(src->pr) && + (VIR_STRDUP(dmPath, DEVICE_MAPPER_CONTROL_PATH) < 0 || + VIR_APPEND_ELEMENT_COPY(paths, npaths, dmPath) < 0)) + goto cleanup; + if (qemuDomainNamespaceMknodPaths(vm, paths, npaths) < 0) goto cleanup; ret = 0; cleanup: + VIR_FREE(dmPath); VIR_FREE(paths); return ret; } -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
If qemu-pr-helper is compiled with multipath support the first thing it does is open /dev/mapper/control. Since we're going to be running it inside qemu namespace we need to create it there. Unfortunately, we don't know if it was compiled with or without multipath so we have to create it anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 53827d5396..5ccdeb8e3a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -109,6 +109,7 @@ VIR_ENUM_IMPL(qemuDomainNamespace, QEMU_DOMAIN_NS_LAST, #define PROC_MOUNTS "/proc/mounts" #define DEVPREFIX "/dev/" #define DEV_VFIO "/dev/vfio/vfio" +#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control"
struct _qemuDomainLogContext { @@ -10207,6 +10208,11 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg ATTRIBUTE_UNUSED, goto cleanup; }
+ /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(disk->src->pr) && + qemuDomainCreateDevice(DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) + goto cleanup; +
This is from SetupAllDisks via qemuDomainBuildNamespace... Does it or could it matter that more than one disk would be doing this? Thus having multiple entries? Was too lazy to check.
ret = 0; cleanup: VIR_FREE(dst); @@ -11218,6 +11224,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStorageSourcePtr next; const char **paths = NULL; size_t npaths = 0; + char *dmPath = NULL; int ret = -1;
if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) @@ -11234,11 +11241,18 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, goto cleanup; }
+ /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(src->pr) && + (VIR_STRDUP(dmPath, DEVICE_MAPPER_CONTROL_PATH) < 0 || + VIR_APPEND_ELEMENT_COPY(paths, npaths, dmPath) < 0)) + goto cleanup; +
... similarly when we add/hotplug a disk we could be repeating that path. If the duplication is OK or handled in some way that wasn't obvious, Reviewed-by: John Ferlan <jferlan@redhat.com> If you need to adjust the logic to handle this specially, then obviously I'd ask to wait for that OK. I wonder if the @prDaemonRunning logic added later on would suffice? John
if (qemuDomainNamespaceMknodPaths(vm, paths, npaths) < 0) goto cleanup;
ret = 0; cleanup: + VIR_FREE(dmPath); VIR_FREE(paths); return ret; }

On 04/29/2018 02:20 PM, John Ferlan wrote:
On 04/23/2018 09:14 AM, Michal Privoznik wrote:
If qemu-pr-helper is compiled with multipath support the first thing it does is open /dev/mapper/control. Since we're going to be running it inside qemu namespace we need to create it there. Unfortunately, we don't know if it was compiled with or without multipath so we have to create it anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 53827d5396..5ccdeb8e3a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -109,6 +109,7 @@ VIR_ENUM_IMPL(qemuDomainNamespace, QEMU_DOMAIN_NS_LAST, #define PROC_MOUNTS "/proc/mounts" #define DEVPREFIX "/dev/" #define DEV_VFIO "/dev/vfio/vfio" +#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control"
struct _qemuDomainLogContext { @@ -10207,6 +10208,11 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg ATTRIBUTE_UNUSED, goto cleanup; }
+ /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(disk->src->pr) && + qemuDomainCreateDevice(DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) + goto cleanup; +
This is from SetupAllDisks via qemuDomainBuildNamespace... Does it or could it matter that more than one disk would be doing this? Thus having multiple entries? Was too lazy to check.
It doesn't matter. The code works well with already existing files. It has to. Imagine two disks, both with the same backing file: diskA = /dev/diskA -> /dev/diskBacking diskB = /dev/diskB -> /dev/diskBacking Now when diskA is created in the namespace both diskA and diskBacking files are created. Then, when diskB is created we can't fail if diskBacking already exists. Therefore the code is written so that it accepts pre-existing files.
ret = 0; cleanup: VIR_FREE(dst); @@ -11218,6 +11224,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStorageSourcePtr next; const char **paths = NULL; size_t npaths = 0; + char *dmPath = NULL; int ret = -1;
if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) @@ -11234,11 +11241,18 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, goto cleanup; }
+ /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (virStoragePRDefIsEnabled(src->pr) && + (VIR_STRDUP(dmPath, DEVICE_MAPPER_CONTROL_PATH) < 0 || + VIR_APPEND_ELEMENT_COPY(paths, npaths, dmPath) < 0)) + goto cleanup; +
... similarly when we add/hotplug a disk we could be repeating that path.
If the duplication is OK or handled in some way that wasn't obvious,
Take a look at qemuDomainCreateDevice() and search for errno == EEXIST lines. They are always followed with ret = 0;
Reviewed-by: John Ferlan <jferlan@redhat.com>
Thanks, Michal

Just like in previous commit, qemu-pr-helper might want to open /dev/mapper/control under certain circumstances. Therefore we have to allow it in cgroups. The change virdevmapper.c might look spurious but it isn't. After 6dd84f6850ca437 any path that we're allowing in deivces CGroup is subject to virDevMapperGetTargets() inspection. And libdevmapper returns ENXIO for the path from subject. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_cgroup.c | 33 ++++++++++++++++++++++++++++++--- src/util/virdevmapper.c | 8 +++++++- 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index d88eb7881f..546a4c8e63 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -114,6 +114,8 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, } +#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control" + static int qemuSetupImageCgroupInternal(virDomainObjPtr vm, virStorageSourcePtr src, @@ -125,6 +127,10 @@ qemuSetupImageCgroupInternal(virDomainObjPtr vm, return 0; } + if (virStoragePRDefIsManaged(src->pr) && + qemuSetupImagePathCgroup(vm, DEVICE_MAPPER_CONTROL_PATH, false) < 0) + return -1; + return qemuSetupImagePathCgroup(vm, src->path, src->readonly || forceReadonly); } @@ -142,9 +148,8 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, virStorageSourcePtr src) { qemuDomainObjPrivatePtr priv = vm->privateData; - int perms = VIR_CGROUP_DEVICE_READ | - VIR_CGROUP_DEVICE_WRITE | - VIR_CGROUP_DEVICE_MKNOD; + int perms = VIR_CGROUP_DEVICE_RWM; + size_t i; int ret; if (!virCgroupHasController(priv->cgroup, @@ -157,6 +162,28 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, return 0; } + for (i = 0; i < vm->def->ndisks; i++) { + virStorageSourcePtr diskSrc = vm->def->disks[i]->src; + + if (src == diskSrc) + continue; + + if (virStoragePRDefIsManaged(diskSrc->pr)) + break; + } + + if (i == vm->def->ndisks) { + VIR_DEBUG("Disabling device mapper control"); + ret = virCgroupDenyDevicePath(priv->cgroup, + DEVICE_MAPPER_CONTROL_PATH, perms, true); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", + DEVICE_MAPPER_CONTROL_PATH, + virCgroupGetDevicePermsString(perms), ret); + if (ret < 0) + return ret; + } + + VIR_DEBUG("Deny path %s", src->path); ret = virCgroupDenyDevicePath(priv->cgroup, src->path, perms, true); diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index d2c25af003..ef4b1e480a 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -101,8 +101,14 @@ virDevMapperGetTargetsImpl(const char *path, dm_task_no_open_count(dmt); - if (!dm_task_run(dmt)) + if (!dm_task_run(dmt)) { + if (errno == ENXIO) { + /* In some cases devmapper realizes this late device + * is not managed by it. */ + ret = 0; + } goto cleanup; + } if (!dm_task_get_info(dmt, &info)) goto cleanup; -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
Just like in previous commit, qemu-pr-helper might want to open /dev/mapper/control under certain circumstances. Therefore we have to allow it in cgroups.
The change virdevmapper.c might look spurious but it isn't. After 6dd84f6850ca437 any path that we're allowing in deivces CGroup is
devices
subject to virDevMapperGetTargets() inspection. And libdevmapper returns ENXIO for the path from subject.
^^^ IMO: This explanation (minus the commit id reference) belongs where you check for ENXIO. As a reader of code I don't necessarily check commit messages for reasons a check exists.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_cgroup.c | 33 ++++++++++++++++++++++++++++++--- src/util/virdevmapper.c | 8 +++++++- 2 files changed, 37 insertions(+), 4 deletions(-)
I would say a similar echo here as in the NS patch - since the subsequent patch will have a way to know that PR is running/started, then why not use that knowledge or similar logic to helper determine whether we need to add NS/Cgroup and whether we need to remove the cgroup reference (if that even matters by that point in time).
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index d88eb7881f..546a4c8e63 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -114,6 +114,8 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, }
+#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control" + static int qemuSetupImageCgroupInternal(virDomainObjPtr vm, virStorageSourcePtr src, @@ -125,6 +127,10 @@ qemuSetupImageCgroupInternal(virDomainObjPtr vm, return 0; }
+ if (virStoragePRDefIsManaged(src->pr) && + qemuSetupImagePathCgroup(vm, DEVICE_MAPPER_CONTROL_PATH, false) < 0) + return -1; + return qemuSetupImagePathCgroup(vm, src->path, src->readonly || forceReadonly); }
@@ -142,9 +148,8 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, virStorageSourcePtr src) { qemuDomainObjPrivatePtr priv = vm->privateData; - int perms = VIR_CGROUP_DEVICE_READ | - VIR_CGROUP_DEVICE_WRITE | - VIR_CGROUP_DEVICE_MKNOD; + int perms = VIR_CGROUP_DEVICE_RWM; + size_t i; int ret;
if (!virCgroupHasController(priv->cgroup, @@ -157,6 +162,28 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, return 0; }
+ for (i = 0; i < vm->def->ndisks; i++) { + virStorageSourcePtr diskSrc = vm->def->disks[i]->src; + + if (src == diskSrc) + continue; + + if (virStoragePRDefIsManaged(diskSrc->pr)) + break; + } + + if (i == vm->def->ndisks) { + VIR_DEBUG("Disabling device mapper control"); + ret = virCgroupDenyDevicePath(priv->cgroup, + DEVICE_MAPPER_CONTROL_PATH, perms, true); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", + DEVICE_MAPPER_CONTROL_PATH, + virCgroupGetDevicePermsString(perms), ret); + if (ret < 0) + return ret; + } + + VIR_DEBUG("Deny path %s", src->path);
ret = virCgroupDenyDevicePath(priv->cgroup, src->path, perms, true); diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index d2c25af003..ef4b1e480a 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -101,8 +101,14 @@ virDevMapperGetTargetsImpl(const char *path,
dm_task_no_open_count(dmt);
- if (!dm_task_run(dmt)) + if (!dm_task_run(dmt)) { + if (errno == ENXIO) { + /* In some cases devmapper realizes this late device + * is not managed by it. */
So my question here is that is "some cases" limited to just one device or would it be multiple? Let's be explicit - better to understand now than chase later. I think one only consider Laine's chasing in nwfilter to realize that if we have to do something to handle some special condition as a result of how we use something, then documenting it for future bug chaster to find *in code* as opposed to *in commit messages* may actually help diagnose problems quicker. So for the code and assuming the comments get rearranged a bit, Reviewed-by: John Ferlan <jferlan@redhat.com> John
+ ret = 0; + } goto cleanup; + }
if (!dm_task_get_info(dmt, &info)) goto cleanup;

On 04/29/2018 02:34 PM, John Ferlan wrote:
On 04/23/2018 09:14 AM, Michal Privoznik wrote:
Just like in previous commit, qemu-pr-helper might want to open /dev/mapper/control under certain circumstances. Therefore we have to allow it in cgroups.
The change virdevmapper.c might look spurious but it isn't. After 6dd84f6850ca437 any path that we're allowing in deivces CGroup is
devices
subject to virDevMapperGetTargets() inspection. And libdevmapper returns ENXIO for the path from subject.
^^^ IMO: This explanation (minus the commit id reference) belongs where you check for ENXIO. As a reader of code I don't necessarily check commit messages for reasons a check exists.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_cgroup.c | 33 ++++++++++++++++++++++++++++++--- src/util/virdevmapper.c | 8 +++++++- 2 files changed, 37 insertions(+), 4 deletions(-)
I would say a similar echo here as in the NS patch - since the subsequent patch will have a way to know that PR is running/started, then why not use that knowledge or similar logic to helper determine whether we need to add NS/Cgroup and whether we need to remove the cgroup reference (if that even matters by that point in time).
The check is there ...
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index d88eb7881f..546a4c8e63 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -114,6 +114,8 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, }
+#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control" + static int qemuSetupImageCgroupInternal(virDomainObjPtr vm, virStorageSourcePtr src, @@ -125,6 +127,10 @@ qemuSetupImageCgroupInternal(virDomainObjPtr vm, return 0; }
+ if (virStoragePRDefIsManaged(src->pr) &&
.. here. If PR is managed, i.e. qemu-pr-helper is started by libvirt, then it will be also placed into the same CGroup as qemu. Therefore, we need to allow access to /dev/mapper/control. But, it PR is unmanaged, then the helper process is started by somebody else (we don't care who), and therefore it's their responsibility to allow that path in CGroups (if they are placing the helper into any). The only caveat here is that qemu-pr-helper might access the path only sometimes, depending whether it was compiled with multipath support or not. But that's something libvirt can't know. Therefore it has to assume yes and allow the path.
+ qemuSetupImagePathCgroup(vm, DEVICE_MAPPER_CONTROL_PATH, false) < 0) + return -1; + return qemuSetupImagePathCgroup(vm, src->path, src->readonly || forceReadonly); }
@@ -142,9 +148,8 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, virStorageSourcePtr src) { qemuDomainObjPrivatePtr priv = vm->privateData; - int perms = VIR_CGROUP_DEVICE_READ | - VIR_CGROUP_DEVICE_WRITE | - VIR_CGROUP_DEVICE_MKNOD; + int perms = VIR_CGROUP_DEVICE_RWM; + size_t i; int ret;
if (!virCgroupHasController(priv->cgroup, @@ -157,6 +162,28 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, return 0; }
+ for (i = 0; i < vm->def->ndisks; i++) { + virStorageSourcePtr diskSrc = vm->def->disks[i]->src; + + if (src == diskSrc) + continue; + + if (virStoragePRDefIsManaged(diskSrc->pr)) + break; + } + + if (i == vm->def->ndisks) { + VIR_DEBUG("Disabling device mapper control"); + ret = virCgroupDenyDevicePath(priv->cgroup, + DEVICE_MAPPER_CONTROL_PATH, perms, true); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", + DEVICE_MAPPER_CONTROL_PATH, + virCgroupGetDevicePermsString(perms), ret); + if (ret < 0) + return ret; + } + + VIR_DEBUG("Deny path %s", src->path);
ret = virCgroupDenyDevicePath(priv->cgroup, src->path, perms, true); diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index d2c25af003..ef4b1e480a 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -101,8 +101,14 @@ virDevMapperGetTargetsImpl(const char *path,
dm_task_no_open_count(dmt);
- if (!dm_task_run(dmt)) + if (!dm_task_run(dmt)) { + if (errno == ENXIO) { + /* In some cases devmapper realizes this late device + * is not managed by it. */
So my question here is that is "some cases" limited to just one device or would it be multiple?
To be fair I haven't tested any other devices. Okay, I'll fix the comment.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Thanks, Michal

Just like we allow users overriding path to bridge-helper detected at compile time we can allow them to override path to qemu-pr-helper. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- m4/virt-driver-qemu.m4 | 5 +++++ src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 ++++ src/qemu/qemu_conf.c | 7 ++++++- src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 6 files changed, 18 insertions(+), 1 deletion(-) diff --git a/m4/virt-driver-qemu.m4 b/m4/virt-driver-qemu.m4 index b9bafdab90..80e1d3ad46 100644 --- a/m4/virt-driver-qemu.m4 +++ b/m4/virt-driver-qemu.m4 @@ -57,6 +57,11 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_QEMU], [ [/usr/libexec:/usr/lib/qemu:/usr/lib]) AC_DEFINE_UNQUOTED([QEMU_BRIDGE_HELPER], ["$QEMU_BRIDGE_HELPER"], [QEMU bridge helper]) + AC_PATH_PROG([QEMU_PR_HELPER], [qemu-pr-helper], + [/usr/bin/qemu-pr-helper], + [/usr/bin:/usr/libexec]) + AC_DEFINE_UNQUOTED([QEMU_PR_HELPER], ["$QEMU_PR_HELPER"], + [QEMU PR helper]) ]) AC_DEFUN([LIBVIRT_DRIVER_RESULT_QEMU], [ diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index c19bf3a43a..2dc16e91fd 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -86,6 +86,7 @@ module Libvirtd_qemu = let process_entry = str_entry "hugetlbfs_mount" | bool_entry "clear_emulator_capabilities" | str_entry "bridge_helper" + | str_entry "pr_helper" | bool_entry "set_process_name" | int_entry "max_processes" | int_entry "max_files" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 34441857bd..31738ff19c 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -779,3 +779,7 @@ # This directory is used for memoryBacking source if configured as file. # NOTE: big files will be stored here #memory_backing_dir = "/var/lib/libvirt/qemu/ram" + +# Path to the SCSI persistent reservations helper. This helper is +# used whenever <reservations/> are enabled for SCSI LUN devices. +#pr_helper = "/usr/bin/qemu-pr-helper" diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index bfbb572f01..277ab833a8 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -302,7 +302,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) goto error; } - if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0) + if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0 || + VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0) goto error; cfg->clearEmulatorCapabilities = true; @@ -387,6 +388,7 @@ static void virQEMUDriverConfigDispose(void *obj) } VIR_FREE(cfg->hugetlbfs); VIR_FREE(cfg->bridgeHelperName); + VIR_FREE(cfg->prHelperName); VIR_FREE(cfg->saveImageFormat); VIR_FREE(cfg->dumpImageFormat); @@ -754,6 +756,9 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, if (virConfGetValueString(conf, "bridge_helper", &cfg->bridgeHelperName) < 0) goto cleanup; + if (virConfGetValueString(conf, "pr_helper", &cfg->prHelperName) < 0) + goto cleanup; + if (virConfGetValueBool(conf, "mac_filter", &cfg->macFilter) < 0) goto cleanup; diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index e1ad5463f3..7a63780c48 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -153,6 +153,7 @@ struct _virQEMUDriverConfig { size_t nhugetlbfs; char *bridgeHelperName; + char *prHelperName; bool macFilter; diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index 688e5b9fda..95885e9f06 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -100,3 +100,4 @@ module Test_libvirtd_qemu = { "1" = "mount" } } { "memory_backing_dir" = "/var/lib/libvirt/qemu/ram" } +{ "pr_helper" = "/usr/bin/qemu-pr-helper" } -- 2.16.1

Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec(). Again, despite demand to not store anything in status XML I have to take my chances and store 'priv->prDaemonRunning' (at least) in status XML. Otherwise we might forget whether we started pr-helper after libvirtd restart. What happens if we change default managed alias or default socket is undefined for now, because we don't store them in status XML. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 22 +++++ src/qemu/qemu_domain.h | 2 + src/qemu/qemu_process.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 256 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5ccdeb8e3a..fbe6a0f332 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2040,6 +2040,15 @@ qemuDomainObjPrivateXMLFormatAllowReboot(virBufferPtr buf, } +static void +qemuDomainObjPrivateXMLFormatPR(virBufferPtr buf, + qemuDomainObjPrivatePtr priv) +{ + if (priv->prDaemonRunning) + virBufferAddLit(buf, "<prDaemon running='1'/>\n"); +} + + static int qemuDomainObjPrivateXMLFormatJob(virBufferPtr buf, virDomainObjPtr vm, @@ -2180,6 +2189,8 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, qemuDomainObjPrivateXMLFormatAllowReboot(buf, priv->allowReboot); + qemuDomainObjPrivateXMLFormatPR(buf, priv); + if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1; @@ -2323,6 +2334,15 @@ qemuDomainObjPrivateXMLParseAllowReboot(xmlXPathContextPtr ctxt, } +static void +qemuDomainObjPrivateXMLParsePR(xmlXPathContextPtr ctxt, + bool *prDaemonRunning) +{ + *prDaemonRunning = virXPathBoolean("count(./prDaemon[@running = '1']) > 0", + ctxt) > 0; +} + + static int qemuDomainObjPrivateXMLParseJob(virDomainObjPtr vm, qemuDomainObjPrivatePtr priv, @@ -2572,6 +2592,8 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, qemuDomainObjPrivateXMLParseAllowReboot(ctxt, &priv->allowReboot); + qemuDomainObjPrivateXMLParsePR(ctxt, &priv->prDaemonRunning); + if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 612d234113..b837902e8d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; + + bool prDaemonRunning; }; # define QEMU_DOMAIN_PRIVATE(vm) \ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262ae46..36cfa438ed 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2555,6 +2555,231 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, } +static char * +qemuProcessBuildPRHelperPidfilePath(virDomainObjPtr vm, + const char *prdAlias) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return virPidFileBuildPath(priv->libDir, prdAlias); +} + + +static void +qemuProcessKillPRDaemon(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr orig_err; + const char *prdAlias; + char *pidfile; + + if (!priv->prDaemonRunning) + return; + + if (!(prdAlias = qemuDomainGetManagedPRAlias())) { + VIR_WARN("Unable to get default PR manager alias"); + return; + } + + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) { + VIR_WARN("Unable to construct pr-helper pidfile path"); + return; + } + + virErrorPreserveLast(&orig_err); + if (virPidFileForceCleanupPath(pidfile) < 0) { + VIR_WARN("Unable to kill pr-helper process"); + } else { + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale pidfile %s"), + pidfile); + } else { + priv->prDaemonRunning = false; + } + } + virErrorRestore(&orig_err); + + VIR_FREE(pidfile); +} + + +static int +qemuProcessStartPRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessStartPRDaemon(virDomainObjPtr vm, + const virDomainDiskDef *disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + const char *prdAlias; + int errfd = -1; + char *pidfile = NULL; + int pidfd = -1; + char *socketPath = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (!virStoragePRDefIsManaged(disk->src->pr) || + priv->prDaemonRunning) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + prdAlias = qemuDomainGetManagedPRAlias(); + + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) + goto cleanup; + + /* Just try to acquire. Dummy pid will be replaced later */ + if ((pidfd = virPidFileAcquirePath(pidfile, false, -1)) < 0) + goto cleanup; + + if (!(socketPath = qemuDomainGetPRSocketPath(vm, disk->src->pr))) + goto cleanup; + + /* Remove stale socket */ + if (unlink(socketPath) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale socket path: %s"), + socketPath); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", socketPath, + "-f", pidfile, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + /* We want our virCommand to write child PID into the pidfile + * so that we can read it even before exec(). */ + virCommandSetPidFile(cmd, pidfile); + virCommandSetErrorFD(cmd, &errfd); + + /* Place the process into the same namespace and cgroup as + * qemu (so that it shares the same view of the system). */ + virCommandSetPreExecHook(cmd, qemuProcessStartPRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prdAlias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + char errbuf[1024] = { 0 }; + + if (virFileExists(socketPath)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + if (saferead(errfd, errbuf, sizeof(errbuf) - 1) < 0) { + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prdAlias); + } else { + virReportError(VIR_ERR_OPERATION_FAILED, + _("pr helper died and reported: %s"), errbuf); + } + goto cleanup; + } + + if (!virFileExists(socketPath)) { + virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s", + _("pr helper socked did not show up")); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, socketPath, true) < 0) + goto cleanup; + + priv->prDaemonRunning = true; + ret = 1; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + if (cpid >= 0) + virProcessKillPainfully(cpid, true); + if (pidfile) + unlink(pidfile); + } + virCommandFree(cmd); + VIR_FREE(socketPath); + VIR_FORCE_CLOSE(pidfd); + VIR_FREE(pidfile); + VIR_FORCE_CLOSE(errfd); + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessMaybeStartPRDaemon(virDomainObjPtr vm) +{ + size_t i; + int rv; + + for (i = 0; i < vm->def->ndisks; i++) { + const virDomainDiskDef *disk = vm->def->disks[i]; + + if ((rv = qemuProcessStartPRDaemon(vm, disk)) < 0) + return -1; + + if (rv > 0) + return 1; + } + + return 0; +} + + static int qemuProcessInitPasswords(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -6071,6 +6296,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup; + VIR_DEBUG("Setting up PR daemon"); + if (qemuProcessMaybeStartPRDaemon(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6598,6 +6827,9 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove the master key */ qemuDomainMasterKeyRemove(priv); + /* Do this before we delete the tree and remove pidfile. */ + qemuProcessKillPRDaemon(vm); + virFileDeleteTree(priv->libDir); virFileDeleteTree(priv->channelTargetDir); -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Again, despite demand to not store anything in status XML I have to take my chances and store 'priv->prDaemonRunning' (at least) in status XML. Otherwise we might forget whether we started pr-helper after libvirtd restart.
And again airing review grievances in commit messages is really not something that should be done... BTW: This "logic" perhaps answers concerns I had in patches 6 and 7 regarding how to determine if something is running already and whether it's necessary to add to NS and cgroup or in the case of stopping, then removing from the cgroup... whether removing from NS works or not, I have no idea.
What happens if we change default managed alias or default socket is undefined for now, because we don't store them in status XML.
Do you honestly believe this will happen in the future? The alias is a somewhat dynamic relationship between libvirt and qemu for the purpose of describing an object so that some future libvirt can find that object using the same alias. If something in the future changes such that there could be multiple objects, then we know we can query a running qemu for the current alias in order to retrieve the object. The future would then need to generate aliases based upon some new naming scheme and we can handle it at that time. For now, there is a one-to-one relationship to worry about. Likewise, for the socket path, if something changes in the future which then has multiple socket paths, then those paths would need to have something unique included. If we ourselves decide to change the name of the alias, then we have to handle fallout. If the pr-helper socket path changes either by name or path to it, then we still know the old path. Whomever makes a change to the path would be required to handle all those current consumers for a possibly running domain. If you really believe this can/will happen, then lets go back to patch 4 and revisit the statement : "For managed PR helper the alias is always "pr-helper0" and socket path "${vm->priv->libDir}/pr-helper0.sock"."
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 22 +++++ src/qemu/qemu_domain.h | 2 + src/qemu/qemu_process.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 256 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5ccdeb8e3a..fbe6a0f332 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2040,6 +2040,15 @@ qemuDomainObjPrivateXMLFormatAllowReboot(virBufferPtr buf, }
+static void +qemuDomainObjPrivateXMLFormatPR(virBufferPtr buf, + qemuDomainObjPrivatePtr priv) +{ + if (priv->prDaemonRunning) + virBufferAddLit(buf, "<prDaemon running='1'/>\n"); +}
Why not follow the existing @fakeReboot? or chardevStdioLogd? Is there belief that at some point in the future 'running' could be 0 or 1? Or could this be a tristate? If there is, then it shouldn't be a boolean in the .h file.
+ + static int qemuDomainObjPrivateXMLFormatJob(virBufferPtr buf, virDomainObjPtr vm, @@ -2180,6 +2189,8 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf,
qemuDomainObjPrivateXMLFormatAllowReboot(buf, priv->allowReboot);
+ qemuDomainObjPrivateXMLFormatPR(buf, priv); + if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1;
@@ -2323,6 +2334,15 @@ qemuDomainObjPrivateXMLParseAllowReboot(xmlXPathContextPtr ctxt, }
+static void +qemuDomainObjPrivateXMLParsePR(xmlXPathContextPtr ctxt, + bool *prDaemonRunning) +{ + *prDaemonRunning = virXPathBoolean("count(./prDaemon[@running = '1']) > 0", + ctxt) > 0; +} + + static int qemuDomainObjPrivateXMLParseJob(virDomainObjPtr vm, qemuDomainObjPrivatePtr priv, @@ -2572,6 +2592,8 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt,
qemuDomainObjPrivateXMLParseAllowReboot(ctxt, &priv->allowReboot);
+ qemuDomainObjPrivateXMLParsePR(ctxt, &priv->prDaemonRunning); + if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error;
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 612d234113..b837902e8d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; +
Similar to the other recent additions, please add a comment to describe what the purpose is.
+ bool prDaemonRunning; };
# define QEMU_DOMAIN_PRIVATE(vm) \ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262ae46..36cfa438ed 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2555,6 +2555,231 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
+static char * +qemuProcessBuildPRHelperPidfilePath(virDomainObjPtr vm, + const char *prdAlias) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return virPidFileBuildPath(priv->libDir, prdAlias); +} + + +static void +qemuProcessKillPRDaemon(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr orig_err; + const char *prdAlias; + char *pidfile; + + if (!priv->prDaemonRunning) + return; + + if (!(prdAlias = qemuDomainGetManagedPRAlias())) { + VIR_WARN("Unable to get default PR manager alias"); + return; + }
This makes zero sense as it cannot be NULL... Much easier to replace prdAlias with QEMU_DOMAIN_MANAGED_PR_ALIAS
+ + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) { + VIR_WARN("Unable to construct pr-helper pidfile path"); + return; + } + + virErrorPreserveLast(&orig_err); + if (virPidFileForceCleanupPath(pidfile) < 0) { + VIR_WARN("Unable to kill pr-helper process"); + } else { + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale pidfile %s"), + pidfile); + } else { + priv->prDaemonRunning = false; + } + } + virErrorRestore(&orig_err); + + VIR_FREE(pidfile); +} + + +static int +qemuProcessStartPRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; +
hmmmm... so when we hot plug if not already running, then this would be run, right? So this sets up NS appropriately and thus the cgroup magic as part of that... So not "obvious" from prior to patches... Maybe it's just an ordering and thought thing, but I do feel better now about the previous 2 patches and whether they can be run multiple times or not as the case seems to be.
+ ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessStartPRDaemon(virDomainObjPtr vm, + const virDomainDiskDef *disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + const char *prdAlias; + int errfd = -1; + char *pidfile = NULL; + int pidfd = -1; + char *socketPath = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (!virStoragePRDefIsManaged(disk->src->pr) || + priv->prDaemonRunning) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + prdAlias = qemuDomainGetManagedPRAlias(); + + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) + goto cleanup; + + /* Just try to acquire. Dummy pid will be replaced later */ + if ((pidfd = virPidFileAcquirePath(pidfile, false, -1)) < 0) + goto cleanup; + + if (!(socketPath = qemuDomainGetPRSocketPath(vm, disk->src->pr))) + goto cleanup; + + /* Remove stale socket */ + if (unlink(socketPath) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale socket path: %s"), + socketPath); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", socketPath, + "-f", pidfile, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + /* We want our virCommand to write child PID into the pidfile + * so that we can read it even before exec(). */ + virCommandSetPidFile(cmd, pidfile); + virCommandSetErrorFD(cmd, &errfd); + + /* Place the process into the same namespace and cgroup as + * qemu (so that it shares the same view of the system). */ + virCommandSetPreExecHook(cmd, qemuProcessStartPRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prdAlias);
Honestly the alias output has nothing to do with the message - seems like it's here to prove a point. If anything what's more important here is 'cfg->prHelperName' because that's what died not the alias.
+ goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + char errbuf[1024] = { 0 }; + + if (virFileExists(socketPath)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + if (saferead(errfd, errbuf, sizeof(errbuf) - 1) < 0) { + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prdAlias);
ditto
+ } else { + virReportError(VIR_ERR_OPERATION_FAILED, + _("pr helper died and reported: %s"), errbuf); + } + goto cleanup; + } + + if (!virFileExists(socketPath)) { + virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s", + _("pr helper socked did not show up")); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, socketPath, true) < 0) + goto cleanup; + + priv->prDaemonRunning = true; + ret = 1; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + if (cpid >= 0) + virProcessKillPainfully(cpid, true); + if (pidfile) + unlink(pidfile); + } + virCommandFree(cmd); + VIR_FREE(socketPath); + VIR_FORCE_CLOSE(pidfd); + VIR_FREE(pidfile); + VIR_FORCE_CLOSE(errfd); + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessMaybeStartPRDaemon(virDomainObjPtr vm) +{ + size_t i; + int rv; + + for (i = 0; i < vm->def->ndisks; i++) { + const virDomainDiskDef *disk = vm->def->disks[i]; + + if ((rv = qemuProcessStartPRDaemon(vm, disk)) < 0) + return -1; + + if (rv > 0) + return 1; + } + + return 0; +} + + static int qemuProcessInitPasswords(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -6071,6 +6296,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup;
+ VIR_DEBUG("Setting up PR daemon"); + if (qemuProcessMaybeStartPRDaemon(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6598,6 +6827,9 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove the master key */ qemuDomainMasterKeyRemove(priv);
+ /* Do this before we delete the tree and remove pidfile. */ + qemuProcessKillPRDaemon(vm); +
I find it ironic that this "ProcessMaybeStart", but "ProcessKill" here. IDC because the logic is fine and it's just a naming thing If you change to using boolean or at least successfully argue for keeping the "running = '1'" in the XML and with the assumption that you have already gone with the suggested alias alterations and thus apply the attached patch, Reviewed-by: John Ferlan <jferlan@redhat.com> John
virFileDeleteTree(priv->libDir); virFileDeleteTree(priv->channelTargetDir);

On 04/29/2018 07:01 PM, John Ferlan wrote:
On 04/23/2018 09:14 AM, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Again, despite demand to not store anything in status XML I have to take my chances and store 'priv->prDaemonRunning' (at least) in status XML. Otherwise we might forget whether we started pr-helper after libvirtd restart.
And again airing review grievances in commit messages is really not something that should be done...
BTW: This "logic" perhaps answers concerns I had in patches 6 and 7 regarding how to determine if something is running already and whether it's necessary to add to NS and cgroup or in the case of stopping, then removing from the cgroup... whether removing from NS works or not, I have no idea.
What happens if we change default managed alias or default socket is undefined for now, because we don't store them in status XML.
Do you honestly believe this will happen in the future? The alias is a somewhat dynamic relationship between libvirt and qemu for the purpose of describing an object so that some future libvirt can find that object using the same alias. If something in the future changes such that there could be multiple objects, then we know we can query a running qemu for the current alias in order to retrieve the object.
Can we? I'm not sure we can do that, IOW I don't know how to do that.
The future would then need to generate aliases based upon some new naming scheme and we can handle it at that time. For now, there is a one-to-one relationship to worry about. Likewise, for the socket path, if something changes in the future which then has multiple socket paths, then those paths would need to have something unique included.
If we ourselves decide to change the name of the alias, then we have to handle fallout. If the pr-helper socket path changes either by name or path to it, then we still know the old path. Whomever makes a change to the path would be required to handle all those current consumers for a possibly running domain.
Which is going to be ugly. We would have to try and guess aliases whereas if we save it in the status XML we know exactly the alias we need. Look at it from different angle - step away from pr-manager object and think of generic device. We generate alias for it. And save it in the status XML. Why? We can just generate the alias whenever we need it. Also, we know all the changes made to alias generating code so we can try all older versions until we find the right one. Or we can query for it (assuming answer to my question above is yes).
If you really believe this can/will happen, then lets go back to patch 4 and revisit the statement :
"For managed PR helper the alias is always "pr-helper0" and socket path "${vm->priv->libDir}/pr-helper0.sock"."
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 22 +++++ src/qemu/qemu_domain.h | 2 + src/qemu/qemu_process.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 256 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5ccdeb8e3a..fbe6a0f332 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2040,6 +2040,15 @@ qemuDomainObjPrivateXMLFormatAllowReboot(virBufferPtr buf, }
+static void +qemuDomainObjPrivateXMLFormatPR(virBufferPtr buf, + qemuDomainObjPrivatePtr priv) +{ + if (priv->prDaemonRunning) + virBufferAddLit(buf, "<prDaemon running='1'/>\n"); +}
Why not follow the existing @fakeReboot? or chardevStdioLogd?
Is there belief that at some point in the future 'running' could be 0 or 1? Or could this be a tristate? If there is, then it shouldn't be a boolean in the .h file.
Okay.
+ + static int qemuDomainObjPrivateXMLFormatJob(virBufferPtr buf, virDomainObjPtr vm, @@ -2180,6 +2189,8 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf,
qemuDomainObjPrivateXMLFormatAllowReboot(buf, priv->allowReboot);
+ qemuDomainObjPrivateXMLFormatPR(buf, priv); + if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1;
@@ -2323,6 +2334,15 @@ qemuDomainObjPrivateXMLParseAllowReboot(xmlXPathContextPtr ctxt, }
+static void +qemuDomainObjPrivateXMLParsePR(xmlXPathContextPtr ctxt, + bool *prDaemonRunning) +{ + *prDaemonRunning = virXPathBoolean("count(./prDaemon[@running = '1']) > 0", + ctxt) > 0; +} + + static int qemuDomainObjPrivateXMLParseJob(virDomainObjPtr vm, qemuDomainObjPrivatePtr priv, @@ -2572,6 +2592,8 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt,
qemuDomainObjPrivateXMLParseAllowReboot(ctxt, &priv->allowReboot);
+ qemuDomainObjPrivateXMLParsePR(ctxt, &priv->prDaemonRunning); + if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error;
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 612d234113..b837902e8d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; +
Similar to the other recent additions, please add a comment to describe what the purpose is.
+ bool prDaemonRunning; };
# define QEMU_DOMAIN_PRIVATE(vm) \ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262ae46..36cfa438ed 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2555,6 +2555,231 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
+static char * +qemuProcessBuildPRHelperPidfilePath(virDomainObjPtr vm, + const char *prdAlias) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return virPidFileBuildPath(priv->libDir, prdAlias); +} + + +static void +qemuProcessKillPRDaemon(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr orig_err; + const char *prdAlias; + char *pidfile; + + if (!priv->prDaemonRunning) + return; + + if (!(prdAlias = qemuDomainGetManagedPRAlias())) { + VIR_WARN("Unable to get default PR manager alias"); + return; + }
This makes zero sense as it cannot be NULL... Much easier to replace prdAlias with QEMU_DOMAIN_MANAGED_PR_ALIAS
+ + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) { + VIR_WARN("Unable to construct pr-helper pidfile path"); + return; + } + + virErrorPreserveLast(&orig_err); + if (virPidFileForceCleanupPath(pidfile) < 0) { + VIR_WARN("Unable to kill pr-helper process"); + } else { + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale pidfile %s"), + pidfile); + } else { + priv->prDaemonRunning = false; + } + } + virErrorRestore(&orig_err); + + VIR_FREE(pidfile); +} + + +static int +qemuProcessStartPRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; +
hmmmm... so when we hot plug if not already running, then this would be run, right? So this sets up NS appropriately and thus the cgroup magic as part of that... So not "obvious" from prior to patches... Maybe it's just an ordering and thought thing, but I do feel better now about the previous 2 patches and whether they can be run multiple times or not as the case seems to be.
+ ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessStartPRDaemon(virDomainObjPtr vm, + const virDomainDiskDef *disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + const char *prdAlias; + int errfd = -1; + char *pidfile = NULL; + int pidfd = -1; + char *socketPath = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (!virStoragePRDefIsManaged(disk->src->pr) || + priv->prDaemonRunning) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + prdAlias = qemuDomainGetManagedPRAlias(); + + if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) + goto cleanup; + + /* Just try to acquire. Dummy pid will be replaced later */ + if ((pidfd = virPidFileAcquirePath(pidfile, false, -1)) < 0) + goto cleanup; + + if (!(socketPath = qemuDomainGetPRSocketPath(vm, disk->src->pr))) + goto cleanup; + + /* Remove stale socket */ + if (unlink(socketPath) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale socket path: %s"), + socketPath); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", socketPath, + "-f", pidfile, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + /* We want our virCommand to write child PID into the pidfile + * so that we can read it even before exec(). */ + virCommandSetPidFile(cmd, pidfile); + virCommandSetErrorFD(cmd, &errfd); + + /* Place the process into the same namespace and cgroup as + * qemu (so that it shares the same view of the system). */ + virCommandSetPreExecHook(cmd, qemuProcessStartPRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prdAlias);
Honestly the alias output has nothing to do with the message - seems like it's here to prove a point.
If anything what's more important here is 'cfg->prHelperName' because that's what died not the alias.
+ goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + char errbuf[1024] = { 0 }; + + if (virFileExists(socketPath)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + if (saferead(errfd, errbuf, sizeof(errbuf) - 1) < 0) { + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prdAlias);
ditto
+ } else { + virReportError(VIR_ERR_OPERATION_FAILED, + _("pr helper died and reported: %s"), errbuf); + } + goto cleanup; + } + + if (!virFileExists(socketPath)) { + virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s", + _("pr helper socked did not show up")); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, socketPath, true) < 0) + goto cleanup; + + priv->prDaemonRunning = true; + ret = 1; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + if (cpid >= 0) + virProcessKillPainfully(cpid, true); + if (pidfile) + unlink(pidfile); + } + virCommandFree(cmd); + VIR_FREE(socketPath); + VIR_FORCE_CLOSE(pidfd); + VIR_FREE(pidfile); + VIR_FORCE_CLOSE(errfd); + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessMaybeStartPRDaemon(virDomainObjPtr vm) +{ + size_t i; + int rv; + + for (i = 0; i < vm->def->ndisks; i++) { + const virDomainDiskDef *disk = vm->def->disks[i]; + + if ((rv = qemuProcessStartPRDaemon(vm, disk)) < 0) + return -1; + + if (rv > 0) + return 1; + } + + return 0; +} + + static int qemuProcessInitPasswords(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -6071,6 +6296,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup;
+ VIR_DEBUG("Setting up PR daemon"); + if (qemuProcessMaybeStartPRDaemon(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6598,6 +6827,9 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove the master key */ qemuDomainMasterKeyRemove(priv);
+ /* Do this before we delete the tree and remove pidfile. */ + qemuProcessKillPRDaemon(vm); +
I find it ironic that this "ProcessMaybeStart", but "ProcessKill" here. IDC because the logic is fine and it's just a naming thing
If you change to using boolean or at least successfully argue for keeping the "running = '1'" in the XML and with the assumption that you have already gone with the suggested alias alterations and thus apply the attached patch,
Reviewed-by: John Ferlan <jferlan@redhat.com>
I've done all the changes. Thanks, Michal

When attaching a disk that requires pr-manager we might need to plug the pr-manager object and start the pr-helper process. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_process.c | 4 +-- src/qemu/qemu_process.h | 5 +++ 3 files changed, 97 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index df9e8aa716..2e736873d1 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,63 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } +/** + * qemuDomainMaybeStartPRDaemon: + * @vm: domain object + * @disk: disk to hotplug + * + * Checks if it's needed to start qemu-pr-helper and starts it. + * + * Returns: 0 if qemu-pr-helper is not needed + * 1 if it is needed and was started + * -1 otherwise. + */ +static int +qemuDomainMaybeStartPRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (!virStoragePRDefIsManaged(disk->src->pr)) { + /* @disk itself does not require qemu-pr-helper. */ + return 0; + } + + if (priv->prDaemonRunning) { + /* @disk requires qemu-pr-helper but there's already one running. */ + return 0; + } + + /* @disk requires qemu-pr-helper but none is running. + * Start it now. */ + if (qemuProcessStartPRDaemon(vm, disk) < 0) + return -1; + + return 1; +} + + +static int +qemuMaybeBuildPRManagerInfoProps(virDomainObjPtr vm, + const virDomainDiskDef *disk, + virJSONValuePtr *propsret, + char **aliasret) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + *propsret = NULL; + *aliasret = NULL; + + if (virStoragePRDefIsManaged(disk->src->pr) && + priv->prDaemonRunning) { + /* @disk requires qemu-pr-helper but there's already one running. */ + return 0; + } + + return qemuBuildPRManagerInfoProps(vm, disk, propsret, aliasret); +} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +422,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + char *prmgrAlias = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; + bool prdStarted = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -398,6 +459,17 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, if (encinfo && qemuBuildSecretInfoProps(encinfo, &encobjProps) < 0) goto error; + if (qemuMaybeBuildPRManagerInfoProps(vm, disk, &prmgrProps, &prmgrAlias) < 0) + goto error; + + /* Start daemon only after prmgrProps is built. Otherwise + * qemuDomainMaybeStartPRDaemon() might start daemon and set + * priv->prDaemonRunning which confuses props building code. */ + if ((rv = qemuDomainMaybeStartPRDaemon(vm, disk)) < 0) + goto error; + else if (rv > 0) + prdStarted = true; + if (disk->src->haveTLS && qemuDomainAddDiskSrcTLSObject(driver, vm, disk->src, disk->info.alias) < 0) @@ -435,6 +507,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; } + if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prmgrAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } + if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -453,12 +534,14 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ret = 0; cleanup: - virJSONValueFree(secobjProps); + virJSONValueFree(prmgrProps); virJSONValueFree(encobjProps); + virJSONValueFree(secobjProps); qemuDomainSecretDiskDestroy(disk); - VIR_FREE(devstr); - VIR_FREE(drivestr); + VIR_FREE(prmgrAlias); VIR_FREE(drivealias); + VIR_FREE(drivestr); + VIR_FREE(devstr); virObjectUnref(cfg); return ret; @@ -472,6 +555,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prmgrAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +566,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + if (prdStarted) + qemuProcessKillPRDaemon(vm); goto cleanup; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 36cfa438ed..b0c22001d1 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2565,7 +2565,7 @@ qemuProcessBuildPRHelperPidfilePath(virDomainObjPtr vm, } -static void +void qemuProcessKillPRDaemon(virDomainObjPtr vm) { qemuDomainObjPrivatePtr priv = vm->privateData; @@ -2629,7 +2629,7 @@ qemuProcessStartPRDaemonHook(void *opaque) } -static int +int qemuProcessStartPRDaemon(virDomainObjPtr vm, const virDomainDiskDef *disk) { diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 3bf66f71b7..eda6695415 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -205,4 +205,9 @@ int qemuProcessRefreshDisks(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); +int qemuProcessStartPRDaemon(virDomainObjPtr vm, + const virDomainDiskDef *disk); + +void qemuProcessKillPRDaemon(virDomainObjPtr vm); + #endif /* __QEMU_PROCESS_H__ */ -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
When attaching a disk that requires pr-manager we might need to plug the pr-manager object and start the pr-helper process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_process.c | 4 +-- src/qemu/qemu_process.h | 5 +++ 3 files changed, 97 insertions(+), 5 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

If we are the last one to use pr-manager object we need to remove it and also kill the qemu-pr-helper process. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2e736873d1..e9dd8a4532 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3824,6 +3824,49 @@ static bool qemuIsMultiFunctionDevice(virDomainDefPtr def, } +static int +qemuDomainDiskNeedRemovePR(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + char **aliasret, + bool *stopDaemon) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + size_t i; + + *aliasret = NULL; + *stopDaemon = false; + + if (!virStoragePRDefIsEnabled(disk->src->pr)) + return 0; + + if (!virStoragePRDefIsManaged(disk->src->pr)) { + *aliasret = qemuDomainGetUnmanagedPRAlias(disk); + return *aliasret ? 0 : -1; + } + + for (i = 0; i < vm->def->ndisks; i++) { + const virDomainDiskDef *domainDisk = vm->def->disks[i]; + + if (domainDisk == disk) + continue; + + if (virStoragePRDefIsManaged(domainDisk->src->pr)) + break; + } + + if (i != vm->def->ndisks) + return 0; + + if (VIR_STRDUP(*aliasret, qemuDomainGetManagedPRAlias()) < 0) + return -1; + + if (priv->prDaemonRunning) + *stopDaemon = true; + + return 0; +} + + static int qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -3837,6 +3880,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, char *drivestr; char *objAlias = NULL; char *encAlias = NULL; + char *prmgrAlias = NULL; + bool stopPRDaemon = false; VIR_DEBUG("Removing disk %s from domain %p %s", disk->info.alias, vm, vm->def->name); @@ -3874,6 +3919,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + if (qemuDomainDiskNeedRemovePR(vm, disk, &prmgrAlias, &stopPRDaemon) < 0) + return -1; + qemuDomainObjEnterMonitor(driver, vm); qemuMonitorDriveDel(priv->mon, drivestr); @@ -3889,6 +3937,11 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, encAlias)); VIR_FREE(encAlias); + /* If it fails, then so be it - it was a best shot */ + if (prmgrAlias) + ignore_value(qemuMonitorDelObject(priv->mon, prmgrAlias)); + VIR_FREE(prmgrAlias); + if (disk->src->haveTLS) ignore_value(qemuMonitorDelObject(priv->mon, disk->src->tlsAlias)); @@ -3907,6 +3960,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + if (stopPRDaemon) + qemuProcessKillPRDaemon(vm); + qemuDomainReleaseDeviceAddress(vm, &disk->info, src); if (qemuSecurityRestoreDiskLabel(driver, vm, disk) < 0) -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
If we are the last one to use pr-manager object we need to remove it and also kill the qemu-pr-helper process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2e736873d1..e9dd8a4532 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3824,6 +3824,49 @@ static bool qemuIsMultiFunctionDevice(virDomainDefPtr def, }
+static int +qemuDomainDiskNeedRemovePR(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + char **aliasret, + bool *stopDaemon) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + size_t i; + + *aliasret = NULL; + *stopDaemon = false; + + if (!virStoragePRDefIsEnabled(disk->src->pr)) + return 0; + + if (!virStoragePRDefIsManaged(disk->src->pr)) { + *aliasret = qemuDomainGetUnmanagedPRAlias(disk); + return *aliasret ? 0 : -1; + } + + for (i = 0; i < vm->def->ndisks; i++) { + const virDomainDiskDef *domainDisk = vm->def->disks[i]; + + if (domainDisk == disk) + continue; + + if (virStoragePRDefIsManaged(domainDisk->src->pr)) + break; + } + + if (i != vm->def->ndisks) + return 0; + + if (VIR_STRDUP(*aliasret, qemuDomainGetManagedPRAlias()) < 0)
and again here a simple: if (VIR_STRDUP(*aliasret, QEMU_DOMAIN_MANAGED_PR_ALIAS) < 0) would suffice Reviewed-by: John Ferlan <jferlan@redhat.com> John /me waits for the tester that starts/stops 100's of times in a row ;-) [...]

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + 6 files changed, 6 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index d67b859d75..dea41dfbd0 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1100,6 +1100,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "virtio-mouse-ccw", QEMU_CAPS_DEVICE_VIRTIO_MOUSE_CCW }, { "virtio-tablet-ccw", QEMU_CAPS_DEVICE_VIRTIO_TABLET_CCW }, { "pcie-pci-bridge", QEMU_CAPS_DEVICE_PCIE_PCI_BRIDGE }, + { "pr-manager-helper", QEMU_CAPS_PR_MANAGER_HELPER }, }; static struct virQEMUCapsStringFlags virQEMUCapsDevicePropsVirtioBalloon[] = { diff --git a/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml b/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml index 6dd392502e..70fd047fa8 100644 --- a/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml @@ -118,6 +118,7 @@ <flag name='qcow2-luks'/> <flag name='seccomp-blacklist'/> <flag name='disk-write-cache'/> + <flag name='pr-manager-helper'/> <version>2011000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>342058</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml index 31c5d0dd23..fbfb41f354 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml @@ -157,6 +157,7 @@ <flag name='seccomp-blacklist'/> <flag name='query-cpus-fast'/> <flag name='disk-write-cache'/> + <flag name='pr-manager-helper'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>342346</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml index 7dead4a1f4..d8975299e4 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml @@ -154,6 +154,7 @@ <flag name='seccomp-blacklist'/> <flag name='query-cpus-fast'/> <flag name='disk-write-cache'/> + <flag name='pr-manager-helper'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>419215</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml index 70ae8f91c7..1c93993920 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml @@ -119,6 +119,7 @@ <flag name='seccomp-blacklist'/> <flag name='query-cpus-fast'/> <flag name='disk-write-cache'/> + <flag name='pr-manager-helper'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>0</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml index d809a78380..adb751f596 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml @@ -195,6 +195,7 @@ <flag name='seccomp-blacklist'/> <flag name='query-cpus-fast'/> <flag name='disk-write-cache'/> + <flag name='pr-manager-helper'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>390060</microcodeVersion> -- 2.16.1

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + 6 files changed, 6 insertions(+)
Now that the capability exists in latest, the qemuxml2argvtest requires changes... We need to be sure to follow the new CAPS_LATEST convention even though this works without it, was started before it, but must capitulate so that future patches cannot point at this and say, well this one didn't have to... 1. git mv tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args \ tests/qemuxml2argvdata/disk-virtio-scsi-reservations.x86_64-latest.args 2. edit tests/qemuxml2argvtest.c w/ diff: - DO_TEST("disk-virtio-scsi-reservations", - QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, - QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + DO_TEST_CAPS_LATEST("disk-virtio-scsi-reservations"); 3. VIR_TEST_REGENERATE_OUTPUT=1 tests/qemuxml2argvtest and add/merge into here. or move this to earlier and do the above earlier - IDC either way, then Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 04/23/2018 09:14 AM, Michal Privoznik wrote:
v5 of:
https://www.redhat.com/archives/libvir-list/2018-April/msg00736.html
diff to v4: - John's review worked in. Partially.
Michal Privoznik (11): virstoragefile: Introduce virStoragePRDef qemuDomainDiskChangeSupported: Deny changing reservations qemu: Introduce pr-manager-helper capability qemu: Generate pr cmd line at startup qemu_ns: Allow /dev/mapper/control for PR qemu_cgroup: Allow /dev/mapper/control for PR qemu: Introduce pr_helper to qemu.conf qemu: Start PR daemon on domain startup qemu_hotplug: Hotplug of reservations qemu_hotplug: Hotunplug of reservations qemu: Detect pr-manager-helper capability
Based on what I've also reviewed for Stefan Berger's TPM emulator: https://www.redhat.com/archives/libvir-list/2018-May/msg00366.html and in particular patch 14 for cgroup resource mgmt. Would you believe the same concept would need to apply to pr_helper? At least w/r/t how many cgroup controlled resources the pr_helper should/would be allowed. We place similar controls on other threads (cpu, iothread, emulator). I realize a TPM emulator and a pr_helper thread are very different in what they are doing, but there are similarities that probably need to be addressed. John
docs/formatdomain.html.in | 23 +- docs/schemas/domaincommon.rng | 34 +-- docs/schemas/storagecommon.rng | 50 +++++ m4/virt-driver-qemu.m4 | 5 + src/conf/domain_conf.c | 38 ++++ src/libvirt_private.syms | 6 + src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 + src/qemu/qemu_alias.c | 18 ++ src/qemu/qemu_alias.h | 4 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_cgroup.c | 33 ++- src/qemu/qemu_command.c | 134 ++++++++++++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_conf.c | 7 +- src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 66 ++++++ src/qemu/qemu_domain.h | 5 + src/qemu/qemu_hotplug.c | 149 ++++++++++++- src/qemu/qemu_process.c | 232 +++++++++++++++++++++ src/qemu/qemu_process.h | 6 + src/qemu/test_libvirtd_qemu.aug.in | 1 + src/util/virdevmapper.c | 8 +- src/util/virstoragefile.c | 164 +++++++++++++++ src/util/virstoragefile.h | 18 ++ tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + .../disk-virtio-scsi-reservations.args | 38 ++++ .../disk-virtio-scsi-reservations.xml | 49 +++++ tests/qemuxml2argvtest.c | 4 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 2 + 36 files changed, 1075 insertions(+), 39 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
participants (2)
-
John Ferlan
-
Michal Privoznik