[libvirt] [PATCH v3 00/12] Basic implementation of persistent reservations

v3 of: https://www.redhat.com/archives/libvir-list/2018-February/msg01021.html Diff to v2: - John's review worked in The event on pr-helper process dying is still not implemented as qemu has no implementation yet. Michal Privoznik (12): virstoragefile: Introduce virStoragePRDef qemuDomainDiskChangeSupported: Deny changing reservations qemu: Introduce pr-manager-helper capability qemu: Generate alias and socket path for pr-helper qemu: Store pr runtime data in status XML qemu: Generate cmd line at startup qemu: Introduce pr_helper to qemu.conf qemu_domain: Track pr-helper PID in status XML qemu: Start PR daemon on domain startup qemu_hotplug: Hotplug of reservations qemu_hotplug: Hotunplug of reservations qemu: Detect pr-manager-helper capability docs/formatdomain.html.in | 25 ++- docs/schemas/domaincommon.rng | 34 +-- docs/schemas/storagecommon.rng | 50 +++++ m4/virt-driver-qemu.m4 | 5 + src/conf/domain_conf.c | 36 ++++ src/libvirt_private.syms | 6 + src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 48 +++++ src/qemu/qemu_conf.c | 7 +- src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 238 ++++++++++++++++++--- src/qemu/qemu_domain.h | 15 ++ src/qemu/qemu_hotplug.c | 94 ++++++++ src/qemu/qemu_process.c | 193 +++++++++++++++++ src/qemu/qemu_process.h | 7 + src/qemu/test_libvirtd_qemu.aug.in | 1 + src/util/virstoragefile.c | 181 ++++++++++++++++ src/util/virstoragefile.h | 19 ++ tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + .../disk-virtio-scsi-reservations-not-managed.args | 29 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 41 ++++ .../disk-virtio-scsi-reservations.args | 29 +++ .../disk-virtio-scsi-reservations.xml | 39 ++++ tests/qemuxml2argvtest.c | 8 + .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 30 files changed, 1064 insertions(+), 57 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml -- 2.16.1

This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this: <reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations> If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here: https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 34 +---- docs/schemas/storagecommon.rng | 50 +++++++ src/conf/domain_conf.c | 36 +++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 148 +++++++++++++++++++++ src/util/virstoragefile.h | 15 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 41 ++++++ .../disk-virtio-scsi-reservations.xml | 39 ++++++ .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 12 files changed, 366 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 6fd2189cd..cbdc887a0 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2563,7 +2563,10 @@ </disk> <disk type='block' device='lun'> <driver name='qemu' type='raw'/> - <source dev='/dev/sda'/> + <source dev='/dev/sda'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper' mode='client'/> + </reservations> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='3' unit='0'/> </disk> @@ -2926,6 +2929,26 @@ <a href="formatstorageencryption.html">Storage Encryption</a> page for more information. </dd> + <dt><code>reservations</code></dt> + <dd><span class="since">Since libvirt 4.1.0</span>, the + <code>reservations</code> can be a sub-element of the + <code>source</code> element for storage sources (QEMU driver only). + If present (and enabled) it enabled persistent reservations for + SCSI based disks. The element has one mandatory attribute + <code>enabled</code> with accepted values <code>yes</code> and + <code>no</code>. If the feature is enabled, then there's another + mandatory attribute <code>managed</code> (accepted values are the + same as for <code>enabled</code>) that enables or disables libvirt + spawning any helper process (should one be needed). However, if + libvirt is not enabled spawning helper process (i.e. hypervisor + should just use one which is already running), path to its socket + must be provided in child element <code>source</code>, which + currently accepts only the following attributes: <code>type</code> + with one value <code>unix</code>, <code>path</code> with path the + socket, and finally <code>mode</code> which accepts either + <code>server</code> or <code>client</code> and specifies the role + of hypervisor. + </dd> </dl> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 8165e699d..198d4400b 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1530,6 +1530,9 @@ <optional> <ref name="encryption"/> </optional> + <optional> + <ref name="reservations"/> + </optional> <zeroOrMore> <ref name='devSeclabel'/> </zeroOrMore> @@ -2431,18 +2434,6 @@ </attribute> </optional> </define> - <define name="reconnect"> - <element name="reconnect"> - <attribute name="enabled"> - <ref name="virYesNo"/> - </attribute> - <optional> - <attribute name="timeout"> - <ref name="unsignedInt"/> - </attribute> - </optional> - </element> - </define> <!-- An interface description can either be of type bridge in which case @@ -2491,24 +2482,7 @@ <value>vhostuser</value> </attribute> <interleave> - <element name="source"> - <attribute name="type"> - <value>unix</value> - </attribute> - <attribute name="path"> - <ref name="absFilePath"/> - </attribute> - <attribute name="mode"> - <choice> - <value>server</value> - <value>client</value> - </choice> - </attribute> - <optional> - <ref name="reconnect"/> - </optional> - <empty/> - </element> + <ref name="unixSocketSource"/> <ref name="interface-options"/> </interleave> </group> diff --git a/docs/schemas/storagecommon.rng b/docs/schemas/storagecommon.rng index edee1b084..eed0b3334 100644 --- a/docs/schemas/storagecommon.rng +++ b/docs/schemas/storagecommon.rng @@ -39,6 +39,56 @@ </element> </define> + <define name="reconnect"> + <element name="reconnect"> + <attribute name="enabled"> + <ref name="virYesNo"/> + </attribute> + <optional> + <attribute name="timeout"> + <ref name="unsignedInt"/> + </attribute> + </optional> + </element> + </define> + + <define name='unixSocketSource'> + <element name="source"> + <attribute name="type"> + <value>unix</value> + </attribute> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + <attribute name="mode"> + <choice> + <value>server</value> + <value>client</value> + </choice> + </attribute> + <optional> + <ref name="reconnect"/> + </optional> + <empty/> + </element> + </define> + + <define name='reservations'> + <element name='reservations'> + <attribute name='enabled'> + <ref name='virYesNo'/> + </attribute> + <optional> + <attribute name='managed'> + <ref name='virYesNo'/> + </attribute> + </optional> + <optional> + <ref name='unixSocketSource'/> + </optional> + </element> + </define> + <define name='secret'> <element name='secret'> <attribute name='type'> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 86fc27511..00b729fc2 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5201,6 +5201,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } } + if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun disks")); + return -1; + } + /* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8586,6 +8593,29 @@ virDomainDiskSourcePrivateDataParse(xmlNodePtr node, } +static int +virDomainDiskSourcePRParse(xmlNodePtr node, + virStoragePRDefPtr *prsrc) +{ + xmlNodePtr child; + virStoragePRDefPtr pr = NULL; + + for (child = node->children; child; child = child->next) { + if (child->type == XML_ELEMENT_NODE && + virXMLNodeNameEqual(child, "reservations")) { + + if (!(pr = virStoragePRDefParseNode(node->doc, child))) + return -1; + + *prsrc = pr; + return 0; + } + } + + return 0; +} + + static int virDomainStorageSourceParse(xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -8632,6 +8662,9 @@ virDomainStorageSourceParse(xmlNodePtr node, !(src->encryption = virStorageEncryptionParseNode(tmp, ctxt))) goto cleanup; + if (virDomainDiskSourcePRParse(node, &src->pr) < 0) + goto cleanup; + if (virSecurityDeviceLabelDefParseXML(&src->seclabels, &src->nseclabels, ctxt, flags) < 0) goto cleanup; @@ -22892,6 +22925,9 @@ virDomainStorageSourceFormat(virBufferPtr attrBuf, virStorageEncryptionFormat(childBuf, src->encryption) < 0) return -1; + if (src->pr) + virStoragePRDefFormat(childBuf, src->pr); + return 0; } diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c67bce738..dd33895f9 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2782,6 +2782,9 @@ virStorageNetHostDefFree; virStorageNetHostTransportTypeFromString; virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; +virStoragePRDefFormat; +virStoragePRDefFree; +virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; virStorageSourceCopy; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 67b9ec71a..2b5b48f00 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1892,6 +1892,153 @@ virStorageAuthDefFormat(virBufferPtr buf, } +void +virStoragePRDefFree(virStoragePRDefPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + +static virStoragePRDefPtr +virStoragePRDefParseXML(xmlXPathContextPtr ctxt) +{ + virStoragePRDefPtr prd, ret = NULL; + char *enabled = NULL; + char *managed = NULL; + char *type = NULL; + char *path = NULL; + char *mode = NULL; + + if (VIR_ALLOC(prd) < 0) + return NULL; + + if (!(enabled = virXPathString("string(./@enabled)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @enabled attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->enabled = virTristateBoolTypeFromString(enabled)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'enabled': %s"), NULLSTR(enabled)); + goto cleanup; + } + + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + managed = virXPathString("string(./@managed)", ctxt); + if ((prd->managed = virTristateBoolTypeFromString(managed)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'managed': %s"), NULLSTR(managed)); + goto cleanup; + } + + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + type = virXPathString("string(./source[1]/@type)", ctxt); + path = virXPathString("string(./source[1]/@path)", ctxt); + mode = virXPathString("string(./source[1]/@mode)", ctxt); + + if (!type) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection type")); + goto cleanup; + } + + if (!path) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing path")); + goto cleanup; + } + + if (!mode) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection mode")); + goto cleanup; + } + + if (STRNEQ(type, "unix")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported type: %s"), type); + goto cleanup; + } + + if (STRNEQ(mode, "client")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported mode: %s"), mode); + goto cleanup; + } + + prd->path = path; + path = NULL; + + } + } + + ret = prd; + prd = NULL; + + cleanup: + VIR_FREE(mode); + VIR_FREE(path); + VIR_FREE(type); + VIR_FREE(managed); + VIR_FREE(enabled); + virStoragePRDefFree(prd); + return ret; +} + + +virStoragePRDefPtr +virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root) +{ + xmlXPathContextPtr ctxt = NULL; + virStoragePRDefPtr prd = NULL; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + ctxt->node = root; + prd = virStoragePRDefParseXML(ctxt); + + cleanup: + xmlXPathFreeContext(ctxt); + return prd; +} + + +void +virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd) +{ + virBufferAsprintf(buf, "<reservations enabled='%s'", + virTristateBoolTypeToString(prd->enabled)); + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + virBufferAsprintf(buf, " managed='%s'", + virTristateBoolTypeToString(prd->managed)); + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferAddLit(buf, "<source type='unix'"); + virBufferEscapeString(buf, " path='%s'", prd->path); + virBufferAddLit(buf, " mode='client'/>\n"); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</reservations>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } else { + virBufferAddLit(buf, "/>\n"); + } +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) @@ -2270,6 +2417,7 @@ virStorageSourceClear(virStorageSourcePtr def) virBitmapFree(def->features); VIR_FREE(def->compat); virStorageEncryptionFree(def->encryption); + virStoragePRDefFree(def->pr); virStorageSourceSeclabelsClear(def); virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 596746ccb..69fea34bc 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -216,6 +216,14 @@ struct _virStorageAuthDef { virSecretLookupTypeDef seclookupdef; }; +typedef struct _virStoragePRDef virStoragePRDef; +typedef virStoragePRDef *virStoragePRDefPtr; +struct _virStoragePRDef { + int enabled; /* enum virTristateBool */ + int managed; /* enum virTristateBool */ + char *path; +}; + typedef struct _virStorageDriverData virStorageDriverData; typedef virStorageDriverData *virStorageDriverDataPtr; @@ -243,6 +251,7 @@ struct _virStorageSource { bool authInherited; virStorageEncryptionPtr encryption; bool encryptionInherited; + virStoragePRDefPtr pr; virObjectPtr privateData; @@ -370,6 +379,12 @@ virStorageAuthDefPtr virStorageAuthDefParse(xmlNodePtr node, xmlXPathContextPtr ctxt); void virStorageAuthDefFormat(virBufferPtr buf, virStorageAuthDefPtr authdef); +void virStoragePRDefFree(virStoragePRDefPtr prd); +virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root); +void virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd); + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 100644 index 000000000..f4e4f1ee5 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1,41 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> + </reservations> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml new file mode 100644 index 000000000..92225a5b0 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1,39 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='yes'/> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 120000 index 000000000..f96d62cb6 --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml \ No newline at end of file diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml new file mode 120000 index 000000000..dde52fd1d --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 28ba46efb..670303898 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -387,6 +387,10 @@ mymain(void) QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-num_queues", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-cmd_per_lun", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-max_sectors", -- 2.16.1

On Wed, Mar 14, 2018 at 17:05:30 +0100, Michal Privoznik wrote:
This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this:
<reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations>
If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here:
https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 34 +---- docs/schemas/storagecommon.rng | 50 +++++++ src/conf/domain_conf.c | 36 +++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 148 +++++++++++++++++++++ src/util/virstoragefile.h | 15 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 41 ++++++ .../disk-virtio-scsi-reservations.xml | 39 ++++++ .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 12 files changed, 366 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
[skipping XML design since that was already agreed upon]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 86fc27511..00b729fc2 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5201,6 +5201,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } }
+ if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun disks")); + return -1; + } + /* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8586,6 +8593,29 @@ virDomainDiskSourcePrivateDataParse(xmlNodePtr node, }
+static int +virDomainDiskSourcePRParse(xmlNodePtr node, + virStoragePRDefPtr *prsrc) +{ + xmlNodePtr child; + virStoragePRDefPtr pr = NULL; + + for (child = node->children; child; child = child->next) { + if (child->type == XML_ELEMENT_NODE && + virXMLNodeNameEqual(child, "reservations")) { + + if (!(pr = virStoragePRDefParseNode(node->doc, child))) + return -1;
Use xpath instead of this ugly loop. I tried hard to remove it in my recent refactors of parsing the secret and encryption attributes.
+ + *prsrc = pr; + return 0; + } + } + + return 0; +} + + static int virDomainStorageSourceParse(xmlNodePtr node, xmlXPathContextPtr ctxt,
[...]
+static virStoragePRDefPtr +virStoragePRDefParseXML(xmlXPathContextPtr ctxt) +{ + virStoragePRDefPtr prd, ret = NULL; + char *enabled = NULL; + char *managed = NULL; + char *type = NULL; + char *path = NULL; + char *mode = NULL; + + if (VIR_ALLOC(prd) < 0) + return NULL; + + if (!(enabled = virXPathString("string(./@enabled)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @enabled attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->enabled = virTristateBoolTypeFromString(enabled)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'enabled': %s"), NULLSTR(enabled));
'enabled' can't be NULL here.
+ goto cleanup; + } + + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + managed = virXPathString("string(./@managed)", ctxt); + if ((prd->managed = virTristateBoolTypeFromString(managed)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'managed': %s"), NULLSTR(managed)); + goto cleanup;
So this will report "invalid value for 'managed': <null>" rather than that it is missing.
+ } + + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + type = virXPathString("string(./source[1]/@type)", ctxt); + path = virXPathString("string(./source[1]/@path)", ctxt); + mode = virXPathString("string(./source[1]/@mode)", ctxt); + + if (!type) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection type")); + goto cleanup; + } + + if (!path) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing path")); + goto cleanup; + } + + if (!mode) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection mode")); + goto cleanup;
All the above error messages don't really suggest which element is missing.
+ } + + if (STRNEQ(type, "unix")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported type: %s"), type); + goto cleanup; + } + + if (STRNEQ(mode, "client")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported mode: %s"), mode); + goto cleanup; + }
And these don't really say which value is wrong.
+ + prd->path = path; + path = NULL;
VIR_STEAL_PTR. Also 'mode' and type is not used after this. If you want to write it cleaner you could use a clever boolean xpath query to verify that both are present in the expected state.
+ + } + } + + ret = prd; + prd = NULL; + + cleanup: + VIR_FREE(mode); + VIR_FREE(path); + VIR_FREE(type); + VIR_FREE(managed); + VIR_FREE(enabled); + virStoragePRDefFree(prd); + return ret; +} + + +virStoragePRDefPtr +virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root) +{ + xmlXPathContextPtr ctxt = NULL; + virStoragePRDefPtr prd = NULL; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + }
I've removed exactly this kind of code in recent refactors. Please pass the global context into the parser and don't add this function again.
+ + ctxt->node = root; + prd = virStoragePRDefParseXML(ctxt); + + cleanup: + xmlXPathFreeContext(ctxt); + return prd; +} + + +void +virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd) +{ + virBufferAsprintf(buf, "<reservations enabled='%s'", + virTristateBoolTypeToString(prd->enabled)); + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + virBufferAsprintf(buf, " managed='%s'", + virTristateBoolTypeToString(prd->managed)); + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferAddLit(buf, "<source type='unix'"); + virBufferEscapeString(buf, " path='%s'", prd->path); + virBufferAddLit(buf, " mode='client'/>\n"); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</reservations>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } else { + virBufferAddLit(buf, "/>\n"); + } +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) @@ -2270,6 +2417,7 @@ virStorageSourceClear(virStorageSourcePtr def) virBitmapFree(def->features); VIR_FREE(def->compat); virStorageEncryptionFree(def->encryption); + virStoragePRDefFree(def->pr); virStorageSourceSeclabelsClear(def); virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 596746ccb..69fea34bc 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -216,6 +216,14 @@ struct _virStorageAuthDef { virSecretLookupTypeDef seclookupdef; };
+typedef struct _virStoragePRDef virStoragePRDef; +typedef virStoragePRDef *virStoragePRDefPtr; +struct _virStoragePRDef { + int enabled; /* enum virTristateBool */ + int managed; /* enum virTristateBool */ + char *path; +}; + typedef struct _virStorageDriverData virStorageDriverData; typedef virStorageDriverData *virStorageDriverDataPtr;
@@ -243,6 +251,7 @@ struct _virStorageSource { bool authInherited; virStorageEncryptionPtr encryption; bool encryptionInherited; + virStoragePRDefPtr pr;
virObjectPtr privateData;
@@ -370,6 +379,12 @@ virStorageAuthDefPtr virStorageAuthDefParse(xmlNodePtr node, xmlXPathContextPtr ctxt); void virStorageAuthDefFormat(virBufferPtr buf, virStorageAuthDefPtr authdef);
+void virStoragePRDefFree(virStoragePRDefPtr prd); +virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root); +void virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd); + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 100644 index 000000000..f4e4f1ee5 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1,41 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> + </reservations> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml new file mode 100644 index 000000000..92225a5b0 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1,39 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='yes'/> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain>
Looks like the two above XML files can be merged into one by adding two different disks.

Couple of reasons for that: a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 18 ++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 29 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index dd33895f9..9c32d6240 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2784,6 +2784,7 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEqual; virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2d108bec1..d8fd54493 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7744,6 +7744,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true); + if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations"); + return false; + } + #undef CHECK_EQ return true; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 2b5b48f00..e481dcd6d 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2039,6 +2039,24 @@ virStoragePRDefFormat(virBufferPtr buf, } +bool +virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b) +{ + if (!a && !b) + return true; + + if (!a || !b) + return false; + + if (a->enabled != b->enabled || + a->managed != b->managed || + STRNEQ_NULLABLE(a->path, b->path)) + return false; + + return true; +} + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 69fea34bc..a6fcf36d6 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -384,6 +384,8 @@ virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, xmlNodePtr root); void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); +bool virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, -- 2.16.1

The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 3eb5ed6d1..180485e63 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -459,6 +459,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "pl011", "machine.pseries.max-cpu-compat", "dump-completed", + "pr-manager-helper", ); diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c2ec2be19..8d315665a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -444,6 +444,7 @@ typedef enum { QEMU_CAPS_DEVICE_PL011, /* -device pl011 (not user-instantiable) */ QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT, /* -machine pseries,max-cpu-compat= */ QEMU_CAPS_DUMP_COMPLETED, /* DUMP_COMPLETED event */ + QEMU_CAPS_PR_MANAGER_HELPER, /* -object pr-manager-helper */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; -- 2.16.1

While we're not generating the command line just yet (look for the next commits), we can generate the alias for pr-manager. A domain can have up to one managed pr-manager (in which case socket path is decided by libvirt and pr-helper is spawned by libvirt too), but up to ndisks of unmanaged ones (one per disk basically). In case of the former we can generate a simple alias and be sure it'll not conflict. But in case of the latter to avoid any conflicts let's generate alias that's based on something unique - like disk target. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 ++ src/qemu/qemu_domain.c | 87 +++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_domain.h | 10 ++++++ src/util/virstoragefile.c | 15 ++++++++ src/util/virstoragefile.h | 2 ++ 5 files changed, 113 insertions(+), 3 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 9c32d6240..777de47ae 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2784,7 +2784,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d8fd54493..c17e40dc8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -982,6 +982,18 @@ qemuDomainSecretInfoFree(qemuDomainSecretInfoPtr *secinfo) } +static void +qemuDomainDiskPRDFree(qemuDomainDiskPRDPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->alias); + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + static virClassPtr qemuDomainDiskPrivateClass; static void qemuDomainDiskPrivateDispose(void *obj); @@ -1062,6 +1074,7 @@ qemuDomainStorageSourcePrivateDispose(void *obj) qemuDomainSecretInfoFree(&priv->secinfo); qemuDomainSecretInfoFree(&priv->encinfo); + qemuDomainDiskPRDFree(priv->prd); } @@ -1473,9 +1486,6 @@ qemuDomainSecretStorageSourcePrepare(qemuDomainObjPrivatePtr priv, if (!hasAuth && !hasEnc) return 0; - if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) - return -1; - srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); if (hasAuth) { @@ -11802,17 +11812,88 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, } +static int +qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, + virDomainDiskDefPtr disk) +{ + qemuDomainStorageSourcePrivatePtr srcPriv; + virStoragePRDefPtr prd = disk->src->pr; + char *prAlias = NULL; + char *prPath = NULL; + int ret = -1; + + if (!virStoragePRDefIsEnabled(prd)) + return 0; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations not supported with this QEMU binary")); + return ret; + } + + if (!virStorageSourceIsLocalStorage(disk->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations supported only for local storage")); + return ret; + } + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + /* Managed PR means there's one pr-manager object per domain + * and the pr-helper process is spawned and managed by + * libvirt. + * If PR is unmanaged there's one pr-manager object per disk + * (in general each disk can have its own pr-manager) and + * it's user's responsibility to start the pr-helper process. + */ + if (virStoragePRDefIsManaged(prd)) { + /* Generate PR helper socket path & alias that are same + * for each disk in the domain. */ + + if (VIR_STRDUP(prAlias, "pr-helper0") < 0) + return ret; + + if (virAsprintf(&prPath, "%s/pr-helper0.sock", priv->libDir) < 0) + goto cleanup; + + } else { + if (virAsprintf(&prAlias, "pr-helper-%s", disk->info.alias) < 0) + return ret; + + if (VIR_STRDUP(prPath, prd->path) < 0) + goto cleanup; + } + + if (VIR_ALLOC(srcPriv->prd) < 0) + goto cleanup; + VIR_STEAL_PTR(srcPriv->prd->alias, prAlias); + VIR_STEAL_PTR(srcPriv->prd->path, prPath); + + ret = 0; + cleanup: + VIR_FREE(prPath); + VIR_FREE(prAlias); + return ret; +} + + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg) { + if (!(disk->src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + if (qemuDomainPrepareDiskSourceTLS(disk->src, cfg) < 0) return -1; if (qemuDomainSecretDiskPrepare(priv, disk) < 0) return -1; + if (qemuDomainPrepareDiskPRD(priv, disk) < 0) + return -1; + if (disk->src->type == VIR_STORAGE_TYPE_NETWORK && disk->src->protocol == VIR_STORAGE_NET_PROTOCOL_GLUSTER && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_GLUSTER_DEBUG_LEVEL)) { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 6d3e6eb5e..b9258eb8e 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -373,6 +373,13 @@ struct _qemuDomainDiskPrivate { bool removable; /* device media can be removed/changed */ }; +typedef struct _qemuDomainDiskPRD qemuDomainDiskPRD; +typedef qemuDomainDiskPRD *qemuDomainDiskPRDPtr; +struct _qemuDomainDiskPRD { + char *alias; + char *path; /* socket path. */ +}; + # define QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src) \ ((qemuDomainStorageSourcePrivatePtr) (src)->privateData) @@ -386,6 +393,9 @@ struct _qemuDomainStorageSourcePrivate { /* data required for decryption of encrypted storage source */ qemuDomainSecretInfoPtr encinfo; + + /* data required for persistent reservations */ + qemuDomainDiskPRDPtr prd; }; virObjectPtr qemuDomainStorageSourcePrivateNew(void); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index e481dcd6d..6bcbddf32 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2057,6 +2057,21 @@ virStoragePRDefIsEqual(virStoragePRDefPtr a, return true; } + +bool +virStoragePRDefIsEnabled(virStoragePRDefPtr prd) +{ + return prd && prd->enabled == VIR_TRISTATE_BOOL_YES; +} + + +bool +virStoragePRDefIsManaged(virStoragePRDefPtr prd) +{ + return prd && prd->managed == VIR_TRISTATE_BOOL_YES; +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index a6fcf36d6..88bfb9d2f 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -386,6 +386,8 @@ void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); bool virStoragePRDefIsEqual(virStoragePRDefPtr a, virStoragePRDefPtr b); +bool virStoragePRDefIsEnabled(virStoragePRDefPtr prd); +bool virStoragePRDefIsManaged(virStoragePRDefPtr prd); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, -- 2.16.1

Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 108 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 86 insertions(+), 22 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c17e40dc8..d673ddbb2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1961,28 +1961,6 @@ qemuDomainObjPrivateFree(void *data) } -static int -qemuStorageSourcePrivateDataParse(xmlXPathContextPtr ctxt, - virStorageSourcePtr src) -{ - if (virStorageSourcePrivateDataParseRelPath(ctxt, src) < 0) - return -1; - - return 0; -} - - -static int -qemuStorageSourcePrivateDataFormat(virStorageSourcePtr src, - virBufferPtr buf) -{ - if (virStorageSourcePrivateDataFormatRelPath(src, buf) < 0) - return -1; - - return 0; -} - - static void qemuDomainObjPrivateXMLFormatVcpus(virBufferPtr buf, virDomainDefPtr def) @@ -2590,6 +2568,92 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, } +static int +qemuStorageSourcePrivateDataParsePR(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = NULL; + int rc; + int ret = -1; + + if ((rc = virXPathBoolean("boolean(./prd)", ctxt)) == 0) { + return 0; + } else if (rc < 0) { + return ret; + } + + if (VIR_ALLOC(prd) < 0) + return ret; + + if (!(prd->alias = virXPathString("string(./prd/alias)", ctxt)) || + !(prd->path = virXPathString("string(./prd/path)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed <prd/>")); + goto cleanup; + } + + VIR_STEAL_PTR(srcPriv->prd, prd); + ret = 0; + cleanup: + qemuDomainDiskPRDFree(prd); + return ret; +} + + +static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd; + + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</prd>\n"); + return 0; +} + + +static int +qemuStorageSourcePrivateDataParse(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + + if (virStorageSourcePrivateDataParseRelPath(ctxt, src) < 0) + return -1; + + if (qemuStorageSourcePrivateDataParsePR(ctxt, src) < 0) + return -1; + + return 0; +} + + +static int +qemuStorageSourcePrivateDataFormat(virStorageSourcePtr src, + virBufferPtr buf) +{ + if (virStorageSourcePrivateDataFormatRelPath(src, buf) < 0) + return -1; + + if (qemuStorageSourcePrivateDataFormatPR(src, buf) < 0) + return -1; + + return 0; +} + + virDomainXMLPrivateDataCallbacks virQEMUDriverPrivateDataCallbacks = { .alloc = qemuDomainObjPrivateAlloc, .free = qemuDomainObjPrivateFree, -- 2.16.1

On Wed, Mar 14, 2018 at 17:05:34 +0100, Michal Privoznik wrote:
Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 108 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 86 insertions(+), 22 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c17e40dc8..d673ddbb2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1961,28 +1961,6 @@ qemuDomainObjPrivateFree(void *data) }
-static int -qemuStorageSourcePrivateDataParse(xmlXPathContextPtr ctxt, - virStorageSourcePtr src) -{ - if (virStorageSourcePrivateDataParseRelPath(ctxt, src) < 0) - return -1; - - return 0; -} - - -static int -qemuStorageSourcePrivateDataFormat(virStorageSourcePtr src, - virBufferPtr buf) -{ - if (virStorageSourcePrivateDataFormatRelPath(src, buf) < 0) - return -1; - - return 0; -}
Please don't move these out of here. I've specifically placed them here since I'll need them way earlier than your patch puts them.

On Wed, Mar 14, 2018 at 17:05:34 +0100, Michal Privoznik wrote:
Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 108 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 86 insertions(+), 22 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c17e40dc8..d673ddbb2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c
[...]
+static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd; + + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path);
'path' is user-controlled, so you need to use the 'virBufferEscape' function.
+ virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</prd>\n"); + return 0; +} +

This is the easier part. All we need to do here is put -object pr-manager-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manager=$alias. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 48 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 29 +++++++++++++ .../disk-virtio-scsi-reservations.args | 29 +++++++++++++ tests/qemuxml2argvtest.c | 8 ++++ 4 files changed, 114 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..d1bef0d00 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) } +static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,36 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, } +static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + bool managedAdded = false; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + if (virStoragePRDefIsManaged(disk->src->pr)) { + if (managedAdded) + continue; + + managedAdded = true; + } + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf); + } +} + + /** * qemuBuildCommandLineValidate: * @@ -9941,6 +9987,8 @@ qemuBuildCommandLine(virQEMUDriverPtr driver, if (qemuBuildMasterKeyCommandLine(cmd, priv) < 0) goto error; + qemuBuildMasterPRCommandLine(cmd, def); + if (enableFips) virCommandAddArg(cmd, "-enable-fips"); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args new file mode 100644 index 000000000..fea4c6d21 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper-scsi0-0-0-0,\ +path=/path/to/qemu-pr-helper.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper-scsi0-0-0-0,\ +format=raw,if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args new file mode 100644 index 000000000..2dc53cf05 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper0,\ +path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper0,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 731db9ed5..ecf5997cb 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -2998,6 +2998,14 @@ mymain(void) QEMU_CAPS_HDA_DUPLEX); DO_TEST("user-aliases2", QEMU_CAPS_DEVICE_IOH3420, QEMU_CAPS_ICH9_AHCI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + /* Test disks with format probing enabled for legacy reasons. * New tests should not go in this section. */ driver.config->allowDiskFormatProbing = true; -- 2.16.1

On Wed, Mar 14, 2018 at 17:05:35 +0100, Michal Privoznik wrote:
This is the easier part. All we need to do here is put -object pr-manager-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manager=$alias.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 48 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 29 +++++++++++++ .../disk-virtio-scsi-reservations.args | 29 +++++++++++++ tests/qemuxml2argvtest.c | 8 ++++ 4 files changed, 114 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..d1bef0d00 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c
[...]
@@ -9789,6 +9805,36 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, }
+static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + bool managedAdded = false; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + if (virStoragePRDefIsManaged(disk->src->pr)) { + if (managedAdded) + continue; + + managedAdded = true; + } + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path);
We usually generate this from JSON string that will be necessary for the hotplug step. The reason is that if it's ever necessary to add additional arguments for the object, you don't need to fix two places.
+ virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf); + } +} + + /** * qemuBuildCommandLineValidate: *

Just like we allow users overriding path to bridge-helper detected at compile time we can allow them to override path to qemu-pr-helper. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- m4/virt-driver-qemu.m4 | 5 +++++ src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 ++++ src/qemu/qemu_conf.c | 7 ++++++- src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 6 files changed, 18 insertions(+), 1 deletion(-) diff --git a/m4/virt-driver-qemu.m4 b/m4/virt-driver-qemu.m4 index b9bafdab9..0e0f261d6 100644 --- a/m4/virt-driver-qemu.m4 +++ b/m4/virt-driver-qemu.m4 @@ -57,6 +57,11 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_QEMU], [ [/usr/libexec:/usr/lib/qemu:/usr/lib]) AC_DEFINE_UNQUOTED([QEMU_BRIDGE_HELPER], ["$QEMU_BRIDGE_HELPER"], [QEMU bridge helper]) + AC_PATH_PROG([QEMU_PR_HELPER], [qemu-pr-helper], + [/usr/libexec/qemu-pr-helper], + [/usr/libexec:/usr/lib/qemu:/usr/lib]) + AC_DEFINE_UNQUOTED([QEMU_PR_HELPER], ["$QEMU_PR_HELPER"], + [QEMU PR helper]) ]) AC_DEFUN([LIBVIRT_DRIVER_RESULT_QEMU], [ diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index c19bf3a43..2dc16e91f 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -86,6 +86,7 @@ module Libvirtd_qemu = let process_entry = str_entry "hugetlbfs_mount" | bool_entry "clear_emulator_capabilities" | str_entry "bridge_helper" + | str_entry "pr_helper" | bool_entry "set_process_name" | int_entry "max_processes" | int_entry "max_files" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 43dd561cc..4bc668406 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -775,3 +775,7 @@ # This directory is used for memoryBacking source if configured as file. # NOTE: big files will be stored here #memory_backing_dir = "/var/lib/libvirt/qemu/ram" + +# Path to the SCSI persistent reservations helper. This helper is +# used whenever <reservations/> are enabled for SCSI disks. +#pr_helper = "/usr/libexec/qemu-pr-helper" diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 36cf3a281..8c69dbe75 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -307,7 +307,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) goto error; } - if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0) + if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0 || + VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0) goto error; cfg->clearEmulatorCapabilities = true; @@ -392,6 +393,7 @@ static void virQEMUDriverConfigDispose(void *obj) } VIR_FREE(cfg->hugetlbfs); VIR_FREE(cfg->bridgeHelperName); + VIR_FREE(cfg->prHelperName); VIR_FREE(cfg->saveImageFormat); VIR_FREE(cfg->dumpImageFormat); @@ -759,6 +761,9 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, if (virConfGetValueString(conf, "bridge_helper", &cfg->bridgeHelperName) < 0) goto cleanup; + if (virConfGetValueString(conf, "pr_helper", &cfg->prHelperName) < 0) + goto cleanup; + if (virConfGetValueBool(conf, "mac_filter", &cfg->macFilter) < 0) goto cleanup; diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index e1ad5463f..7a63780c4 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -153,6 +153,7 @@ struct _virQEMUDriverConfig { size_t nhugetlbfs; char *bridgeHelperName; + char *prHelperName; bool macFilter; diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index 688e5b9fd..c0efae47b 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -100,3 +100,4 @@ module Test_libvirtd_qemu = { "1" = "mount" } } { "memory_backing_dir" = "/var/lib/libvirt/qemu/ram" } +{ "pr_helper" = "/usr/libexec/qemu-pr-helper" } -- 2.16.1

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 16 ++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ 2 files changed, 18 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d673ddbb2..3e05cf0f8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1864,6 +1864,7 @@ qemuDomainObjPrivateAlloc(void *opaque) priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; priv->driver = opaque; + priv->prPid = (pid_t) -1; return priv; @@ -1926,6 +1927,8 @@ qemuDomainObjPrivateDataClear(qemuDomainObjPrivatePtr priv) virBitmapFree(priv->migrationCaps); priv->migrationCaps = NULL; + + priv->prPid = (pid_t) -1; } @@ -2182,6 +2185,11 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1; + if (priv->prPid != (pid_t) -1) { + virBufferAsprintf(buf, "<prPid>%lld</prPid>\n", + (long long) priv->prPid); + } + return 0; } @@ -2390,6 +2398,7 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, xmlNodePtr *nodes = NULL; xmlNodePtr node = NULL; virQEMUCapsPtr qemuCaps = NULL; + long long prPid = -1; if (VIR_ALLOC(priv->monConfig) < 0) goto error; @@ -2552,6 +2561,13 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error; + if (virXPathLongLong("string(./prPid)", ctxt, &prPid) < -1) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse <prPid/>")); + goto error; + } + priv->prPid = (pid_t) prPid; + return 0; error: diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b9258eb8e..418b47153 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; + + pid_t prPid; }; # define QEMU_DOMAIN_PRIVATE(vm) \ -- 2.16.1

Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57c06c7c1..b876d293a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,169 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, ret = 0; cleanup: virObjectUnref(caps); + + return ret; +} + + +static void +qemuProcessKillPRDaemon(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->prPid == (pid_t) -1) + return; + + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; +} + + +static int +qemuProcessSetupOnePRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + /* The only caveat there is that we should place the process + * into the same namespace and cgroup as qemu (so that it + * shares the same view of the system). */ + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup; + + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (pidfile) { + if (unlink(pidfile) < 0 && + errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile); + } + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessSetupPRDaemon(virDomainObjPtr vm) +{ + size_t i; + int ret = -1; + + for (i = 0; i < vm->def->ndisks; i++) { + if (qemuProcessSetupOnePRDaemon(vm, vm->def->disks[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + if (ret < 0) + qemuProcessKillPRDaemon(vm); return ret; } @@ -6065,6 +6228,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup; + VIR_DEBUG("Setting up PR daemon"); + if (qemuProcessSetupPRDaemon(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6645,6 +6812,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); } + qemuProcessKillPRDaemon(vm); + qemuHostdevReAttachDomainDevices(driver, vm->def); def = vm->def; -- 2.16.1

On Wed, Mar 14, 2018 at 17:05:38 +0100, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57c06c7c1..b876d293a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,169 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, +static int +qemuProcessSetupOnePRDaemonHook(void *opaque)
The naming does not make much sense here. There's only one managed daemon.
+{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0;
This function should ever be called only for the managed PR daemon and only once per domain. This looks wrong and makes the caller look that multiple daemons are setup. Please rename it and move this logic to the caller, which determines whether the managed daemon is necessary and calls this exactly once if it is.
+ + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + /* The only caveat there is that we should place the process + * into the same namespace and cgroup as qemu (so that it + * shares the same view of the system). */ + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + }
This code does not do anything on timeout.
+ + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup;
So this function will probably fail with a weird error message.
+ + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (pidfile) { + if (unlink(pidfile) < 0 && + errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile);
By removing the pidfile, you can't make sure later that the PR helper program is still the same process, thus when killing the VM by the pid number only you might actually kill a different process. I unfortunately already deleted the top of the message so you'll need to fix the function qemuProcessKillPRDaemon to see whether it's the same process, similarly to what we do when reconnecting to qemu.
+ } + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessSetupPRDaemon(virDomainObjPtr vm) +{ + size_t i; + int ret = -1; + + for (i = 0; i < vm->def->ndisks; i++) { + if (qemuProcessSetupOnePRDaemon(vm, vm->def->disks[i]) < 0) + goto cleanup;
This looks wrong, since we only do one PR daemon. This function should determine whether one is needed and call that function only once.
+ } + + ret = 0; + cleanup: + if (ret < 0) + qemuProcessKillPRDaemon(vm); return ret; }

On 03/15/2018 01:31 PM, Peter Krempa wrote:
On Wed, Mar 14, 2018 at 17:05:38 +0100, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+)
+ + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (pidfile) { + if (unlink(pidfile) < 0 && + errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile);
By removing the pidfile, you can't make sure later that the PR helper program is still the same process, thus when killing the VM by the pid number only you might actually kill a different process.
I unfortunately already deleted the top of the message so you'll need to fix the function qemuProcessKillPRDaemon to see whether it's the same process, similarly to what we do when reconnecting to qemu.
Okay, let me write here what I told you in person. I've tried couple of approaches to do proper locking of pidfile: a) run qemu-pr-helper -f $pidfile in combination with virPidFileForceCleanupPath(). This doesn't work because qemu-pr-helper uses lockf() to lock the pidfile. Libvirt uses F_SETLK. These two locks don't know about each other. Sure, we can adapt virPid* to do lockf()-ing. But we will fail miserably if qemu ever decides to change that to say fnctl(). b) I've changed qemu-pr-helper to use fnctl() to lock pidfile. They have a nice wrapper over it - qemu_lock_fd(). This still did not fly. Their wrapper prefers F_OFD_SETLK over F_SETLK. OFD stands for Open File Description Locks. The lock is not associated with process but with FD and thus survives fork(). Again, incompatible locks. If file is locked with F_OFD_SETLK another process can F_SETLK it successfully. c) My third attempt was to not rely on qemu-pr-helper locking at all (after all they can change it anytime which could break our tailored solution). So I've modified qemuProcessStartPRDaemonHook() so it calls virPidFileAcquirePath() and leaks the pidfile FD to qemu-pr-helper. This looked promising: command hooks are called just before exec(), after all FDs are closed. Unfortunately, virPidFileAcquirePath() sets FD_CLOEXEC flag, so no leaking is possible. Sure I can call virSetInherit() to cancel the CLOEXEC flag. BUT, here the proposal: IIUC the only problem we have with storing pid and killing it later is that we might end up killing a different process than qemu-pr-helper because qemu-pr-helper may die and another process might be spawn with its PID. Well, this would be solved with the events I'm mentioning in cover letter. The idea is that qemu sends us an event as soon as qemu-pr-helper process dies. So we would spawn a new one and update the stored PID. This way we would never kill wrong process. BTW: don't look into qemuProcessReconnect. It suffers from the same problem. If libvirtd is stopped, qemu quits and another process is spawned with its PID, we kill the new process as soon as we try to reconnect to the monitor, because connecting to the monitor fails (obviously) so we jump onto error label where qemuProcessStop() is called. Worse, if there's OOM situation and we fail to asprintf() pidfile path we jump onto the label too (and massacre the process we think is qemu). My proposal is to: 1) keep tracking of PID as-is now, 2) forbid starting domains with <reservations managed='yes'/> until the events are implemented. Alternatively, I can go with c) (in the end I have it working locally) and do all the changes necessary. Frankly, I don't have preference. I also wonder if API design for events and query-* command for reconnect might makes us lean towards one of the options. Michal

On Thu, Mar 22, 2018 at 11:16:47AM +0100, Michal Privoznik wrote:
On 03/15/2018 01:31 PM, Peter Krempa wrote:
On Wed, Mar 14, 2018 at 17:05:38 +0100, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+)
+ + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (pidfile) { + if (unlink(pidfile) < 0 && + errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile);
By removing the pidfile, you can't make sure later that the PR helper program is still the same process, thus when killing the VM by the pid number only you might actually kill a different process.
I unfortunately already deleted the top of the message so you'll need to fix the function qemuProcessKillPRDaemon to see whether it's the same process, similarly to what we do when reconnecting to qemu.
In general when pidfiles are locked using lockf()/fcntl(), you should never unlink them unless you have first acquired the lock, and even then the code that acquires the lock needs to be made robust against races. a race condition Our virPidFileAcquirePath copes with race'ing unlinks, but QEMU's code does not. We should fix QEMU in this respect too.
Okay, let me write here what I told you in person. I've tried couple of approaches to do proper locking of pidfile:
a) run qemu-pr-helper -f $pidfile in combination with virPidFileForceCleanupPath(). This doesn't work because qemu-pr-helper uses lockf() to lock the pidfile. Libvirt uses F_SETLK. These two locks don't know about each other. Sure, we can adapt virPid* to do lockf()-ing. But we will fail miserably if qemu ever decides to change that to say fnctl().
I don't think that is correct - lockf() is just a thing wrapper around fnctl() F_SETLK from what I understand. Many systems do this equivalance, although technically POSIX leaves it undefined.
b) I've changed qemu-pr-helper to use fnctl() to lock pidfile. They have a nice wrapper over it - qemu_lock_fd(). This still did not fly. Their wrapper prefers F_OFD_SETLK over F_SETLK. OFD stands for Open File Description Locks. The lock is not associated with process but with FD and thus survives fork(). Again, incompatible locks. If file is locked with F_OFD_SETLK another process can F_SETLK it successfully.
QEMU does the OFD_SETLK/SETLK magic because its APIs were primarily design for use with the block layer, where it must have sane semantics to prevent accidental loss of the lock. The limitations of F_SETLK are not really an issue for pidfiles which are opened once and never closed. So it would be valid to change QEMU to use fcntl(F_SETLK) exclusively for pidfiles. This would be semantically compatible with lockf() on most systems, and get QEMU away from under-specified semantics of lockf().
c) My third attempt was to not rely on qemu-pr-helper locking at all (after all they can change it anytime which could break our tailored solution). So I've modified qemuProcessStartPRDaemonHook() so it calls virPidFileAcquirePath() and leaks the pidfile FD to qemu-pr-helper. This looked promising: command hooks are called just before exec(), after all FDs are closed. Unfortunately, virPidFileAcquirePath() sets FD_CLOEXEC flag, so no leaking is possible. Sure I can call virSetInherit() to cancel the CLOEXEC flag. BUT, here the proposal:
IIUC the only problem we have with storing pid and killing it later is that we might end up killing a different process than qemu-pr-helper because qemu-pr-helper may die and another process might be spawn with its PID. Well, this would be solved with the events I'm mentioning in cover letter. The idea is that qemu sends us an event as soon as qemu-pr-helper process dies. So we would spawn a new one and update the stored PID. This way we would never kill wrong process.
BTW: don't look into qemuProcessReconnect. It suffers from the same problem. If libvirtd is stopped, qemu quits and another process is spawned with its PID, we kill the new process as soon as we try to reconnect to the monitor, because connecting to the monitor fails (obviously) so we jump onto error label where qemuProcessStop() is called. Worse, if there's OOM situation and we fail to asprintf() pidfile path we jump onto the label too (and massacre the process we think is qemu).
My proposal is to: 1) keep tracking of PID as-is now, 2) forbid starting domains with <reservations managed='yes'/> until the events are implemented.
Alternatively, I can go with c) (in the end I have it working locally) and do all the changes necessary. Frankly, I don't have preference. I also wonder if API design for events and query-* command for reconnect might makes us lean towards one of the options.
I would suggest doing both a) and b). Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index e0a5300f0..8cc0b631d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } +static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; + + if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; + + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup; + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm, + const char *alias, + const char *path) +{ + if (!alias) + return; + + qemuProcessKillPRDaemon(vm, path, false); +} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error; + if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error; @@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; } + if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } + if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, cleanup: virJSONValueFree(secobjProps); virJSONValueFree(encobjProps); + virJSONValueFree(prmgrProps); qemuDomainSecretDiskDestroy(disk); VIR_FREE(devstr); VIR_FREE(drivestr); @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + qemuDestroyPRDefObject(vm, prAlias, prPath); goto cleanup; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b876d293a..cb160727c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, } -static void -qemuProcessKillPRDaemon(virDomainObjPtr vm) +void +qemuProcessKillPRDaemon(virDomainObjPtr vm, + const char *socketPath, + bool force) { qemuDomainObjPrivatePtr priv = vm->privateData; + size_t nmanaged = 0; + size_t i; if (priv->prPid == (pid_t) -1) return; - virProcessKillPainfully(priv->prPid, true); - priv->prPid = (pid_t) -1; + for (i = 0; i < vm->def->ndisks; i++) { + qemuDomainStorageSourcePrivatePtr srcPriv; + virDomainDiskDefPtr disk = vm->def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + if (!socketPath) + socketPath = srcPriv->prd->path; + } + + if (force || nmanaged <= 1) { + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; + if (socketPath && + unlink(socketPath) < 0 && + errno != ENOENT) + VIR_WARN("Unable to remove pr helper socket %s", socketPath); + } } @@ -2593,7 +2617,7 @@ qemuProcessSetupOnePRDaemonHook(void *opaque) } -static int +int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, virDomainDiskDefPtr disk) { @@ -2713,7 +2737,7 @@ qemuProcessSetupPRDaemon(virDomainObjPtr vm) ret = 0; cleanup: if (ret < 0) - qemuProcessKillPRDaemon(vm); + qemuProcessKillPRDaemon(vm, NULL, true); return ret; } @@ -6812,7 +6836,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); } - qemuProcessKillPRDaemon(vm); + qemuProcessKillPRDaemon(vm, NULL, true); qemuHostdevReAttachDomainDevices(driver, vm->def); diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 274111567..fab45eb2d 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -203,4 +203,11 @@ int qemuProcessRefreshDisks(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); +int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk); + +void qemuProcessKillPRDaemon(virDomainObjPtr vm, + const char *socketPath, + bool force); + #endif /* __QEMU_PROCESS_H__ */ -- 2.16.1

On Wed, Mar 14, 2018 at 17:05:39 +0100, Michal Privoznik wrote:
Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index e0a5300f0..8cc0b631d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, }
+static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath)
It looks like this function is doing a bit too much for the name. It's not only creating the JSON props of the object but setting up the daemon. That looks like should be separated.
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0;
So, this will exit early if you have the managed daemon, so you will not add the object for the unmanaged option if hotplugging? As said above, this function does too much. It should be split and then you will not have these problems.
+ if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; + + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup; + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm, + const char *alias, + const char *path) +{ + if (!alias) + return; + + qemuProcessKillPRDaemon(vm, path, false);
This does not make much sense. You only ever need to kill the managed one, so passing random path is weird. This should be called only for the managed daemon.
+} + + /** * qemuDomainAttachDiskGeneric: *
[...]
@@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error;
+ if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error;
@@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; }
+ if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } + if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true;
[...]
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b876d293a..cb160727c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
-static void -qemuProcessKillPRDaemon(virDomainObjPtr vm) +void +qemuProcessKillPRDaemon(virDomainObjPtr vm, + const char *socketPath,
socketPath seems useless here ...
+ bool force) { qemuDomainObjPrivatePtr priv = vm->privateData; + size_t nmanaged = 0; + size_t i;
if (priv->prPid == (pid_t) -1) return;
- virProcessKillPainfully(priv->prPid, true); - priv->prPid = (pid_t) -1; + for (i = 0; i < vm->def->ndisks; i++) { + qemuDomainStorageSourcePrivatePtr srcPriv; + virDomainDiskDefPtr disk = vm->def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + if (!socketPath) + socketPath = srcPriv->prd->path; + } + + if (force || nmanaged <= 1) { + virProcessKillPainfully(priv->prPid, true);
since this will be true only for the managed daemons, thus you'll have the socket path from the definition.
+ priv->prPid = (pid_t) -1; + if (socketPath && + unlink(socketPath) < 0 && + errno != ENOENT) + VIR_WARN("Unable to remove pr helper socket %s", socketPath); + } }

Just line in previous commit, if we are the last ones that are using the pr-manager delete it and also kill the pr-helper process. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 19 +++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_hotplug.c | 22 ++++++++++++++++++++++ 3 files changed, 44 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 3e05cf0f8..505e81784 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11892,6 +11892,25 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, } +size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def) +{ + size_t i; + size_t nmanaged = 0; + + for (i = 0; i < def->ndisks; i++) { + virDomainDiskDefPtr disk = def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + } + + return nmanaged; +} + + static int qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, virDomainDiskDefPtr disk) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 418b47153..c2f67f379 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -1007,6 +1007,9 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); +size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def); + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8cc0b631d..6fe23f618 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -400,6 +400,13 @@ qemuDestroyPRDefObject(virDomainObjPtr vm, } +static bool +qemuDomainDiskNeedRemovePR(virDomainObjPtr vm) +{ + return qemuDomainGetPRDManagedCount(vm->def) == 1; +} + + /** * qemuDomainAttachDiskGeneric: * @@ -3797,6 +3804,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, char *drivestr; char *objAlias = NULL; char *encAlias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; VIR_DEBUG("Removing disk %s from domain %p %s", disk->info.alias, vm, vm->def->name); @@ -3834,6 +3843,14 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + if (qemuDomainDiskNeedRemovePR(vm)) { + qemuDomainStorageSourcePrivatePtr srcPriv; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prAlias = srcPriv->prd->alias; + prPath = srcPriv->prd->path; + } + qemuDomainObjEnterMonitor(driver, vm); qemuMonitorDriveDel(priv->mon, drivestr); @@ -3849,6 +3866,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, encAlias)); VIR_FREE(encAlias); + if (prAlias) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); + if (disk->src->haveTLS) ignore_value(qemuMonitorDelObject(priv->mon, disk->src->tlsAlias)); @@ -3867,6 +3887,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + qemuDestroyPRDefObject(vm, prAlias, prPath); + qemuDomainReleaseDeviceAddress(vm, &disk->info, src); if (qemuSecurityRestoreDiskLabel(driver, vm, disk) < 0) -- 2.16.1

The capability tracks if qemu has pr-manager-helper object. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml | 1 + 2 files changed, 2 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 180485e63..ec2871bbd 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1695,6 +1695,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "sclplmconsole", QEMU_CAPS_DEVICE_SCLPLMCONSOLE }, { "isa-serial", QEMU_CAPS_DEVICE_ISA_SERIAL }, { "pl011", QEMU_CAPS_DEVICE_PL011 }, + { "pr-manager-helper", QEMU_CAPS_PR_MANAGER_HELPER }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBalloon[] = { diff --git a/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml b/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml index 17abac910..5c1679271 100644 --- a/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_2.11.0.s390x.xml @@ -146,6 +146,7 @@ <flag name='disk-share-rw'/> <flag name='iscsi.password-secret'/> <flag name='dump-completed'/> + <flag name='pr-manager-helper'/> <version>2011000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>340897</microcodeVersion> -- 2.16.1
participants (3)
-
Daniel P. Berrangé
-
Michal Privoznik
-
Peter Krempa