[libvirt] [PATCH v2 00/12] Basic implementation of persistent reservations

v2 of: https://www.redhat.com/archives/libvir-list/2018-January/msg00584.html diff to v1: - Dropped the hash table and stayed with simple pid_t in vm->privateData - Couple of small fixes (basically to address Peter's review) What is still missing: - Event from qemu when the helper process dies (in my testing, qemu process dies as soon as it tries to do PR). - <hostdev/> support, which surprisingly works even without these patches. I mean, if I add the following hostdev the guest is already able to do PR even with plain origin/master: <hostdev mode='subsystem' type='scsi' managed='no' rawio='no'> <source> <adapter name='scsi_host7'/> <address bus='0' target='0' unit='0'/> </source> <address type='drive' controller='0' bus='0' target='0' unit='4'/> </hostdev> where host7 is an iSCSI device: /sys/bus/scsi/devices/7\:0\:0\:0/block/sdd/ and the part of generated cmd line: -drive file=/dev/sg3,if=none,id=drive-hostdev0 \ -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=0,lun=4,drive=drive-hostdev0,id=hostdev0 However, the qemu is running under my regular user nor it has any special capabilities: # grep -e "^Name" -e "^Uid" -e "^Gid" -e "^Cap" /proc/$(pgrep qemu)/status Name: qemu-system-x86 Uid: 1000 1000 1000 1000 Gid: 1000 1000 1000 1000 CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 CapBnd: 0000003fffffffff CapAmb: 0000000000000000 As usual, you can find the patches on my github too: https://github.com/zippy2/libvirt/tree/pr_v2 Michal Privoznik (12): virstoragefile: Introduce virStoragePRDef qemuDomainDiskChangeSupported: Deny changing reservations qemu: Introduce pr-manager-helper capability qemu: Generate alias and socket path for pr-helper qemu: Store pr runtime data in status XML qemu: Generate cmd line at startup qemu: Introduce pr_helper to qemu.conf qemu_domain: Track pr-helper PID in status XML qemu: Start PR daemons on domain startup qemu_hotplug: Hotplug of reservations qemu_hotplug: Hotunplug of reservations qemu: Detect pr-manager-helper capability docs/formatdomain.html.in | 25 ++- docs/schemas/domaincommon.rng | 34 +--- docs/schemas/storagecommon.rng | 50 +++++ m4/virt-driver-qemu.m4 | 5 + src/conf/domain_conf.c | 36 ++++ src/libvirt_private.syms | 6 + src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 40 ++++ src/qemu/qemu_conf.c | 7 +- src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 213 ++++++++++++++++++++- src/qemu/qemu_domain.h | 15 ++ src/qemu/qemu_hotplug.c | 94 +++++++++ src/qemu/qemu_process.c | 188 ++++++++++++++++++ src/qemu/qemu_process.h | 7 + src/qemu/test_libvirtd_qemu.aug.in | 1 + src/util/virstoragefile.c | 181 +++++++++++++++++ src/util/virstoragefile.h | 19 ++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 40 ++++ .../disk-virtio-scsi-reservations.args | 29 +++ .../disk-virtio-scsi-reservations.xml | 38 ++++ tests/qemuxml2argvtest.c | 8 + .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 29 files changed, 1042 insertions(+), 37 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml -- 2.16.1

This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this: <reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations> If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here: https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 34 +---- docs/schemas/storagecommon.rng | 50 +++++++ src/conf/domain_conf.c | 36 +++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 148 +++++++++++++++++++++ src/util/virstoragefile.h | 15 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 40 ++++++ .../disk-virtio-scsi-reservations.xml | 38 ++++++ .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 12 files changed, 364 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 6fd2189cd..cbdc887a0 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2563,7 +2563,10 @@ </disk> <disk type='block' device='lun'> <driver name='qemu' type='raw'/> - <source dev='/dev/sda'/> + <source dev='/dev/sda'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper' mode='client'/> + </reservations> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='3' unit='0'/> </disk> @@ -2926,6 +2929,26 @@ <a href="formatstorageencryption.html">Storage Encryption</a> page for more information. </dd> + <dt><code>reservations</code></dt> + <dd><span class="since">Since libvirt 4.1.0</span>, the + <code>reservations</code> can be a sub-element of the + <code>source</code> element for storage sources (QEMU driver only). + If present (and enabled) it enabled persistent reservations for + SCSI based disks. The element has one mandatory attribute + <code>enabled</code> with accepted values <code>yes</code> and + <code>no</code>. If the feature is enabled, then there's another + mandatory attribute <code>managed</code> (accepted values are the + same as for <code>enabled</code>) that enables or disables libvirt + spawning any helper process (should one be needed). However, if + libvirt is not enabled spawning helper process (i.e. hypervisor + should just use one which is already running), path to its socket + must be provided in child element <code>source</code>, which + currently accepts only the following attributes: <code>type</code> + with one value <code>unix</code>, <code>path</code> with path the + socket, and finally <code>mode</code> which accepts either + <code>server</code> or <code>client</code> and specifies the role + of hypervisor. + </dd> </dl> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 8165e699d..198d4400b 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1530,6 +1530,9 @@ <optional> <ref name="encryption"/> </optional> + <optional> + <ref name="reservations"/> + </optional> <zeroOrMore> <ref name='devSeclabel'/> </zeroOrMore> @@ -2431,18 +2434,6 @@ </attribute> </optional> </define> - <define name="reconnect"> - <element name="reconnect"> - <attribute name="enabled"> - <ref name="virYesNo"/> - </attribute> - <optional> - <attribute name="timeout"> - <ref name="unsignedInt"/> - </attribute> - </optional> - </element> - </define> <!-- An interface description can either be of type bridge in which case @@ -2491,24 +2482,7 @@ <value>vhostuser</value> </attribute> <interleave> - <element name="source"> - <attribute name="type"> - <value>unix</value> - </attribute> - <attribute name="path"> - <ref name="absFilePath"/> - </attribute> - <attribute name="mode"> - <choice> - <value>server</value> - <value>client</value> - </choice> - </attribute> - <optional> - <ref name="reconnect"/> - </optional> - <empty/> - </element> + <ref name="unixSocketSource"/> <ref name="interface-options"/> </interleave> </group> diff --git a/docs/schemas/storagecommon.rng b/docs/schemas/storagecommon.rng index edee1b084..eed0b3334 100644 --- a/docs/schemas/storagecommon.rng +++ b/docs/schemas/storagecommon.rng @@ -39,6 +39,56 @@ </element> </define> + <define name="reconnect"> + <element name="reconnect"> + <attribute name="enabled"> + <ref name="virYesNo"/> + </attribute> + <optional> + <attribute name="timeout"> + <ref name="unsignedInt"/> + </attribute> + </optional> + </element> + </define> + + <define name='unixSocketSource'> + <element name="source"> + <attribute name="type"> + <value>unix</value> + </attribute> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + <attribute name="mode"> + <choice> + <value>server</value> + <value>client</value> + </choice> + </attribute> + <optional> + <ref name="reconnect"/> + </optional> + <empty/> + </element> + </define> + + <define name='reservations'> + <element name='reservations'> + <attribute name='enabled'> + <ref name='virYesNo'/> + </attribute> + <optional> + <attribute name='managed'> + <ref name='virYesNo'/> + </attribute> + </optional> + <optional> + <ref name='unixSocketSource'/> + </optional> + </element> + </define> + <define name='secret'> <element name='secret'> <attribute name='type'> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f2ddde7a3..a1a6b0162 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5199,6 +5199,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } } + if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun disks")); + return -1; + } + /* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8613,6 +8620,29 @@ virDomainDiskSourcePrivateDataParse(xmlXPathContextPtr ctxt, } +static int +virDomainDiskSourcePRParse(xmlNodePtr node, + virStoragePRDefPtr *prsrc) +{ + xmlNodePtr child; + virStoragePRDefPtr pr = NULL; + + for (child = node->children; child; child = child->next) { + if (child->type == XML_ELEMENT_NODE && + virXMLNodeNameEqual(child, "reservations")) { + + if (!(pr = virStoragePRDefParseNode(node->doc, child))) + return -1; + + *prsrc = pr; + return 0; + } + } + + return 0; +} + + int virDomainDiskSourceParse(xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -8657,6 +8687,9 @@ virDomainDiskSourceParse(xmlNodePtr node, if (virDomainDiskSourceEncryptionParse(node, &src->encryption) < 0) goto cleanup; + if (virDomainDiskSourcePRParse(node, &src->pr) < 0) + goto cleanup; + if (virDomainDiskSourcePrivateDataParse(ctxt, src, flags, xmlopt) < 0) goto cleanup; @@ -22940,6 +22973,9 @@ virDomainDiskSourceFormatInternal(virBufferPtr buf, virStorageEncryptionFormat(&childBuf, src->encryption) < 0) goto error; + if (src->pr) + virStoragePRDefFormat(&childBuf, src->pr); + if (virDomainDiskSourceFormatPrivateData(&childBuf, src, flags, xmlopt) < 0) goto error; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 03f23dbab..071642c00 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2779,6 +2779,9 @@ virStorageNetHostDefFree; virStorageNetHostTransportTypeFromString; virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; +virStoragePRDefFormat; +virStoragePRDefFree; +virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; virStorageSourceCopy; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index d1972d6d1..833d69f6d 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1911,6 +1911,153 @@ virStorageAuthDefFormat(virBufferPtr buf, } +void +virStoragePRDefFree(virStoragePRDefPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + +static virStoragePRDefPtr +virStoragePRDefParseXML(xmlXPathContextPtr ctxt) +{ + virStoragePRDefPtr prd, ret = NULL; + char *enabled = NULL; + char *managed = NULL; + char *type = NULL; + char *path = NULL; + char *mode = NULL; + + if (VIR_ALLOC(prd) < 0) + return NULL; + + if (!(enabled = virXPathString("string(./@enabled)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @enabled attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->enabled = virTristateBoolTypeFromString(enabled)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'enabled': %s"), NULLSTR(enabled)); + goto cleanup; + } + + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + managed = virXPathString("string(./@managed)", ctxt); + if ((prd->managed = virTristateBoolTypeFromString(managed)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'managed': %s"), NULLSTR(managed)); + goto cleanup; + } + + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + type = virXPathString("string(./source[1]/@type)", ctxt); + path = virXPathString("string(./source[1]/@path)", ctxt); + mode = virXPathString("string(./source[1]/@mode)", ctxt); + + if (!type) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection type")); + goto cleanup; + } + + if (!path) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing path")); + goto cleanup; + } + + if (!mode) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection mode")); + goto cleanup; + } + + if (STRNEQ(type, "unix")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported type: %s"), type); + goto cleanup; + } + + if (STRNEQ(mode, "client")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported mode: %s"), mode); + goto cleanup; + } + + prd->path = path; + path = NULL; + + } + } + + ret = prd; + prd = NULL; + + cleanup: + VIR_FREE(mode); + VIR_FREE(path); + VIR_FREE(type); + VIR_FREE(managed); + VIR_FREE(enabled); + virStoragePRDefFree(prd); + return ret; +} + + +virStoragePRDefPtr +virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root) +{ + xmlXPathContextPtr ctxt = NULL; + virStoragePRDefPtr prd = NULL; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + ctxt->node = root; + prd = virStoragePRDefParseXML(ctxt); + + cleanup: + xmlXPathFreeContext(ctxt); + return prd; +} + + +void +virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd) +{ + virBufferAsprintf(buf, "<reservations enabled='%s'", + virTristateBoolTypeToString(prd->enabled)); + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + virBufferAsprintf(buf, " managed='%s'", + virTristateBoolTypeToString(prd->managed)); + if (prd->managed == VIR_TRISTATE_BOOL_NO) { + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferAddLit(buf, "<source type='unix'"); + virBufferEscapeString(buf, " path='%s'", prd->path); + virBufferAddLit(buf, " mode='client'/>\n"); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</reservations>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } else { + virBufferAddLit(buf, "/>\n"); + } +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) @@ -2289,6 +2436,7 @@ virStorageSourceClear(virStorageSourcePtr def) virBitmapFree(def->features); VIR_FREE(def->compat); virStorageEncryptionFree(def->encryption); + virStoragePRDefFree(def->pr); virStorageSourceSeclabelsClear(def); virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 0095cd138..968653660 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -216,6 +216,14 @@ struct _virStorageAuthDef { virSecretLookupTypeDef seclookupdef; }; +typedef struct _virStoragePRDef virStoragePRDef; +typedef virStoragePRDef *virStoragePRDefPtr; +struct _virStoragePRDef { + int enabled; /* enum virTristateBool */ + int managed; /* enum virTristateBool */ + char *path; +}; + typedef struct _virStorageDriverData virStorageDriverData; typedef virStorageDriverData *virStorageDriverDataPtr; @@ -243,6 +251,7 @@ struct _virStorageSource { bool authInherited; virStorageEncryptionPtr encryption; bool encryptionInherited; + virStoragePRDefPtr pr; virObjectPtr privateData; @@ -369,6 +378,12 @@ virStorageAuthDefPtr virStorageAuthDefCopy(const virStorageAuthDef *src); virStorageAuthDefPtr virStorageAuthDefParse(xmlDocPtr xml, xmlNodePtr root); void virStorageAuthDefFormat(virBufferPtr buf, virStorageAuthDefPtr authdef); +void virStoragePRDefFree(virStoragePRDefPtr prd); +virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root); +void virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd); + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 100644 index 000000000..8ee623a70 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='no'> + <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> + </reservations> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml new file mode 100644 index 000000000..874446f7b --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1,38 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='yes'/> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 120000 index 000000000..f96d62cb6 --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml \ No newline at end of file diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml new file mode 120000 index 000000000..dde52fd1d --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 0eb9e6c77..58cbf8438 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -531,6 +531,10 @@ mymain(void) QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-num_queues", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-cmd_per_lun", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-max_sectors", -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this:
<reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations>
If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here:
https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html
The design may have been agreed upon, but sometimes once a bit of work is done and the problem better understood the design can change. I still don't understand what the purpose of enabled='no' would be. Especially since when set to 'no', the <source> isn't formatted. I'd prefer to see a bit more wordy commit to describe the functionality and what it's used for... others, well I know, less is better because it's all self documenting anyway... In particular it's "confusing" what's done for when <source> is provided and what is done when it's not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 34 +---- docs/schemas/storagecommon.rng | 50 +++++++ src/conf/domain_conf.c | 36 +++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 148 +++++++++++++++++++++ src/util/virstoragefile.h | 15 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 40 ++++++ .../disk-virtio-scsi-reservations.xml | 38 ++++++ .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 12 files changed, 364 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 6fd2189cd..cbdc887a0 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2563,7 +2563,10 @@ </disk> <disk type='block' device='lun'> <driver name='qemu' type='raw'/> - <source dev='/dev/sda'/> + <source dev='/dev/sda'> + <reservations enabled='yes' managed='no'>
See note below w/r/t managed='no'
+ <source type='unix' path='/path/to/qemu-pr-helper' mode='client'/> + </reservations> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='3' unit='0'/> </disk> @@ -2926,6 +2929,26 @@ <a href="formatstorageencryption.html">Storage Encryption</a> page for more information. </dd> + <dt><code>reservations</code></dt> + <dd><span class="since">Since libvirt 4.1.0</span>, the + <code>reservations</code> can be a sub-element of the + <code>source</code> element for storage sources (QEMU driver only). + If present (and enabled) it enabled persistent reservations for + SCSI based disks. The element has one mandatory attribute + <code>enabled</code> with accepted values <code>yes</code> and + <code>no</code>. If the feature is enabled, then there's another + mandatory attribute <code>managed</code> (accepted values are the + same as for <code>enabled</code>) that enables or disables libvirt + spawning any helper process (should one be needed). However, if + libvirt is not enabled spawning helper process (i.e. hypervisor + should just use one which is already running), path to its socket + must be provided in child element <code>source</code>, which + currently accepts only the following attributes: <code>type</code> + with one value <code>unix</code>, <code>path</code> with path the + socket, and finally <code>mode</code> which accepts either + <code>server</code> or <code>client</code> and specifies the role + of hypervisor. + </dd>
Since 4.2.0, the optional <code>reservations</code> sub-element of the storage <code>source</code> element can be used to enable persistent reservations for SCSI based disks (QEMU driver only). The element has one mandatory attribute <code>managed</code> set to either 'yes' or 'no'. When <code>managed</code> is 'yes', libvirt will create a helper process to manage the reservations and will check at restart for the helper process. When <code>managed</code> is 'no', libvirt will provide no management of the helper process. If libvirt is managing the helper process, then the <code>source</code> sub-element must be provided with mandatory attributes <code>type</code>, <code>path</code>, and <code>mode</code>. The <code>type</code> attribute currently only accepts 'unix' as an option and the <code>mode</code> attribute only accepts 'client'. The <code>path</code> attribute supplies the path to the UNIX socket for communication between libvirt and the helper process. FWIW: What I wrote above is backwards from the implementation for managed. At least it makes more sense to me. If libvirt is managing something, then it will do the work, if libvirt is not managing something then it doesn't do anything. Think of how hostdev's are managed or not. When managed, libvirt will detach and reattach. When not managed the user is responsible to perform the detach and reattach. BTW: While writing this I'm now wondering - so how does the managed='no' path actually find or communicate with the helper process? Maybe that answer comes later.
</dl>
<p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 8165e699d..198d4400b 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1530,6 +1530,9 @@ <optional> <ref name="encryption"/> </optional> + <optional> + <ref name="reservations"/> + </optional> <zeroOrMore> <ref name='devSeclabel'/> </zeroOrMore> @@ -2431,18 +2434,6 @@ </attribute> </optional> </define> - <define name="reconnect"> - <element name="reconnect"> - <attribute name="enabled"> - <ref name="virYesNo"/> - </attribute> - <optional> - <attribute name="timeout"> - <ref name="unsignedInt"/> - </attribute> - </optional> - </element> - </define>
Why does this hunk move to storagecommon.rng? Since it's going to be common between domain and storage (for vhostuser and persistent reservations), maybe it should move to basictypes.rng.
<!-- An interface description can either be of type bridge in which case @@ -2491,24 +2482,7 @@ <value>vhostuser</value> </attribute> <interleave> - <element name="source"> - <attribute name="type"> - <value>unix</value> - </attribute> - <attribute name="path"> - <ref name="absFilePath"/> - </attribute> - <attribute name="mode"> - <choice> - <value>server</value> - <value>client</value> - </choice> - </attribute> - <optional> - <ref name="reconnect"/> - </optional> - <empty/> - </element> + <ref name="unixSocketSource"/> <ref name="interface-options"/> </interleave> </group> diff --git a/docs/schemas/storagecommon.rng b/docs/schemas/storagecommon.rng index edee1b084..eed0b3334 100644 --- a/docs/schemas/storagecommon.rng +++ b/docs/schemas/storagecommon.rng @@ -39,6 +39,56 @@ </element> </define>
Perhaps all of the following should be in basictypes.rng? Or some other new common rng file... Common between domain and storage where we seem to be having more and more crossover.
+ <define name="reconnect"> + <element name="reconnect"> + <attribute name="enabled"> + <ref name="virYesNo"/> + </attribute> + <optional> + <attribute name="timeout"> + <ref name="unsignedInt"/> + </attribute> + </optional> + </element> + </define> + + <define name='unixSocketSource'> + <element name="source"> + <attribute name="type"> + <value>unix</value> + </attribute> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + <attribute name="mode"> + <choice> + <value>server</value> + <value>client</value> + </choice> + </attribute> + <optional> + <ref name="reconnect"/> + </optional> + <empty/> + </element> + </define> + + <define name='reservations'> + <element name='reservations'> + <attribute name='enabled'> + <ref name='virYesNo'/> + </attribute> + <optional> + <attribute name='managed'> + <ref name='virYesNo'/> + </attribute> + </optional>> + <optional> + <ref name='unixSocketSource'/>
The parse/format won't allow the mode=server value nor will it do any sort of reconnect...
+ </optional> + </element> + </define> + <define name='secret'> <element name='secret'> <attribute name='type'> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f2ddde7a3..a1a6b0162 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5199,6 +5199,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } }
+ if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun disks")); + return -1; + } +> /* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8613,6 +8620,29 @@ virDomainDiskSourcePrivateDataParse(xmlXPathContextPtr ctxt, }
+static int +virDomainDiskSourcePRParse(xmlNodePtr node, + virStoragePRDefPtr *prsrc)
Personally, it's a bit "confusing" (at best) to have the Format routine be virStoragePRDefFormat
+{ + xmlNodePtr child; + virStoragePRDefPtr pr = NULL; + + for (child = node->children; child; child = child->next) { + if (child->type == XML_ELEMENT_NODE && + virXMLNodeNameEqual(child, "reservations")) { + + if (!(pr = virStoragePRDefParseNode(node->doc, child))) + return -1; + + *prsrc = pr; + return 0; + } + } + + return 0; +} + + int virDomainDiskSourceParse(xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -8657,6 +8687,9 @@ virDomainDiskSourceParse(xmlNodePtr node, if (virDomainDiskSourceEncryptionParse(node, &src->encryption) < 0) goto cleanup;
+ if (virDomainDiskSourcePRParse(node, &src->pr) < 0) + goto cleanup; + if (virDomainDiskSourcePrivateDataParse(ctxt, src, flags, xmlopt) < 0) goto cleanup;
@@ -22940,6 +22973,9 @@ virDomainDiskSourceFormatInternal(virBufferPtr buf, virStorageEncryptionFormat(&childBuf, src->encryption) < 0) goto error;
+ if (src->pr) + virStoragePRDefFormat(&childBuf, src->pr); + if (virDomainDiskSourceFormatPrivateData(&childBuf, src, flags, xmlopt) < 0) goto error;
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 03f23dbab..071642c00 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2779,6 +2779,9 @@ virStorageNetHostDefFree; virStorageNetHostTransportTypeFromString; virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; +virStoragePRDefFormat; +virStoragePRDefFree; +virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; virStorageSourceCopy; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index d1972d6d1..833d69f6d 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1911,6 +1911,153 @@ virStorageAuthDefFormat(virBufferPtr buf, }
+void +virStoragePRDefFree(virStoragePRDefPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + +static virStoragePRDefPtr +virStoragePRDefParseXML(xmlXPathContextPtr ctxt) +{ + virStoragePRDefPtr prd, ret = NULL; + char *enabled = NULL; + char *managed = NULL; + char *type = NULL; + char *path = NULL; + char *mode = NULL; + + if (VIR_ALLOC(prd) < 0) + return NULL; + + if (!(enabled = virXPathString("string(./@enabled)", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing @enabled attribute for <reservations/>")); + goto cleanup; + } + + if ((prd->enabled = virTristateBoolTypeFromString(enabled)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'enabled': %s"), NULLSTR(enabled)); + goto cleanup; + } + + if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + managed = virXPathString("string(./@managed)", ctxt); + if ((prd->managed = virTristateBoolTypeFromString(managed)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid value for 'managed': %s"), NULLSTR(managed)); + goto cleanup; + } + + if (prd->managed == VIR_TRISTATE_BOOL_NO) {
Consider my thoughts above vis-a-vis managed.
+ type = virXPathString("string(./source[1]/@type)", ctxt); + path = virXPathString("string(./source[1]/@path)", ctxt); + mode = virXPathString("string(./source[1]/@mode)", ctxt); + + if (!type) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection type")); + goto cleanup; + } + + if (!path) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing path"));
missing connection path (to be consistent)
+ goto cleanup; + } + + if (!mode) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing connection mode")); + goto cleanup; + } + + if (STRNEQ(type, "unix")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported type: %s"), type); + goto cleanup; + } + + if (STRNEQ(mode, "client")) { + virReportError(VIR_ERR_XML_ERROR, + _("unsupported mode: %s"), mode); + goto cleanup; + }
BTW: Seems pretty reasonable to add the same reconnect options that vhostuser has. I mean if the helper process dies does libvirt need to be restarted in order to recognize and restart? Or does it restart after some time period when it recognizes the helper process died?
+ + prd->path = path; + path = NULL; + + } + } + + ret = prd; + prd = NULL; + + cleanup: + VIR_FREE(mode); + VIR_FREE(path); + VIR_FREE(type); + VIR_FREE(managed); + VIR_FREE(enabled); + virStoragePRDefFree(prd); + return ret; +} + + +virStoragePRDefPtr +virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root) +{ + xmlXPathContextPtr ctxt = NULL; + virStoragePRDefPtr prd = NULL; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + ctxt->node = root; + prd = virStoragePRDefParseXML(ctxt); + + cleanup: + xmlXPathFreeContext(ctxt); + return prd; +} + + +void +virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd) +{ + virBufferAsprintf(buf, "<reservations enabled='%s'", + virTristateBoolTypeToString(prd->enabled));
So if someone changes enabled from 'yes' to 'no', then if they had anything in the <source> buffer it'd be lost. So I see no reason for enabled
+ if (prd->enabled == VIR_TRISTATE_BOOL_YES) { + virBufferAsprintf(buf, " managed='%s'", + virTristateBoolTypeToString(prd->managed)); + if (prd->managed == VIR_TRISTATE_BOOL_NO) {
As noted earlier this seems backwards to me.
+ virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferAddLit(buf, "<source type='unix'"); + virBufferEscapeString(buf, " path='%s'", prd->path); + virBufferAddLit(buf, " mode='client'/>\n");
So no chance of this being 'server', thus what's the point?
+ virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</reservations>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } else { + virBufferAddLit(buf, "/>\n"); + } +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) @@ -2289,6 +2436,7 @@ virStorageSourceClear(virStorageSourcePtr def) virBitmapFree(def->features); VIR_FREE(def->compat); virStorageEncryptionFree(def->encryption); + virStoragePRDefFree(def->pr); virStorageSourceSeclabelsClear(def); virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 0095cd138..968653660 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -216,6 +216,14 @@ struct _virStorageAuthDef { virSecretLookupTypeDef seclookupdef; };
+typedef struct _virStoragePRDef virStoragePRDef; +typedef virStoragePRDef *virStoragePRDefPtr; +struct _virStoragePRDef { + int enabled; /* enum virTristateBool */ + int managed; /* enum virTristateBool */ + char *path; +}; + typedef struct _virStorageDriverData virStorageDriverData; typedef virStorageDriverData *virStorageDriverDataPtr;
@@ -243,6 +251,7 @@ struct _virStorageSource { bool authInherited; virStorageEncryptionPtr encryption; bool encryptionInherited; + virStoragePRDefPtr pr;
virObjectPtr privateData;
@@ -369,6 +378,12 @@ virStorageAuthDefPtr virStorageAuthDefCopy(const virStorageAuthDef *src); virStorageAuthDefPtr virStorageAuthDefParse(xmlDocPtr xml, xmlNodePtr root); void virStorageAuthDefFormat(virBufferPtr buf, virStorageAuthDefPtr authdef);
+void virStoragePRDefFree(virStoragePRDefPtr prd); +virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, + xmlNodePtr root); +void virStoragePRDefFormat(virBufferPtr buf, + virStoragePRDefPtr prd); + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 100644 index 000000000..8ee623a70 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='no'>
As noted earlier the managed attribute seems backwards.
+ <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> + </reservations> + </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml new file mode 100644 index 000000000..874446f7b --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1,38 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>8</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <disk type='block' device='lun'> + <source dev='/dev/HostVG/QEMUGuest1'> + <reservations enabled='yes' managed='yes'/>
As I noted earlier this seems backwards. John
+ </source> + <target dev='sdb' bus='scsi'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <driver queues='8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml new file mode 120000 index 000000000..f96d62cb6 --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml \ No newline at end of file diff --git a/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml new file mode 120000 index 000000000..dde52fd1d --- /dev/null +++ b/tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/disk-virtio-scsi-reservations.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 0eb9e6c77..58cbf8438 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -531,6 +531,10 @@ mymain(void) QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-num_queues", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-cmd_per_lun", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_VIRTIO_SCSI); DO_TEST("disk-virtio-scsi-max_sectors",

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is a definition that holds information on SCSI persistent reservation settings. The XML part looks like this:
<reservations enabled='yes' managed='no'> <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/> </reservations>
If @managed is set to 'yes' then the <source/> is not parsed. This design was agreed on here:
https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 34 +---- docs/schemas/storagecommon.rng | 50 +++++++ src/conf/domain_conf.c | 36 +++++ src/libvirt_private.syms | 3 + src/util/virstoragefile.c | 148 +++++++++++++++++++++ src/util/virstoragefile.h | 15 +++ .../disk-virtio-scsi-reservations-not-managed.xml | 40 ++++++ .../disk-virtio-scsi-reservations.xml | 38 ++++++ .../disk-virtio-scsi-reservations-not-managed.xml | 1 + .../disk-virtio-scsi-reservations.xml | 1 + tests/qemuxml2xmltest.c | 4 + 12 files changed, 364 insertions(+), 31 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
Something else that dawned on me when looking at the original series. [...]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f2ddde7a3..a1a6b0162 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5199,6 +5199,13 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk) } }
+ if (disk->src->pr && + disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("<reservations/> allowed only for lun disks")); + return -1; + } +
If someone is using a storage pool such as: <disk type='volume' device='lun'> <driver name='qemu' type='raw'/> <source pool='sourcepool' volume='unit:0:4:0'/> <target dev='sda' bus='scsi'/> </disk> then how do they define/use the PR? For LUN's like iSCSI and NPIV it can be easier or even preferred to use the storage pool as opposed to using the <source protocol=..."/> syntax. Search on iscsi in the formatdomain page for iSCSI syntax examples. See: https://wiki.libvirt.org/page/NPIV_in_libvirt for the NPIV syntax examples. John
/* Reject disks with a bus type that is not compatible with the * given address type. The function considers only buses that are * handled in common code. For other bus types it's not possible @@ -8613,6 +8620,29 @@ virDomainDiskSourcePrivateDataParse(xmlXPathContextPtr ctxt, }
[...]

Couple of reasons for that: a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 18 ++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 29 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 071642c00..d4bff0f0b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2781,6 +2781,7 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEqual; virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 8b4efc82d..de8974d66 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7480,6 +7480,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true); + if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations"); + return false; + } + #undef CHECK_EQ return true; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 833d69f6d..328f59788 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2058,6 +2058,24 @@ virStoragePRDefFormat(virBufferPtr buf, } +bool +virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b) +{ + if (!a && !b) + return true; + + if (!a || !b) + return false; + + if (a->enabled != b->enabled || + a->managed != b->managed || + STRNEQ_NULLABLE(a->path, b->path)) + return false; + + return true; +} + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 968653660..e70c99076 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -383,6 +383,8 @@ virStoragePRDefPtr virStoragePRDefParseNode(xmlDocPtr xml, xmlNodePtr root); void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); +bool virStoragePRDefIsEqual(virStoragePRDefPtr a, + virStoragePRDefPtr b); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Couple of reasons for that:
a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 18 ++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 29 insertions(+)
[...]
index 8b4efc82d..de8974d66 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7480,6 +7480,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true);
+ if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations");
Formatting problems above - the " should line up I think... also "diskreservations" will look odd. John
+ return false; + } + #undef CHECK_EQ
return true;

On 03/02/2018 02:58 AM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Couple of reasons for that:
a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 18 ++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 29 insertions(+)
[...]
index 8b4efc82d..de8974d66 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7480,6 +7480,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true);
+ if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations");
Formatting problems above - the " should line up I think... also "diskreservations" will look odd.
That's not what would be written. This is what would be: error: cannot modify field 'reservations' of the disk or translated: fehler: Das Feld 'reservations' kann nicht geändert werden IIRC it was discussed in v1 too. The idea is to have 'reservations' not translated because it refers to the XML element name. However, the rest of the error message can be localized. Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 02:58 AM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Couple of reasons for that:
a) there's no monitor command to change path where the pr-helper connects to, or b) there's no monitor command to introduce a new pr-helper for a disk that already exists.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 8 ++++++++ src/util/virstoragefile.c | 18 ++++++++++++++++++ src/util/virstoragefile.h | 2 ++ 4 files changed, 29 insertions(+)
[...]
index 8b4efc82d..de8974d66 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7480,6 +7480,14 @@ qemuDomainDiskChangeSupported(virDomainDiskDefPtr disk, CHECK_EQ(src->readonly, "readonly", true); CHECK_EQ(src->shared, "shared", true);
+ if (!virStoragePRDefIsEqual(disk->src->pr, + orig_disk->src->pr)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("cannot modify field '%s' of the disk"), + "reservations");
Formatting problems above - the " should line up I think... also "diskreservations" will look odd.
That's not what would be written. This is what would be:
error: cannot modify field 'reservations' of the disk
or translated:
fehler: Das Feld 'reservations' kann nicht geändert werden
IIRC it was discussed in v1 too. The idea is to have 'reservations' not translated because it refers to the XML element name. However, the rest of the error message can be localized.
Oh right - I missed the pesky '%s'... John

The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index b5eb8cf46..d0005c71d 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -459,6 +459,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "pl011", "machine.pseries.max-cpu-compat", "dump-completed", + "pr-manager-helper", ); diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c2ec2be19..8d315665a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -444,6 +444,7 @@ typedef enum { QEMU_CAPS_DEVICE_PL011, /* -device pl011 (not user-instantiable) */ QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT, /* -machine pseries,max-cpu-compat= */ QEMU_CAPS_DUMP_COMPLETED, /* DUMP_COMPLETED event */ + QEMU_CAPS_PR_MANAGER_HELPER, /* -object pr-manager-helper */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+)
Very strange to not see the replies or xml files adjusted here. And why is patch 12 not merged in here? Like it was in v1 where you got an ACK from Peter? John
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index b5eb8cf46..d0005c71d 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -459,6 +459,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "pl011", "machine.pseries.max-cpu-compat", "dump-completed", + "pr-manager-helper", );
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c2ec2be19..8d315665a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -444,6 +444,7 @@ typedef enum { QEMU_CAPS_DEVICE_PL011, /* -device pl011 (not user-instantiable) */ QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT, /* -machine pseries,max-cpu-compat= */ QEMU_CAPS_DUMP_COMPLETED, /* DUMP_COMPLETED event */ + QEMU_CAPS_PR_MANAGER_HELPER, /* -object pr-manager-helper */
QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags;

On 03/02/2018 02:58 AM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+)
Very strange to not see the replies or xml files adjusted here.
And why is patch 12 not merged in here? Like it was in v1 where you got an ACK from Peter?
Because Peter also said that the patches should be ordered in such way that after each patch nothing is broken. So I had two options: a) rewrite all the patches from scratch, or b) separate the capability detection into its own patch so effectively there is no way to use the feature until the very last patch and thus there's nothing to break. This was used fairly often historically when we were introducing new features in multiple patches. Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 02:58 AM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
The capability tracks if qemu has pr-manager-helper object. At this time don't actually detect if qemu has the capability. Not just yet. Only after the code is written the feature will be enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + src/qemu/qemu_capabilities.h | 1 + 2 files changed, 2 insertions(+)
Very strange to not see the replies or xml files adjusted here.
And why is patch 12 not merged in here? Like it was in v1 where you got an ACK from Peter?
Because Peter also said that the patches should be ordered in such way that after each patch nothing is broken. So I had two options:
a) rewrite all the patches from scratch, or b) separate the capability detection into its own patch so effectively there is no way to use the feature until the very last patch and thus there's nothing to break. This was used fairly often historically when we were introducing new features in multiple patches.
Fair enough. I assume that if we had 2.12 capabilities, the flag would show up in one of the xml replies file... I understand the point, just strange to see it split. I'm fine with the split though. John

While we're not generating the command line just yet (look for the next commits), we can generate the alias for pr-manager. A domain can have up to one managed pr-manager (in which case socket path is decided by libvirt and pr-helper is spawned by libvirt too), but up to ndisks of unmanaged ones (one per disk basically). In case of the former we can generate a simple alias and be sure it'll not conflict. But in case of the latter to avoid any conflicts let's generate alias that's based on something unique - like disk target. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 ++ src/qemu/qemu_domain.c | 80 +++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_domain.h | 10 ++++++ src/util/virstoragefile.c | 15 +++++++++ src/util/virstoragefile.h | 2 ++ 5 files changed, 106 insertions(+), 3 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d4bff0f0b..bb0183ea6 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2781,7 +2781,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index de8974d66..dffc4c30e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -982,6 +982,18 @@ qemuDomainSecretInfoFree(qemuDomainSecretInfoPtr *secinfo) } +static void +qemuDomainDiskPRDFree(qemuDomainDiskPRDPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->alias); + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + static virClassPtr qemuDomainDiskPrivateClass; static void qemuDomainDiskPrivateDispose(void *obj); @@ -1062,6 +1074,7 @@ qemuDomainStorageSourcePrivateDispose(void *obj) qemuDomainSecretInfoFree(&priv->secinfo); qemuDomainSecretInfoFree(&priv->encinfo); + qemuDomainDiskPRDFree(priv->prd); } @@ -1473,9 +1486,6 @@ qemuDomainSecretStorageSourcePrepare(qemuDomainObjPrivatePtr priv, if (!hasAuth && !hasEnc) return 0; - if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) - return -1; - srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); if (hasAuth) { @@ -11538,17 +11548,81 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, } +static int +qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, + virDomainDiskDefPtr disk) +{ + qemuDomainStorageSourcePrivatePtr srcPriv; + virStoragePRDefPtr prd = disk->src->pr; + char *prAlias = NULL; + char *prPath = NULL; + int ret = -1; + + if (!virStoragePRDefIsEnabled(prd)) + return 0; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations not supported with this QEMU binary")); + goto cleanup; + } + + if (!virStorageSourceIsLocalStorage(disk->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations supported only for local storage")); + goto cleanup; + } + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + if (virStoragePRDefIsManaged(prd)) { + /* Generate PR helper socket path & alias that are same + * for each disk in the domain. */ + + if (VIR_STRDUP(prAlias, "pr-helper0") < 0) + return -1; + + if (virAsprintf(&prPath, "%s/pr-helper0.sock", priv->libDir) < 0) + return -1; + + } else { + if (virAsprintf(&prAlias, "pr-helper-%s", disk->dst) < 0) + return -1; + + if (VIR_STRDUP(prPath, prd->path) < 0) + return -1; + } + + if (VIR_ALLOC(srcPriv->prd) < 0) + goto cleanup; + VIR_STEAL_PTR(srcPriv->prd->alias, prAlias); + VIR_STEAL_PTR(srcPriv->prd->path, prPath); + + ret = 0; + cleanup: + VIR_FREE(prPath); + VIR_FREE(prAlias); + return ret; +} + + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg) { + if (!(disk->src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + if (qemuDomainPrepareDiskSourceTLS(disk->src, cfg) < 0) return -1; if (qemuDomainSecretDiskPrepare(priv, disk) < 0) return -1; + if (qemuDomainPrepareDiskPRD(priv, disk) < 0) + return -1; + if (disk->src->type == VIR_STORAGE_TYPE_NETWORK && disk->src->protocol == VIR_STORAGE_NET_PROTOCOL_GLUSTER && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_GLUSTER_DEBUG_LEVEL)) { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 6d3e6eb5e..b9258eb8e 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -373,6 +373,13 @@ struct _qemuDomainDiskPrivate { bool removable; /* device media can be removed/changed */ }; +typedef struct _qemuDomainDiskPRD qemuDomainDiskPRD; +typedef qemuDomainDiskPRD *qemuDomainDiskPRDPtr; +struct _qemuDomainDiskPRD { + char *alias; + char *path; /* socket path. */ +}; + # define QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src) \ ((qemuDomainStorageSourcePrivatePtr) (src)->privateData) @@ -386,6 +393,9 @@ struct _qemuDomainStorageSourcePrivate { /* data required for decryption of encrypted storage source */ qemuDomainSecretInfoPtr encinfo; + + /* data required for persistent reservations */ + qemuDomainDiskPRDPtr prd; }; virObjectPtr qemuDomainStorageSourcePrivateNew(void); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 328f59788..a85fec330 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2076,6 +2076,21 @@ virStoragePRDefIsEqual(virStoragePRDefPtr a, return true; } + +bool +virStoragePRDefIsEnabled(virStoragePRDefPtr prd) +{ + return prd && prd->enabled == VIR_TRISTATE_BOOL_YES; +} + + +bool +virStoragePRDefIsManaged(virStoragePRDefPtr prd) +{ + return prd && prd->managed == VIR_TRISTATE_BOOL_YES; +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index e70c99076..cf931bbb1 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -385,6 +385,8 @@ void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); bool virStoragePRDefIsEqual(virStoragePRDefPtr a, virStoragePRDefPtr b); +bool virStoragePRDefIsEnabled(virStoragePRDefPtr prd); +bool virStoragePRDefIsManaged(virStoragePRDefPtr prd); virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
While we're not generating the command line just yet (look for the next commits), we can generate the alias for pr-manager. A domain can have up to one managed pr-manager (in which case socket path is decided by libvirt and pr-helper is spawned by libvirt too), but up to ndisks of unmanaged ones (one per disk basically). In case of the former we can generate a simple alias and be sure it'll not conflict. But in case of the latter to avoid any conflicts let's generate alias that's based on something unique - like disk target.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 ++ src/qemu/qemu_domain.c | 80 +++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_domain.h | 10 ++++++ src/util/virstoragefile.c | 15 +++++++++ src/util/virstoragefile.h | 2 ++ 5 files changed, 106 insertions(+), 3 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d4bff0f0b..bb0183ea6 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2781,7 +2781,9 @@ virStorageNetHostTransportTypeToString; virStorageNetProtocolTypeToString; virStoragePRDefFormat; virStoragePRDefFree; +virStoragePRDefIsEnabled; virStoragePRDefIsEqual; +virStoragePRDefIsManaged; virStoragePRDefParseNode; virStorageSourceBackingStoreClear; virStorageSourceClear; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index de8974d66..dffc4c30e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -982,6 +982,18 @@ qemuDomainSecretInfoFree(qemuDomainSecretInfoPtr *secinfo) }
+static void +qemuDomainDiskPRDFree(qemuDomainDiskPRDPtr prd) +{ + if (!prd) + return; + + VIR_FREE(prd->alias); + VIR_FREE(prd->path); + VIR_FREE(prd); +} + + static virClassPtr qemuDomainDiskPrivateClass; static void qemuDomainDiskPrivateDispose(void *obj);
@@ -1062,6 +1074,7 @@ qemuDomainStorageSourcePrivateDispose(void *obj)
qemuDomainSecretInfoFree(&priv->secinfo); qemuDomainSecretInfoFree(&priv->encinfo); + qemuDomainDiskPRDFree(priv->prd); }
@@ -1473,9 +1486,6 @@ qemuDomainSecretStorageSourcePrepare(qemuDomainObjPrivatePtr priv, if (!hasAuth && !hasEnc) return 0;
- if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) - return -1; -
How does the following not fail with the previous hunk removed? Oh, I see, it's moved to qemuDomainPrepareDiskSource.... could be separated out, but I see why it's here - so fine. Just eye opening ;-)
srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src);
if (hasAuth) { @@ -11538,17 +11548,81 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, }
+static int +qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, + virDomainDiskDefPtr disk) +{ + qemuDomainStorageSourcePrivatePtr srcPriv; + virStoragePRDefPtr prd = disk->src->pr; + char *prAlias = NULL; + char *prPath = NULL; + int ret = -1; + + if (!virStoragePRDefIsEnabled(prd)) + return 0; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations not supported with this QEMU binary")); + goto cleanup; + } + + if (!virStorageSourceIsLocalStorage(disk->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations supported only for local storage")); + goto cleanup; + }
Ironically the above two could return -1 directly...
+ + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + if (virStoragePRDefIsManaged(prd)) { + /* Generate PR helper socket path & alias that are same + * for each disk in the domain. */ + + if (VIR_STRDUP(prAlias, "pr-helper0") < 0) + return -1; + + if (virAsprintf(&prPath, "%s/pr-helper0.sock", priv->libDir) < 0) + return -1;
Leaks prAlias BTW: Whatever "managed" ends up being needs to be described somewhere either prior to this or after this, but knowing how the connection between libvirt and qemu's process is going to happen will be really helpful... Something that I would think would be described in patch 1, but again that's just me. This whole managed thing is really confusing the way it's written. May make sense to you, but it doesn't make sense to me, just saying.
+ + } else { + if (virAsprintf(&prAlias, "pr-helper-%s", disk->dst) < 0) + return -1;
Or using the 'disk->info.alias' like other consumers.
+ + if (VIR_STRDUP(prPath, prd->path) < 0) + return -1;
Leaks prAlias I'm going to stop here - mainly because I'm confused about managed, but hopefully a fresh look in the morning will help. John
+ } + + if (VIR_ALLOC(srcPriv->prd) < 0) + goto cleanup; + VIR_STEAL_PTR(srcPriv->prd->alias, prAlias); + VIR_STEAL_PTR(srcPriv->prd->path, prPath); + + ret = 0; + cleanup: + VIR_FREE(prPath); + VIR_FREE(prAlias); + return ret; +} + + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg) { + if (!(disk->src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + if (qemuDomainPrepareDiskSourceTLS(disk->src, cfg) < 0) return -1;
if (qemuDomainSecretDiskPrepare(priv, disk) < 0) return -1;
+ if (qemuDomainPrepareDiskPRD(priv, disk) < 0) + return -1; + if (disk->src->type == VIR_STORAGE_TYPE_NETWORK && disk->src->protocol == VIR_STORAGE_NET_PROTOCOL_GLUSTER && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_GLUSTER_DEBUG_LEVEL)) { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 6d3e6eb5e..b9258eb8e 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -373,6 +373,13 @@ struct _qemuDomainDiskPrivate { bool removable; /* device media can be removed/changed */ };
+typedef struct _qemuDomainDiskPRD qemuDomainDiskPRD; +typedef qemuDomainDiskPRD *qemuDomainDiskPRDPtr; +struct _qemuDomainDiskPRD { + char *alias; + char *path; /* socket path. */ +}; + # define QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src) \ ((qemuDomainStorageSourcePrivatePtr) (src)->privateData)
@@ -386,6 +393,9 @@ struct _qemuDomainStorageSourcePrivate {
/* data required for decryption of encrypted storage source */ qemuDomainSecretInfoPtr encinfo; + + /* data required for persistent reservations */ + qemuDomainDiskPRDPtr prd; };
virObjectPtr qemuDomainStorageSourcePrivateNew(void); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 328f59788..a85fec330 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -2076,6 +2076,21 @@ virStoragePRDefIsEqual(virStoragePRDefPtr a, return true; }
+ +bool +virStoragePRDefIsEnabled(virStoragePRDefPtr prd) +{ + return prd && prd->enabled == VIR_TRISTATE_BOOL_YES; +} + + +bool +virStoragePRDefIsManaged(virStoragePRDefPtr prd) +{ + return prd && prd->managed == VIR_TRISTATE_BOOL_YES;> +} + + virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src, const char *model) diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index e70c99076..cf931bbb1 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -385,6 +385,8 @@ void virStoragePRDefFormat(virBufferPtr buf, virStoragePRDefPtr prd); bool virStoragePRDefIsEqual(virStoragePRDefPtr a, virStoragePRDefPtr b); +bool virStoragePRDefIsEnabled(virStoragePRDefPtr prd); +bool virStoragePRDefIsManaged(virStoragePRDefPtr prd);
virSecurityDeviceLabelDefPtr virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src,

On 03/02/2018 02:59 AM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
While we're not generating the command line just yet (look for the next commits), we can generate the alias for pr-manager. A domain can have up to one managed pr-manager (in which case socket path is decided by libvirt and pr-helper is spawned by libvirt too), but up to ndisks of unmanaged ones (one per disk basically). In case of the former we can generate a simple alias and be sure it'll not conflict. But in case of the latter to avoid any conflicts let's generate alias that's based on something unique - like disk target.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 2 ++ src/qemu/qemu_domain.c | 80 +++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_domain.h | 10 ++++++ src/util/virstoragefile.c | 15 +++++++++ src/util/virstoragefile.h | 2 ++ 5 files changed, 106 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index de8974d66..dffc4c30e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11538,17 +11548,81 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, }
+static int +qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, + virDomainDiskDefPtr disk) +{ + qemuDomainStorageSourcePrivatePtr srcPriv; + virStoragePRDefPtr prd = disk->src->pr; + char *prAlias = NULL; + char *prPath = NULL; + int ret = -1; + + if (!virStoragePRDefIsEnabled(prd)) + return 0; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations not supported with this QEMU binary")); + goto cleanup; + } + + if (!virStorageSourceIsLocalStorage(disk->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("reservations supported only for local storage")); + goto cleanup; + }
Ironically the above two could return -1 directly...
+ + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + if (virStoragePRDefIsManaged(prd)) { + /* Generate PR helper socket path & alias that are same + * for each disk in the domain. */ + + if (VIR_STRDUP(prAlias, "pr-helper0") < 0) + return -1; + + if (virAsprintf(&prPath, "%s/pr-helper0.sock", priv->libDir) < 0) + return -1;
Leaks prAlias
Ah, good catch.
BTW: Whatever "managed" ends up being needs to be described somewhere either prior to this or after this, but knowing how the connection between libvirt and qemu's process is going to happen will be really helpful... Something that I would think would be described in patch 1, but again that's just me. This whole managed thing is really confusing the way it's written. May make sense to you, but it doesn't make sense to me, just saying.
Fair enough. I'll add something.
+ + } else { + if (virAsprintf(&prAlias, "pr-helper-%s", disk->dst) < 0) + return -1;
Or using the 'disk->info.alias' like other consumers.
Okay. I though that disk->dst would be easier to find on cmd line but on the other hand, grepping cmd line for disk->info.alias shows every argument that has something to do with the disk. Michal

Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 88 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index dffc4c30e..45205fd03 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2540,6 +2540,92 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, } +static int +qemuStorageSourcePrivateDataParsePR(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = NULL; + int rc; + int ret = -1; + + if ((rc = virXPathBoolean("boolean(./prd)", ctxt)) == 0) { + return 0; + } else if (rc < 0) { + return ret; + } + + if (VIR_ALLOC(prd) < 0) + goto cleanup; + + if (!(prd->alias = virXPathString("string(./prd/alias)", ctxt)) || + !(prd->path = virXPathString("string(./prd/path)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed <prd/>")); + goto cleanup; + } + + VIR_STEAL_PTR(srcPriv->prd, prd); + ret = 0; + cleanup: + qemuDomainDiskPRDFree(prd); + return ret; +} + + +static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd; + + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path); + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</prd>\n"); + return 0; +} + + +static int +qemuStorageSourcePrivateDataParse(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + + if (virStorageSourcePrivateDataParseRelPath(ctxt, src) < 0) + return -1; + + if (qemuStorageSourcePrivateDataParsePR(ctxt, src) < 0) + return -1; + + return 0; +} + + +static int +qemuStorageSourcePrivateDataFormat(virStorageSourcePtr src, + virBufferPtr buf) +{ + if (virStorageSourcePrivateDataFormatRelPath(src, buf) < 0) + return -1; + + if (qemuStorageSourcePrivateDataFormatPR(src, buf) < 0) + return -1; + + return 0; +} + + virDomainXMLPrivateDataCallbacks virQEMUDriverPrivateDataCallbacks = { .alloc = qemuDomainObjPrivateAlloc, .free = qemuDomainObjPrivateFree, @@ -2548,8 +2634,8 @@ virDomainXMLPrivateDataCallbacks virQEMUDriverPrivateDataCallbacks = { .chrSourceNew = qemuDomainChrSourcePrivateNew, .parse = qemuDomainObjPrivateXMLParse, .format = qemuDomainObjPrivateXMLFormat, - .storageParse = virStorageSourcePrivateDataParseRelPath, - .storageFormat = virStorageSourcePrivateDataFormatRelPath, + .storageParse = qemuStorageSourcePrivateDataParse, + .storageFormat = qemuStorageSourcePrivateDataFormat, }; -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 88 insertions(+), 2 deletions(-)
Something that dawned on my this morning (sorry ;-)) - the ->alias could easily be generated at reconnect time too.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index dffc4c30e..45205fd03 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2540,6 +2540,92 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, }
+static int +qemuStorageSourcePrivateDataParsePR(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = NULL; + int rc; + int ret = -1; + + if ((rc = virXPathBoolean("boolean(./prd)", ctxt)) == 0) { ^^ Extra space above
+ return 0; + } else if (rc < 0) { + return ret; + } + + if (VIR_ALLOC(prd) < 0) + goto cleanup;
return ret works too since prd == NULL
+ + if (!(prd->alias = virXPathString("string(./prd/alias)", ctxt)) || + !(prd->path = virXPathString("string(./prd/path)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed <prd/>")); + goto cleanup; + } + + VIR_STEAL_PTR(srcPriv->prd, prd); + ret = 0; + cleanup: + qemuDomainDiskPRDFree(prd); + return ret; +} + + +static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd;
Does saving the information really "matter" in whatever case it is that uses a 'static' alias and path? IOW: Should we save some sort of boolean to indicate PR was desired and then if path is also provided, we can use that path; otherwise, use the 'static' path when we try to reconnect the socket.
+ + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path);
alias and path could be attributes of prd too rather than elements on their own, but that's just your implementation detail... IDC, but figured I'd note it anyway. John
+ virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</prd>\n"); + return 0; +} + + +static int +qemuStorageSourcePrivateDataParse(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + if (!(src->privateData = qemuDomainStorageSourcePrivateNew())) + return -1; + + if (virStorageSourcePrivateDataParseRelPath(ctxt, src) < 0) + return -1; + + if (qemuStorageSourcePrivateDataParsePR(ctxt, src) < 0) + return -1; + + return 0; +} + + +static int +qemuStorageSourcePrivateDataFormat(virStorageSourcePtr src, + virBufferPtr buf) +{ + if (virStorageSourcePrivateDataFormatRelPath(src, buf) < 0) + return -1; + + if (qemuStorageSourcePrivateDataFormatPR(src, buf) < 0) + return -1; + + return 0; +} + + virDomainXMLPrivateDataCallbacks virQEMUDriverPrivateDataCallbacks = { .alloc = qemuDomainObjPrivateAlloc, .free = qemuDomainObjPrivateFree, @@ -2548,8 +2634,8 @@ virDomainXMLPrivateDataCallbacks virQEMUDriverPrivateDataCallbacks = { .chrSourceNew = qemuDomainChrSourcePrivateNew, .parse = qemuDomainObjPrivateXMLParse, .format = qemuDomainObjPrivateXMLFormat, - .storageParse = virStorageSourcePrivateDataParseRelPath, - .storageFormat = virStorageSourcePrivateDataFormatRelPath, + .storageParse = qemuStorageSourcePrivateDataParse, + .storageFormat = qemuStorageSourcePrivateDataFormat, };

On 03/02/2018 01:50 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 88 insertions(+), 2 deletions(-)
Something that dawned on my this morning (sorry ;-)) - the ->alias could easily be generated at reconnect time too.
Sure, but then we can end up in similar situation like with private paths for domain. I mean, we did not use to store them in status XML so now, whenever we are parsing one we have to see if one is stored there or generate the old one. Luckily there were just two possible options. Just imagine if there were three. We'd have no idea which one to generate. Long story short, I really prefer to store whatever might change in the status XML.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index dffc4c30e..45205fd03 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2540,6 +2540,92 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, }
+static int +qemuStorageSourcePrivateDataParsePR(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = NULL; + int rc; + int ret = -1; + + if ((rc = virXPathBoolean("boolean(./prd)", ctxt)) == 0) { ^^ Extra space above
+ return 0; + } else if (rc < 0) { + return ret; + } + + if (VIR_ALLOC(prd) < 0) + goto cleanup;
return ret works too since prd == NULL
+ + if (!(prd->alias = virXPathString("string(./prd/alias)", ctxt)) || + !(prd->path = virXPathString("string(./prd/path)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed <prd/>")); + goto cleanup; + } + + VIR_STEAL_PTR(srcPriv->prd, prd); + ret = 0; + cleanup: + qemuDomainDiskPRDFree(prd); + return ret; +} + + +static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd;
Does saving the information really "matter" in whatever case it is that uses a 'static' alias and path? IOW: Should we save some sort of boolean to indicate PR was desired and then if path is also provided, we can use that path; otherwise, use the 'static' path when we try to reconnect the socket.
I don't think we need that. This information is stored in the disk source XML: @managed == yes/no. Because of patch 02/12 we are guaranteed it will not change on migration/restore. Also, I don't see any value in having an managed pr-helper and making it unmanaged all of a sudden. Or vice versa. I expect users to use @managed='yes' prevalently.
+ + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path);
alias and path could be attributes of prd too rather than elements on their own, but that's just your implementation detail... IDC, but figured I'd note it anyway.
Yeah. Unless something is clear yes/no value (like @managed/@enabled) I tend to put knobs like these into separate elements as it is more extendable should we need it in the future IMO. But I don't have much strong opinion on that either. Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 01:50 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Now that we generate pr-manager alias and socket path store them in status XML so that they are preserved across daemon restarts.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 88 insertions(+), 2 deletions(-)
Something that dawned on my this morning (sorry ;-)) - the ->alias could easily be generated at reconnect time too.
Sure, but then we can end up in similar situation like with private paths for domain. I mean, we did not use to store them in status XML so now, whenever we are parsing one we have to see if one is stored there or generate the old one. Luckily there were just two possible options. Just imagine if there were three. We'd have no idea which one to generate.
Long story short, I really prefer to store whatever might change in the status XML.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index dffc4c30e..45205fd03 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2540,6 +2540,92 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, }
+static int +qemuStorageSourcePrivateDataParsePR(xmlXPathContextPtr ctxt, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = NULL; + int rc; + int ret = -1; + + if ((rc = virXPathBoolean("boolean(./prd)", ctxt)) == 0) { ^^ Extra space above
+ return 0; + } else if (rc < 0) { + return ret; + } + + if (VIR_ALLOC(prd) < 0) + goto cleanup;
return ret works too since prd == NULL
+ + if (!(prd->alias = virXPathString("string(./prd/alias)", ctxt)) || + !(prd->path = virXPathString("string(./prd/path)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed <prd/>")); + goto cleanup; + } + + VIR_STEAL_PTR(srcPriv->prd, prd); + ret = 0; + cleanup: + qemuDomainDiskPRDFree(prd); + return ret; +} + + +static int +qemuStorageSourcePrivateDataFormatPR(virStorageSourcePtr src, + virBufferPtr buf) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd; + + if (!srcPriv || !srcPriv->prd) + return 0; + + prd = srcPriv->prd;
Does saving the information really "matter" in whatever case it is that uses a 'static' alias and path? IOW: Should we save some sort of boolean to indicate PR was desired and then if path is also provided, we can use that path; otherwise, use the 'static' path when we try to reconnect the socket.
I don't think we need that. This information is stored in the disk source XML: @managed == yes/no. Because of patch 02/12 we are guaranteed it will not change on migration/restore. Also, I don't see any value in having an managed pr-helper and making it unmanaged all of a sudden. Or vice versa. I expect users to use @managed='yes' prevalently.
There's something subtle that I'm missing with this code... maybe it's just not fully comprehending the two modes/paths that are being added. Are we making things too complicated.
+ + virBufferAddLit(buf, "<prd>\n"); + virBufferAdjustIndent(buf, 2); + virBufferAsprintf(buf, "<alias>%s</alias>\n", prd->alias); + virBufferAsprintf(buf, "<path>%s</path>\n", prd->path);
alias and path could be attributes of prd too rather than elements on their own, but that's just your implementation detail... IDC, but figured I'd note it anyway.
Yeah. Unless something is clear yes/no value (like @managed/@enabled) I tend to put knobs like these into separate elements as it is more extendable should we need it in the future IMO. But I don't have much strong opinion on that either.
IDC either - to some degree it's the author's choice... Trying to stay as close as possible to other elements. You may be "impacted" w/r/t Peter's patches changing the status file parse/format tests. John

This is the easier part. All we need to do here is put -object pr-manger-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manger=$alias. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 40 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++++++++++++++ .../disk-virtio-scsi-reservations.args | 29 ++++++++++++++++ tests/qemuxml2argvtest.c | 8 +++++ 4 files changed, 105 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..069d60d35 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) } +static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,28 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, } +static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf); + } +} + + /** * qemuBuildCommandLineValidate: * @@ -9941,6 +9979,8 @@ qemuBuildCommandLine(virQEMUDriverPtr driver, if (qemuBuildMasterKeyCommandLine(cmd, priv) < 0) goto error; + qemuBuildMasterPRCommandLine(cmd, def); + if (enableFips) virCommandAddArg(cmd, "-enable-fips"); diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args new file mode 100644 index 000000000..387432c18 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args @@ -0,0 +1,28 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper-sdb,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args new file mode 100644 index 000000000..2dc53cf05 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper0,\ +path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper0,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 688846b9b..429cdf079 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -2966,6 +2966,14 @@ mymain(void) QEMU_CAPS_HDA_DUPLEX); DO_TEST("user-aliases2", QEMU_CAPS_DEVICE_IOH3420, QEMU_CAPS_ICH9_AHCI); + DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakerootdir); -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is the easier part. All we need to do here is put -object pr-manger-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manger=$alias.
s/manger/manager/ My fingers usually the same way though as manger
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 40 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++++++++++++++ .../disk-virtio-scsi-reservations.args | 29 ++++++++++++++++ tests/qemuxml2argvtest.c | 8 +++++ 4 files changed, 105 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..069d60d35 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) }
+static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk,
if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,28 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, }
+static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf);
What happens when there's more than one disk using the managed mode where you have a "static" alias and path, wouldn't there be multiple lines with: -object pr-manager-helper,id=pr-helper0,path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock ? And how is QEMU going to react to that? IOW: Shouldn't this code know it's already created an object for that case and not generate another one? The other one : -object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock I get, but I'm still not thrilled with "sdb" as opposed to the disk->info.alias of "scsi0-0-0-0" more commonly used. IIRC, there's no guarantee that what libvirt calls "sdb" ends up being "sdb" on the guest. My vague recollection of the algorithm that "automagically" generates the address based on sda, sdb, sdc, etc. So "sdb" IIRC would related to an address that would create an alias using "0-0-1"; whereas, "sda" would create that "0-0-0" value. The fact that you've defined the <address> and <target> originally avoids the virDomainDiskDefAssignAddress algorithm. Also see the virDiskNameToBusDeviceIndex as well as the virDomainHostdevAssignAddress (and code around that) in order to see the awfulness of which I write. The real fun begins when you have <disk>'s and <hostdev>'s. BTW: Seeing this and thinking about the command line jiggles a memory thread related to virStorageSourceParseBacking* and related tests where the code can handle various JSON outputs where it's not clear to me whether you'll need to add tests for the "file.pr-manager" processing. I think you might, but Peter understands more of that than I do. John
+ } +} + + /** * qemuBuildCommandLineValidate: * @@ -9941,6 +9979,8 @@ qemuBuildCommandLine(virQEMUDriverPtr driver, if (qemuBuildMasterKeyCommandLine(cmd, priv) < 0) goto error;
+ qemuBuildMasterPRCommandLine(cmd, def); + if (enableFips) virCommandAddArg(cmd, "-enable-fips");
diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args new file mode 100644 index 000000000..387432c18 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args @@ -0,0 +1,28 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper-sdb,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args new file mode 100644 index 000000000..2dc53cf05 --- /dev/null +++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i686 \ +-name QEMUGuest1 \ +-S \ +-object pr-manager-helper,id=pr-helper0,\ +path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock \ +-M pc \ +-m 214 \ +-smp 8,sockets=8,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=readline \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest1,file.pr-manager=pr-helper0,format=raw,\ +if=none,id=drive-scsi0-0-0-0,boot=on \ +-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ +drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 688846b9b..429cdf079 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -2966,6 +2966,14 @@ mymain(void) QEMU_CAPS_HDA_DUPLEX); DO_TEST("user-aliases2", QEMU_CAPS_DEVICE_IOH3420, QEMU_CAPS_ICH9_AHCI);
+ DO_TEST("disk-virtio-scsi-reservations", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + + DO_TEST("disk-virtio-scsi-reservations-not-managed", + QEMU_CAPS_DRIVE_BOOT, QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_SCSI_BLOCK, QEMU_CAPS_PR_MANAGER_HELPER); + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakerootdir);

On Fri, Mar 02, 2018 at 08:22:32 -0500, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is the easier part. All we need to do here is put -object pr-manger-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manger=$alias.
s/manger/manager/
My fingers usually the same way though as manger
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 40 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++++++++++++++ .../disk-virtio-scsi-reservations.args | 29 ++++++++++++++++ tests/qemuxml2argvtest.c | 8 +++++ 4 files changed, 105 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..069d60d35 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) }
+static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk,
if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,28 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, }
+static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf);
What happens when there's more than one disk using the managed mode where you have a "static" alias and path, wouldn't there be multiple lines with:
-object pr-manager-helper,id=pr-helper0,path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
? And how is QEMU going to react to that?
IOW: Shouldn't this code know it's already created an object for that case and not generate another one?
The other one :
-object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock
I get, but I'm still not thrilled with "sdb" as opposed to the disk->info.alias of "scsi0-0-0-0" more commonly used. IIRC, there's no guarantee that what libvirt calls "sdb" ends up being "sdb" on the guest. My vague recollection of the algorithm that "automagically" generates the address based on sda, sdb, sdc, etc. So "sdb" IIRC would related to an address that would create an alias using "0-0-1"; whereas, "sda" would create that "0-0-0" value.
The fact that you've defined the <address> and <target> originally avoids the virDomainDiskDefAssignAddress algorithm. Also see the virDiskNameToBusDeviceIndex as well as the virDomainHostdevAssignAddress (and code around that) in order to see the awfulness of which I write. The real fun begins when you have <disk>'s and <hostdev>'s.
BTW: Seeing this and thinking about the command line jiggles a memory thread related to virStorageSourceParseBacking* and related tests where the code can handle various JSON outputs where it's not clear to me whether you'll need to add tests for the "file.pr-manager" processing. I think you might, but Peter understands more of that than I do.
No, this should not be handled in JSON at all. Referencing an alias in JSON is wrong since it is tied to the one single run of the VM where that would be created and could be something completely different in any other run of the VM

On 03/02/2018 02:22 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is the easier part. All we need to do here is put -object pr-manger-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manger=$alias.
s/manger/manager/
My fingers usually the same way though as manger
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 40 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++++++++++++++ .../disk-virtio-scsi-reservations.args | 29 ++++++++++++++++ tests/qemuxml2argvtest.c | 8 +++++ 4 files changed, 105 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..069d60d35 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) }
+static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk,
if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,28 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, }
+static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf);
What happens when there's more than one disk using the managed mode where you have a "static" alias and path, wouldn't there be multiple lines with:
-object pr-manager-helper,id=pr-helper0,path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
Ah sure.
? And how is QEMU going to react to that?
IOW: Shouldn't this code know it's already created an object for that case and not generate another one?
The other one :
-object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock
I get, but I'm still not thrilled with "sdb" as opposed to the disk->info.alias of "scsi0-0-0-0" more commonly used. IIRC, there's no guarantee that what libvirt calls "sdb" ends up being "sdb" on the guest. My vague recollection of the algorithm that "automagically" generates the address based on sda, sdb, sdc, etc. So "sdb" IIRC would related to an address that would create an alias using "0-0-1"; whereas, "sda" would create that "0-0-0" value.
Yes. I've changed it in the other patch so now 'pr-helper-scsi0-0-0-0' is generated. Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 02:22 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
This is the easier part. All we need to do here is put -object pr-manger-helper,id=$alias,path=$socketPath and then just reference the object in -drive file.pr-manger=$alias.
s/manger/manager/
My fingers usually the same way though as manger
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 40 ++++++++++++++++++++++ .../disk-virtio-scsi-reservations-not-managed.args | 28 +++++++++++++++ .../disk-virtio-scsi-reservations.args | 29 ++++++++++++++++ tests/qemuxml2argvtest.c | 8 +++++ 4 files changed, 105 insertions(+) create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.args create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c..069d60d35 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1514,6 +1514,20 @@ qemuDiskSourceGetProps(virStorageSourcePtr src) }
+static void +qemuBuildDriveSourcePR(virBufferPtr buf, + virStorageSourcePtr src) +{ + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + + if (!prd || !prd->alias) + return; + + virBufferAsprintf(buf, ",file.pr-manager=%s", prd->alias); +} + + static int qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virQEMUCapsPtr qemuCaps, @@ -1590,6 +1604,8 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk,
if (disk->src->debug) virBufferAsprintf(buf, ",file.debug=%d", disk->src->debugLevel); + + qemuBuildDriveSourcePR(buf, disk->src); } else { if (!(source = virQEMUBuildDriveCommandlineFromJSON(srcprops))) goto cleanup; @@ -9789,6 +9805,28 @@ qemuBuildPanicCommandLine(virCommandPtr cmd, }
+static void +qemuBuildMasterPRCommandLine(virCommandPtr cmd, + const virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + const virDomainDiskDef *disk = def->disks[i]; + qemuDomainStorageSourcePrivatePtr srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + qemuDomainDiskPRDPtr prd = srcPriv->prd; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!prd || !prd->alias) + continue; + + virBufferAsprintf(&buf, "pr-manager-helper,id=%s,path=%s", prd->alias, prd->path); + virCommandAddArg(cmd, "-object"); + virCommandAddArgBuffer(cmd, &buf);
What happens when there's more than one disk using the managed mode where you have a "static" alias and path, wouldn't there be multiple lines with:
-object pr-manager-helper,id=pr-helper0,path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
Ah sure.
So having multiple "id=pr-helper0" is OK by QEMU? OK, then. John
? And how is QEMU going to react to that?
IOW: Shouldn't this code know it's already created an object for that case and not generate another one?
The other one :
-object pr-manager-helper,id=pr-helper-sdb,path=/path/to/qemu-pr-helper.sock
I get, but I'm still not thrilled with "sdb" as opposed to the disk->info.alias of "scsi0-0-0-0" more commonly used. IIRC, there's no guarantee that what libvirt calls "sdb" ends up being "sdb" on the guest. My vague recollection of the algorithm that "automagically" generates the address based on sda, sdb, sdc, etc. So "sdb" IIRC would related to an address that would create an alias using "0-0-1"; whereas, "sda" would create that "0-0-0" value.
Yes. I've changed it in the other patch so now 'pr-helper-scsi0-0-0-0' is generated.
Michal

On 03/08/2018 01:18 AM, John Ferlan wrote:
On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 02:22 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
What happens when there's more than one disk using the managed mode where you have a "static" alias and path, wouldn't there be multiple lines with:
-object pr-manager-helper,id=pr-helper0,path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
Ah sure.
So having multiple "id=pr-helper0" is OK by QEMU? OK, then.
No it's not. I've fixed this locally and I'll post v3 soon. Michal

Just like we allow users overriding path to bridge-helper detected at compile time we can allow them to override path to qemu-pr-helper. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- m4/virt-driver-qemu.m4 | 5 +++++ src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 ++++ src/qemu/qemu_conf.c | 7 ++++++- src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 6 files changed, 18 insertions(+), 1 deletion(-) diff --git a/m4/virt-driver-qemu.m4 b/m4/virt-driver-qemu.m4 index b9bafdab9..0e0f261d6 100644 --- a/m4/virt-driver-qemu.m4 +++ b/m4/virt-driver-qemu.m4 @@ -57,6 +57,11 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_QEMU], [ [/usr/libexec:/usr/lib/qemu:/usr/lib]) AC_DEFINE_UNQUOTED([QEMU_BRIDGE_HELPER], ["$QEMU_BRIDGE_HELPER"], [QEMU bridge helper]) + AC_PATH_PROG([QEMU_PR_HELPER], [qemu-pr-helper], + [/usr/libexec/qemu-pr-helper], + [/usr/libexec:/usr/lib/qemu:/usr/lib]) + AC_DEFINE_UNQUOTED([QEMU_PR_HELPER], ["$QEMU_PR_HELPER"], + [QEMU PR helper]) ]) AC_DEFUN([LIBVIRT_DRIVER_RESULT_QEMU], [ diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index c19bf3a43..2dc16e91f 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -86,6 +86,7 @@ module Libvirtd_qemu = let process_entry = str_entry "hugetlbfs_mount" | bool_entry "clear_emulator_capabilities" | str_entry "bridge_helper" + | str_entry "pr_helper" | bool_entry "set_process_name" | int_entry "max_processes" | int_entry "max_files" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 43dd561cc..4bc668406 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -775,3 +775,7 @@ # This directory is used for memoryBacking source if configured as file. # NOTE: big files will be stored here #memory_backing_dir = "/var/lib/libvirt/qemu/ram" + +# Path to the SCSI persistent reservations helper. This helper is +# used whenever <reservations/> are enabled for SCSI disks. +#pr_helper = "/usr/libexec/qemu-pr-helper" diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 36cf3a281..8c69dbe75 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -307,7 +307,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) goto error; } - if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0) + if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0 || + VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0) goto error; cfg->clearEmulatorCapabilities = true; @@ -392,6 +393,7 @@ static void virQEMUDriverConfigDispose(void *obj) } VIR_FREE(cfg->hugetlbfs); VIR_FREE(cfg->bridgeHelperName); + VIR_FREE(cfg->prHelperName); VIR_FREE(cfg->saveImageFormat); VIR_FREE(cfg->dumpImageFormat); @@ -759,6 +761,9 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, if (virConfGetValueString(conf, "bridge_helper", &cfg->bridgeHelperName) < 0) goto cleanup; + if (virConfGetValueString(conf, "pr_helper", &cfg->prHelperName) < 0) + goto cleanup; + if (virConfGetValueBool(conf, "mac_filter", &cfg->macFilter) < 0) goto cleanup; diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index 83fd45282..5baa3af5d 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -153,6 +153,7 @@ struct _virQEMUDriverConfig { size_t nhugetlbfs; char *bridgeHelperName; + char *prHelperName; bool macFilter; diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index 688e5b9fd..c0efae47b 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -100,3 +100,4 @@ module Test_libvirtd_qemu = { "1" = "mount" } } { "memory_backing_dir" = "/var/lib/libvirt/qemu/ram" } +{ "pr_helper" = "/usr/libexec/qemu-pr-helper" } -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Just like we allow users overriding path to bridge-helper detected at compile time we can allow them to override path to qemu-pr-helper.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- m4/virt-driver-qemu.m4 | 5 +++++ src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 ++++ src/qemu/qemu_conf.c | 7 ++++++- src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 6 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/m4/virt-driver-qemu.m4 b/m4/virt-driver-qemu.m4 index b9bafdab9..0e0f261d6 100644 --- a/m4/virt-driver-qemu.m4 +++ b/m4/virt-driver-qemu.m4 @@ -57,6 +57,11 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_QEMU], [ [/usr/libexec:/usr/lib/qemu:/usr/lib]) AC_DEFINE_UNQUOTED([QEMU_BRIDGE_HELPER], ["$QEMU_BRIDGE_HELPER"], [QEMU bridge helper]) + AC_PATH_PROG([QEMU_PR_HELPER], [qemu-pr-helper], + [/usr/libexec/qemu-pr-helper], + [/usr/libexec:/usr/lib/qemu:/usr/lib]) + AC_DEFINE_UNQUOTED([QEMU_PR_HELPER], ["$QEMU_PR_HELPER"], + [QEMU PR helper]) ])
AC_DEFUN([LIBVIRT_DRIVER_RESULT_QEMU], [ diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index c19bf3a43..2dc16e91f 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -86,6 +86,7 @@ module Libvirtd_qemu = let process_entry = str_entry "hugetlbfs_mount" | bool_entry "clear_emulator_capabilities" | str_entry "bridge_helper" + | str_entry "pr_helper" | bool_entry "set_process_name" | int_entry "max_processes" | int_entry "max_files" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 43dd561cc..4bc668406 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -775,3 +775,7 @@ # This directory is used for memoryBacking source if configured as file. # NOTE: big files will be stored here #memory_backing_dir = "/var/lib/libvirt/qemu/ram" + +# Path to the SCSI persistent reservations helper. This helper is +# used whenever <reservations/> are enabled for SCSI disks. +#pr_helper = "/usr/libexec/qemu-pr-helper"
So "how" would I know as a user/admin whether or not I *needed* to uncomment this? or how it's used. From how I read bridge_helper it's there because "This executable is used to create <source type='bridge'> interfaces when libvirtd is running unprivileged." - So then the question becomes does PR make sense or is it useful for the running unprivileged case? In fact, IIRC, using PR requires a bit of special permissions and certain capabilities anyway (related to sg_io or rawio). So is having this available in qemu.conf "required"? I cannot imagine the name of the image changing. Are there any other qualifiers than '-k' that may be useful to turn on in the pr helper...
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 36cf3a281..8c69dbe75 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -307,7 +307,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) goto error; }
- if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0) + if (VIR_STRDUP(cfg->bridgeHelperName, QEMU_BRIDGE_HELPER) < 0 || + VIR_STRDUP(cfg->prHelperName, QEMU_PR_HELPER) < 0) goto error;
cfg->clearEmulatorCapabilities = true; @@ -392,6 +393,7 @@ static void virQEMUDriverConfigDispose(void *obj) } VIR_FREE(cfg->hugetlbfs); VIR_FREE(cfg->bridgeHelperName); + VIR_FREE(cfg->prHelperName);
VIR_FREE(cfg->saveImageFormat); VIR_FREE(cfg->dumpImageFormat); @@ -759,6 +761,9 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, if (virConfGetValueString(conf, "bridge_helper", &cfg->bridgeHelperName) < 0) goto cleanup;
+ if (virConfGetValueString(conf, "pr_helper", &cfg->prHelperName) < 0) + goto cleanup; + if (virConfGetValueBool(conf, "mac_filter", &cfg->macFilter) < 0) goto cleanup;
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index 83fd45282..5baa3af5d 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -153,6 +153,7 @@ struct _virQEMUDriverConfig { size_t nhugetlbfs;
char *bridgeHelperName; + char *prHelperName;
bool macFilter;
diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index 688e5b9fd..c0efae47b 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -100,3 +100,4 @@ module Test_libvirtd_qemu = { "1" = "mount" } } { "memory_backing_dir" = "/var/lib/libvirt/qemu/ram" } +{ "pr_helper" = "/usr/libexec/qemu-pr-helper" }
I'm kind of surprised this passed the testing... I have this recollection of these aug files being really particular about the order and I would "expect" the "pr_helper" to be closer to the "bridge_helper". Ahh, I see, the placement in qemu.conf at the end is what allows this to work as it is... John

On 03/02/2018 03:12 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Just like we allow users overriding path to bridge-helper detected at compile time we can allow them to override path to qemu-pr-helper.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- m4/virt-driver-qemu.m4 | 5 +++++ src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 4 ++++ src/qemu/qemu_conf.c | 7 ++++++- src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 6 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/m4/virt-driver-qemu.m4 b/m4/virt-driver-qemu.m4 index b9bafdab9..0e0f261d6 100644 --- a/m4/virt-driver-qemu.m4 +++ b/m4/virt-driver-qemu.m4 @@ -57,6 +57,11 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_QEMU], [ [/usr/libexec:/usr/lib/qemu:/usr/lib]) AC_DEFINE_UNQUOTED([QEMU_BRIDGE_HELPER], ["$QEMU_BRIDGE_HELPER"], [QEMU bridge helper]) + AC_PATH_PROG([QEMU_PR_HELPER], [qemu-pr-helper], + [/usr/libexec/qemu-pr-helper], + [/usr/libexec:/usr/lib/qemu:/usr/lib]) + AC_DEFINE_UNQUOTED([QEMU_PR_HELPER], ["$QEMU_PR_HELPER"], + [QEMU PR helper]) ])
AC_DEFUN([LIBVIRT_DRIVER_RESULT_QEMU], [ diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index c19bf3a43..2dc16e91f 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -86,6 +86,7 @@ module Libvirtd_qemu = let process_entry = str_entry "hugetlbfs_mount" | bool_entry "clear_emulator_capabilities" | str_entry "bridge_helper" + | str_entry "pr_helper" | bool_entry "set_process_name" | int_entry "max_processes" | int_entry "max_files" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 43dd561cc..4bc668406 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -775,3 +775,7 @@ # This directory is used for memoryBacking source if configured as file. # NOTE: big files will be stored here #memory_backing_dir = "/var/lib/libvirt/qemu/ram" + +# Path to the SCSI persistent reservations helper. This helper is +# used whenever <reservations/> are enabled for SCSI disks. +#pr_helper = "/usr/libexec/qemu-pr-helper"
So "how" would I know as a user/admin whether or not I *needed* to uncomment this? or how it's used. From how I read bridge_helper it's there because "This executable is used to create <source type='bridge'> interfaces when libvirtd is running unprivileged." - So then the question becomes does PR make sense or is it useful for the running unprivileged case? In fact, IIRC, using PR requires a bit of special permissions and certain capabilities anyway (related to sg_io or rawio). So is having this available in qemu.conf "required"? I cannot imagine the name of the image changing.
The qemu-pr-helper binary is supposed to be suid. Just like the bridge-helper. You know you need to uncomment this whenever you want to change the default (displayed in the comment). Just like anything else in the config file. The whole idea of PR is so that your qemu can run unprivileged, without rawio and all privileged operations are offloaded to the small, well understood pr-helper.
Are there any other qualifiers than '-k' that may be useful to turn on in the pr helper...
Maybe for the managed case we can use -u $user -g $group so that ph-helper changes UID/GID after it starts to that matching qemu process. But I don't think it is strictly needed now. It would be nice to have feature. Certainly not something we need to configure through config file. Michal

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 16 ++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ 2 files changed, 18 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 45205fd03..f7da62dba 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1864,6 +1864,7 @@ qemuDomainObjPrivateAlloc(void *opaque) priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; priv->driver = opaque; + priv->prPid = (pid_t) -1; return priv; @@ -1926,6 +1927,8 @@ qemuDomainObjPrivateDataClear(qemuDomainObjPrivatePtr priv) virBitmapFree(priv->migrationCaps); priv->migrationCaps = NULL; + + priv->prPid = (pid_t) -1; } @@ -2172,6 +2175,11 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1; + if (priv->prPid != (pid_t) -1) { + virBufferAsprintf(buf, "<prPid>%lld</prPid>\n", + (long long) priv->prPid); + } + return 0; } @@ -2318,6 +2326,7 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, xmlNodePtr *nodes = NULL; xmlNodePtr node = NULL; virQEMUCapsPtr qemuCaps = NULL; + long long prPid = -1; if (VIR_ALLOC(priv->monConfig) < 0) goto error; @@ -2524,6 +2533,13 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error; + if (virXPathLongLong("string(./prPid)", ctxt, &prPid) < -1) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse <prPid/>")); + goto error; + } + priv->prPid = (pid_t) prPid; + return 0; error: diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b9258eb8e..418b47153 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; + + pid_t prPid; }; # define QEMU_DOMAIN_PRIVATE(vm) \ -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 16 ++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ 2 files changed, 18 insertions(+)
This seems reasonable... Although given the next patch and usage of "daemon*s*" I'd almost expect this to be an array of prPid; otherwise, plurality in the subsequent patch really don't make sense. John
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 45205fd03..f7da62dba 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1864,6 +1864,7 @@ qemuDomainObjPrivateAlloc(void *opaque)
priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; priv->driver = opaque; + priv->prPid = (pid_t) -1;
return priv;
@@ -1926,6 +1927,8 @@ qemuDomainObjPrivateDataClear(qemuDomainObjPrivatePtr priv)
virBitmapFree(priv->migrationCaps); priv->migrationCaps = NULL; + + priv->prPid = (pid_t) -1; }
@@ -2172,6 +2175,11 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0) return -1;
+ if (priv->prPid != (pid_t) -1) { + virBufferAsprintf(buf, "<prPid>%lld</prPid>\n", + (long long) priv->prPid); + } + return 0; }
@@ -2318,6 +2326,7 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, xmlNodePtr *nodes = NULL; xmlNodePtr node = NULL; virQEMUCapsPtr qemuCaps = NULL; + long long prPid = -1;
if (VIR_ALLOC(priv->monConfig) < 0) goto error; @@ -2524,6 +2533,13 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0) goto error;
+ if (virXPathLongLong("string(./prPid)", ctxt, &prPid) < -1) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse <prPid/>")); + goto error; + } + priv->prPid = (pid_t) prPid; + return 0;
error: diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b9258eb8e..418b47153 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate { /* Migration capabilities. Rechecked on reconnect, not to be saved in * private XML. */ virBitmapPtr migrationCaps; + + pid_t prPid; };
# define QEMU_DOMAIN_PRIVATE(vm) \

Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one). The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 164 insertions(+) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index dc33eb7bc..33e0ad30c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,164 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, ret = 0; cleanup: virObjectUnref(caps); + + return ret; +} + + +static void +qemuProcessKillPRDaemons(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->prPid == (pid_t) -1) + return; + + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; +} + + +static int +qemuProcessSetupOnePRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret; + + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + + +static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup; + + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (unlink(pidfile) < 0 && + errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile); + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessSetupPRDaemons(virDomainObjPtr vm) +{ + size_t i; + int ret = -1; + + for (i = 0; i < vm->def->ndisks; i++) { + if (qemuProcessSetupOnePRDaemon(vm, vm->def->disks[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + if (ret < 0) + qemuProcessKillPRDaemons(vm); return ret; } @@ -6074,6 +6232,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup; + VIR_DEBUG("Setting up PR daemons"); + if (qemuProcessSetupPRDaemons(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6654,6 +6816,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); } + qemuProcessKillPRDaemons(vm); + qemuHostdevReAttachDomainDevices(driver, vm->def); def = vm->def; -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one).
"can be only one" Since there can only be one why do we bother w/ plurality?
The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Seems like a good note/comment in the code where qemuProcessSetupPRDaemons is called so that someone doesn't have to go back to the commit message in order to find out why the call was placed where it was.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 164 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index dc33eb7bc..33e0ad30c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,164 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, ret = 0; cleanup: virObjectUnref(caps); + + return ret; +} + + +static void +qemuProcessKillPRDaemons(virDomainObjPtr vm)
Daemon*s* is is a misnomer
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->prPid == (pid_t) -1) + return; + + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; +} + + +static int +qemuProcessSetupOnePRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret;
<sigh> another false positive for Coverity since it for some reason believes "virProcessGetNamespaces" allocates memory that is stored into "fds" when there's an error returned. Strangely, setting *nfdlist = 0 in the (ret < 0) part of cleanup: in virProcessGetNamespaces and going to cleanup here magically clears the error...
+ + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + +
If we returned: -1 error, 0 started, and 1 unnecessary, then our caller could be a bit smarter about needing to run through the whole ndisks list.
+static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0;
Ah... so this is where we ensure there is only ever one pr-helper... and this !managed usage still doesn't help my earlier confusion. In one mode we have: -object pr-manager-helper,id=pr-helper-sdb,\ path=/path/to/qemu-pr-helper.sock and the other we have: -object pr-manager-helper,id=pr-helper0,\ path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock But for "everything" (or both) we still have the qemu-pr-helper process. For both the alias/id is added to the -drive command line using "file.pr-manager=$alias". In one mode we would start the qemu-pr-helper, but in the other we're not. So, if none of the lun's used the second mode, then what do they connect to? Furthermore, the managed mode helps decide which alias to use and of course which socket will be used. With the alias, I had the earlier question/confusion over how many objects would be created using the "other" mode in the above examples since "theoretically speaking" of course the id= should be unique. Then the path is just a client path to the socket which in the former mode is user defined and the latter mode is static or the same for every lun using that mode. So can we have more than one lun using the same client socket path? I don't know if the above helps explain my confusion or makes it even harder to understand. I hope it helps.
+ + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup; + + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (unlink(pidfile) < 0 &&
I thought I pointed this out in the previous series, but Coverity gets grumpy here since @pidfile could be NULL if we jump to cleanup before it's created when virFileIsExecutable of prHelperName fails.
+ errno != ENOENT && + !virGetLastError()) + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + VIR_FREE(pidfile); + virObjectUnref(cfg); + return ret; +} + + +static int +qemuProcessSetupPRDaemons(virDomainObjPtr vm)
Daemon*s* is a misnomer since only one is created.
+{ + size_t i; + int ret = -1; + + for (i = 0; i < vm->def->ndisks; i++) { + if (qemuProcessSetupOnePRDaemon(vm, vm->def->disks[i]) < 0)
Failure of the above means virProcessKillPainfully was called; however, successful completion means we could stop going through the loop at least for now, although that could change in a couple of patches, but just putting the thought out there.
+ goto cleanup; + } + + ret = 0; + cleanup: + if (ret < 0) + qemuProcessKillPRDaemons(vm); return ret; }
@@ -6074,6 +6232,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessResctrlCreate(driver, vm) < 0) goto cleanup;
+ VIR_DEBUG("Setting up PR daemons");
daemon*s* misnomer John
+ if (qemuProcessSetupPRDaemons(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6654,6 +6816,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); }
+ qemuProcessKillPRDaemons(vm); + qemuHostdevReAttachDomainDevices(driver, vm->def);
def = vm->def;

On 03/02/2018 04:54 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one).
"can be only one"
Since there can only be one why do we bother w/ plurality?
Yeah, this is leftover from v1 where the code was prepared for multiple managed helpers per one domain.
The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Seems like a good note/comment in the code where qemuProcessSetupPRDaemons is called so that someone doesn't have to go back to the commit message in order to find out why the call was placed where it was.
Okay, I will add it. But we need some rules on this because sometimes I add a comment and reviewer says putting the info in the commit message is enough. Or vice versa.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 164 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index dc33eb7bc..33e0ad30c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,164 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, ret = 0; cleanup: virObjectUnref(caps); + + return ret; +} + + +static void +qemuProcessKillPRDaemons(virDomainObjPtr vm)
Daemon*s* is is a misnomer
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->prPid == (pid_t) -1) + return; + + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; +} + + +static int +qemuProcessSetupOnePRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret;
<sigh> another false positive for Coverity since it for some reason believes "virProcessGetNamespaces" allocates memory that is stored into "fds" when there's an error returned.
Strangely, setting *nfdlist = 0 in the (ret < 0) part of cleanup: in virProcessGetNamespaces and going to cleanup here magically clears the error...
+ + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + +
If we returned:
-1 error, 0 started, and 1 unnecessary, then our caller could be a bit smarter about needing to run through the whole ndisks list.
This might help if there're thousands of disks, but I guess since we're far from that it's unnecessary IMO. But I can change that if you want me to :-)
+static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0;
Ah... so this is where we ensure there is only ever one pr-helper... and this !managed usage still doesn't help my earlier confusion.
In one mode we have:
-object pr-manager-helper,id=pr-helper-sdb,\ path=/path/to/qemu-pr-helper.sock
and the other we have:
-object pr-manager-helper,id=pr-helper0,\ path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
But for "everything" (or both) we still have the qemu-pr-helper process.
Question is, who starts the process. Let's forget about PR for a while and take a look on <hostdev/>-s. The also have @managed attribute. Question there is who detaches the hostdev on domain startup. The command line is generated the same in both cases. But if @managed = 'yes' then libvirt takes the extra step and detaches the hostdev from host. Otherwise it relies on user (which is equal to admin for the sake of this conversation) that it detached the hostdev themselves.
For both the alias/id is added to the -drive command line using "file.pr-manager=$alias".
Sure. We have to.
In one mode we would start the qemu-pr-helper, but in the other we're not. So, if none of the lun's used the second mode, then what do they connect to?
To socket that's created by user spawned qemu-pr-helper. Alternatively there were some discussion whether systemd's socket activation should be implemented for qemu-pr-helper. In that case, it's going to be systemd who starts the process once qemu tries to connect.
Furthermore, the managed mode helps decide which alias to use and of course which socket will be used. With the alias, I had the earlier question/confusion over how many objects would be created using the "other" mode in the above examples since "theoretically speaking" of course the id= should be unique. Then the path is just a client path to the socket which in the former mode is user defined and the latter mode is static or the same for every lun using that mode. So can we have more than one lun using the same client socket path?
Theoretically we can. But why bother? We can try to find unique socket paths and add them just once. But then we have to: a) allocate an unique alias for pr-manager-helper, b) track use count on hot(un-)plug. IMO it only complicates things. But sure, in the long run we can switch to that approach.
I don't know if the above helps explain my confusion or makes it even harder to understand. I hope it helps.
+ + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup; + + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (unlink(pidfile) < 0 &&
I thought I pointed this out in the previous series, but Coverity gets grumpy here since @pidfile could be NULL if we jump to cleanup before it's created when virFileIsExecutable of prHelperName fails.
Ah okay. Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 04:54 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Before we exec() qemu we have to spawn pr-helper processes for all managed reservations (well, technically there can only one).
"can be only one"
Since there can only be one why do we bother w/ plurality?
Yeah, this is leftover from v1 where the code was prepared for multiple managed helpers per one domain.
The only caveat there is that we should place the process into the same namespace and cgroup as qemu (so that it shares the same view of the system). But we can do that only after we've forked. That means calling the setup function between fork() and exec().
Seems like a good note/comment in the code where qemuProcessSetupPRDaemons is called so that someone doesn't have to go back to the commit message in order to find out why the call was placed where it was.
Okay, I will add it. But we need some rules on this because sometimes I add a comment and reviewer says putting the info in the commit message is enough. Or vice versa.
Agreed - it would be nice, but it's reviewer dependent. I look at it this way - is the code really self documenting? Is there something being done that we should note in code comment rather than forcing a reader to "find" the commit that added the function (needle meet haystack) or the upstream review where even more relevant discussion may have taken place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 164 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index dc33eb7bc..33e0ad30c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2551,6 +2551,164 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, ret = 0; cleanup: virObjectUnref(caps); + + return ret; +} + + +static void +qemuProcessKillPRDaemons(virDomainObjPtr vm)
Daemon*s* is is a misnomer
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->prPid == (pid_t) -1) + return; + + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; +} + + +static int +qemuProcessSetupOnePRDaemonHook(void *opaque) +{ + virDomainObjPtr vm = opaque; + size_t i, nfds = 0; + int *fds = NULL; + int ret = -1; + + if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0) + return ret;
<sigh> another false positive for Coverity since it for some reason believes "virProcessGetNamespaces" allocates memory that is stored into "fds" when there's an error returned.
Strangely, setting *nfdlist = 0 in the (ret < 0) part of cleanup: in virProcessGetNamespaces and going to cleanup here magically clears the error...
+ + if (nfds > 0 && + virProcessSetNamespaces(nfds, fds) < 0) + goto cleanup; + + ret = 0; + cleanup: + for (i = 0; i < nfds; i++) + VIR_FORCE_CLOSE(fds[i]); + VIR_FREE(fds); + return ret; +} + +
If we returned:
-1 error, 0 started, and 1 unnecessary, then our caller could be a bit smarter about needing to run through the whole ndisks list.
This might help if there're thousands of disks, but I guess since we're far from that it's unnecessary IMO. But I can change that if you want me to :-)
You know what happens now that you've said that? Think about domain stats and excessive disks... Sure for a loop of < 50 probably no big deal and perhaps the norm is smaller disk guests, but then you run into the one that isn't and have to redo/rework things.
+static int +qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + virQEMUDriverConfigPtr cfg; + qemuDomainStorageSourcePrivatePtr srcPriv; + qemuDomainDiskPRDPtr prd; + char *pidfile = NULL; + pid_t cpid = -1; + virCommandPtr cmd = NULL; + virTimeBackOffVar timebackoff; + const unsigned long long timeout = 500000; /* ms */ + int ret = -1; + + if (priv->prPid != (pid_t) -1 || + !virStoragePRDefIsManaged(disk->src->pr)) + return 0;
Ah... so this is where we ensure there is only ever one pr-helper... and this !managed usage still doesn't help my earlier confusion.
In one mode we have:
-object pr-manager-helper,id=pr-helper-sdb,\ path=/path/to/qemu-pr-helper.sock
and the other we have:
-object pr-manager-helper,id=pr-helper0,\ path=/tmp/lib/domain--1-QEMUGuest1/pr-helper0.sock
But for "everything" (or both) we still have the qemu-pr-helper process.
Question is, who starts the process. Let's forget about PR for a while and take a look on <hostdev/>-s. The also have @managed attribute. Question there is who detaches the hostdev on domain startup. The command line is generated the same in both cases. But if @managed = 'yes' then libvirt takes the extra step and detaches the hostdev from host. Otherwise it relies on user (which is equal to admin for the sake of this conversation) that it detached the hostdev themselves.
This is the bit I'm really struggling with - who starts the process. To some degree does is matter who or what the process is if all we're providing is some functionality to either start and connect to or just connect to some process by some socket. I think I noted that before we're focused on pr-manager now, but what's next? Someone else will copy that code in qemu and ask libvirt to support it.
For both the alias/id is added to the -drive command line using "file.pr-manager=$alias".
Sure. We have to.
In one mode we would start the qemu-pr-helper, but in the other we're not. So, if none of the lun's used the second mode, then what do they connect to?
To socket that's created by user spawned qemu-pr-helper. Alternatively there were some discussion whether systemd's socket activation should be implemented for qemu-pr-helper. In that case, it's going to be systemd who starts the process once qemu tries to connect.
Oh sure, great, let's bring in systemd into the mix and really confuse me }-<... I'm glad you understand this - just look both ways before you cross the street please, we don't need you getting hit by a tram! John
Furthermore, the managed mode helps decide which alias to use and of course which socket will be used. With the alias, I had the earlier question/confusion over how many objects would be created using the "other" mode in the above examples since "theoretically speaking" of course the id= should be unique. Then the path is just a client path to the socket which in the former mode is user defined and the latter mode is static or the same for every lun using that mode. So can we have more than one lun using the same client socket path?
Theoretically we can. But why bother? We can try to find unique socket paths and add them just once. But then we have to:
a) allocate an unique alias for pr-manager-helper, b) track use count on hot(un-)plug.
IMO it only complicates things. But sure, in the long run we can switch to that approach.
I don't know if the above helps explain my confusion or makes it even harder to understand. I hope it helps.
+ + cfg = virQEMUDriverGetConfig(driver); + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prd = srcPriv->prd; + + if (!virFileIsExecutable(cfg->prHelperName)) { + virReportSystemError(errno, _("'%s' is not a suitable pr helper"), + cfg->prHelperName); + goto cleanup; + } + + if (!(pidfile = virPidFileBuildPath(cfg->stateDir, prd->alias))) + goto cleanup; + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Cannot remove stale PID file %s"), + pidfile); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(cfg->prHelperName, + "-k", prd->path, + NULL))) + goto cleanup; + + virCommandDaemonize(cmd); + virCommandSetPidFile(cmd, pidfile); + virCommandSetPreExecHook(cmd, qemuProcessSetupOnePRDaemonHook, vm); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (virPidFileReadPath(pidfile, &cpid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pr helper %s didn't show up"), prd->alias); + goto cleanup; + } + + if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + if (virFileExists(prd->path)) + break; + + if (virProcessKill(cpid, 0) == 0) + continue; + + virReportSystemError(errno, + _("pr helper %s died unexpectedly"), prd->alias); + goto cleanup; + } + + if (priv->cgroup && + virCgroupAddMachineTask(priv->cgroup, cpid) < 0) + goto cleanup; + + if (qemuSecurityDomainSetPathLabel(driver->securityManager, + vm->def, prd->path, true) < 0) + goto cleanup; + + priv->prPid = cpid; + ret = 0; + cleanup: + if (ret < 0) { + virCommandAbort(cmd); + virProcessKillPainfully(cpid, true); + } + virCommandFree(cmd); + if (unlink(pidfile) < 0 &&
I thought I pointed this out in the previous series, but Coverity gets grumpy here since @pidfile could be NULL if we jump to cleanup before it's created when virFileIsExecutable of prHelperName fails.
Ah okay.
Michal

Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f28006e3c..2ebb68fbc 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } +static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; + + if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; + + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup; + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm, + const char *alias, + const char *path) +{ + if (!alias) + return; + + qemuProcessKillPRDaemons(vm, path, false); +} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error; + if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error; @@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; } + if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } + if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, cleanup: virJSONValueFree(secobjProps); virJSONValueFree(encobjProps); + virJSONValueFree(prmgrProps); qemuDomainSecretDiskDestroy(disk); VIR_FREE(devstr); VIR_FREE(drivestr); @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + qemuDestroyPRDefObject(vm, prAlias, prPath); goto cleanup; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 33e0ad30c..3a914b6c6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, } -static void -qemuProcessKillPRDaemons(virDomainObjPtr vm) +void +qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force) { qemuDomainObjPrivatePtr priv = vm->privateData; + size_t nmanaged = 0; + size_t i; if (priv->prPid == (pid_t) -1) return; - virProcessKillPainfully(priv->prPid, true); - priv->prPid = (pid_t) -1; + for (i = 0; i < vm->def->ndisks; i++) { + qemuDomainStorageSourcePrivatePtr srcPriv; + virDomainDiskDefPtr disk = vm->def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + if (!socketPath) + socketPath = srcPriv->prd->path; + } + + if (force || nmanaged <= 1) { + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; + if (socketPath && + unlink(socketPath) < 0 && + errno != ENOENT) + VIR_WARN("Unable to remove pr helper socket %s", socketPath); + } } @@ -2593,7 +2617,7 @@ qemuProcessSetupOnePRDaemonHook(void *opaque) } -static int +int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, virDomainDiskDefPtr disk) { @@ -2708,7 +2732,7 @@ qemuProcessSetupPRDaemons(virDomainObjPtr vm) ret = 0; cleanup: if (ret < 0) - qemuProcessKillPRDaemons(vm); + qemuProcessKillPRDaemons(vm, NULL, true); return ret; } @@ -6816,7 +6840,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); } - qemuProcessKillPRDaemons(vm); + qemuProcessKillPRDaemons(vm, NULL, true); qemuHostdevReAttachDomainDevices(driver, vm->def); diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 274111567..ab09a7f30 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -203,4 +203,11 @@ int qemuProcessRefreshDisks(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); +int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk); + +void qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force); + #endif /* __QEMU_PROCESS_H__ */ -- 2.16.1

On Wed, Feb 21, 2018 at 07:11:35PM +0100, Michal Privoznik wrote:
Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f28006e3c..2ebb68fbc 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, }
+static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm, + virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; +
If !srcPriv->prd is NULL, you should not dereference it. Jan

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done.
But wasn't there a very special reason to start it between fork and exec? Does that still hold true? That is why can we start the daemon after exec now, but we couldn't before in the previous patch? It feels quite "disjoint" to have the unplug in a subsequent patch too. Considering them both in one just seems "better", but it's not a deal breaker.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f28006e3c..2ebb68fbc 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, }
+static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm,
qemuBuild? Why not qemuDomainAdd?
+ virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src);
ohh, umm, in qemuDomainAttachDiskGeneric there's a : srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); if (srcPriv) { secinfo = srcPriv->secinfo; encinfo = srcPriv->encinfo; } Which makes light dawn that it "was" possible for srcPriv == NULL and the "thought" that the subsequent deref is going to fail rather spectacularly. See commit '8056721cb' (and a couple of others). Although it seems patch 4 and your change to qemuDomainPrepareDiskSource to call qemuDomainStorageSourcePrivateNew instead of having it called in qemuDomainSecretStorageSourcePrepare would seem to ensure that disk source privateData is always allocated now - meaning a number of other places where srcPriv is/was checked are no longer necessary. Maybe that one change needs to be "extracted" out to make things obvious. Not required, but just thinking and typing thoughts.
+ + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; + + if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; +> + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup;> + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm,
qemuDestroy? Why not qemuDomainDel?
+ const char *alias, + const char *path) +{ + if (!alias) + return;
BTW: This is where I'd expect to remove the object and then...
+ + qemuProcessKillPRDaemons(vm, path, false);
Just unlink the path... See some thoughts below...
+} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error;
+ if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error;
@@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; }
+ if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } +
So for one of the managed modes (as noted much earlier) we could be creating a object with a duplicated alias - does that work? I thought id= has to be unique.
if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, cleanup: virJSONValueFree(secobjProps); virJSONValueFree(encobjProps); + virJSONValueFree(prmgrProps); qemuDomainSecretDiskDestroy(disk); VIR_FREE(devstr); VIR_FREE(drivestr); @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + qemuDestroyPRDefObject(vm, prAlias, prPath);
oh, I see, you're mixing the way TLS object tear down occurred and how other objects happen. If you're going to mimic TLS, then change qemuBuildPRDefInfoProps to be something like qemuDomainAddPRDefObject which would do the EnterMonitorAsync, Props mgmt, AddObject, and ExitMonitor. That way qemuDestroyPRDefObject changes to qemuDomainDelPRDefObject which would handle the failure path. Then I believe you don't have to pass around prAlias and prPath since they'd be "obtainable".
goto cleanup; }
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 33e0ad30c..3a914b6c6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
This subsequent hunk feels like it could have gone in for the previous patch. Or at the very least some function that would unlink the socket path for each of the sockets in ndisks that were created. Then that single unlink API gets exposed.
-static void -qemuProcessKillPRDaemons(virDomainObjPtr vm) +void +qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force)
The only time force == true is when socketPath == NULL... and that's only in the shutdown path. When socketPath != NULL, force == false, and we're going to unplug the socket, right? Perhaps this would be cleaner if only @socketPath was in play. If NULL, then walk the ndisks looking for paths that you'll unlink first before killing the daemon. I actually think having a separate unlink the socket would be just fine. Does it really matter if it's the "last" one to stop the helper daemon? All I can picture is some test that continually adds and removes one socket and that just thrashing (or eventually failing) because of the startup processing. IMO: Once you start the daemon because there was a lun wanting reservations - just keep it running even if the last possible consumer was unplugged from the guest. But yes, I do understand why you would stop it if it was the last one, I'm just thinking it may not be necessary. I could be off in the weeds - as this PR processing still isn't as clear to me as it should be by this point. John
{ qemuDomainObjPrivatePtr priv = vm->privateData; + size_t nmanaged = 0; + size_t i;
if (priv->prPid == (pid_t) -1) return;
- virProcessKillPainfully(priv->prPid, true); - priv->prPid = (pid_t) -1; + for (i = 0; i < vm->def->ndisks; i++) { + qemuDomainStorageSourcePrivatePtr srcPriv; + virDomainDiskDefPtr disk = vm->def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + if (!socketPath) + socketPath = srcPriv->prd->path; + } + + if (force || nmanaged <= 1) { + virProcessKillPainfully(priv->prPid, true); + priv->prPid = (pid_t) -1; + if (socketPath && + unlink(socketPath) < 0 && + errno != ENOENT) + VIR_WARN("Unable to remove pr helper socket %s", socketPath); + } }
@@ -2593,7 +2617,7 @@ qemuProcessSetupOnePRDaemonHook(void *opaque) }
-static int +int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, virDomainDiskDefPtr disk) { @@ -2708,7 +2732,7 @@ qemuProcessSetupPRDaemons(virDomainObjPtr vm) ret = 0; cleanup: if (ret < 0) - qemuProcessKillPRDaemons(vm); + qemuProcessKillPRDaemons(vm, NULL, true); return ret; }
@@ -6816,7 +6840,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, VIR_FREE(vm->def->seclabels[i]->imagelabel); }
- qemuProcessKillPRDaemons(vm); + qemuProcessKillPRDaemons(vm, NULL, true);
qemuHostdevReAttachDomainDevices(driver, vm->def);
diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 274111567..ab09a7f30 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -203,4 +203,11 @@ int qemuProcessRefreshDisks(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob);
+int qemuProcessSetupOnePRDaemon(virDomainObjPtr vm, + virDomainDiskDefPtr disk); + +void qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force); + #endif /* __QEMU_PROCESS_H__ */

On 03/02/2018 06:55 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done.
But wasn't there a very special reason to start it between fork and exec?
No. If you are referring to previous patch, the very special reason applies to calling qemuProcessSetupOnePRDaemonHook() which places the qemu-pr-helper process (well, at this stage it's still just our own fork) into the namespace of qemu process.
Does that still hold true? That is why can we start the daemon after exec now, but we couldn't before in the previous patch?
It feels quite "disjoint" to have the unplug in a subsequent patch too. Considering them both in one just seems "better", but it's not a deal breaker.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f28006e3c..2ebb68fbc 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, }
+static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm,
qemuBuild? Why not qemuDomainAdd?
Dunno. Other functions have the same prefix, e.g. qemuBuildSecretInfoProps().
+ virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src);
ohh, umm, in qemuDomainAttachDiskGeneric there's a :
srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); if (srcPriv) { secinfo = srcPriv->secinfo; encinfo = srcPriv->encinfo; }
Which makes light dawn that it "was" possible for srcPriv == NULL and the "thought" that the subsequent deref is going to fail rather spectacularly. See commit '8056721cb' (and a couple of others).
Although it seems patch 4 and your change to qemuDomainPrepareDiskSource to call qemuDomainStorageSourcePrivateNew instead of having it called in qemuDomainSecretStorageSourcePrepare would seem to ensure that disk source privateData is always allocated now - meaning a number of other places where srcPriv is/was checked are no longer necessary.
Yes. There are such places.
Maybe that one change needs to be "extracted" out to make things obvious. Not required, but just thinking and typing thoughts.
Okay, I can try that. Also try removing those unnecessary checks, but as I was proven many times, touching this part of libvirt code always ends bad with me.
+ + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; + + if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; +> + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup;> + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm,
qemuDestroy? Why not qemuDomainDel?
It's counterpart. qemuBuildPRDef.. qemuDestroyPRDef..
+ const char *alias, + const char *path) +{ + if (!alias) + return;
BTW: This is where I'd expect to remove the object and then...
+ + qemuProcessKillPRDaemons(vm, path, false);
Just unlink the path... See some thoughts below...
+} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error;
+ if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error;
@@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; }
+ if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } +
So for one of the managed modes (as noted much earlier) we could be creating a object with a duplicated alias - does that work? I thought id= has to be unique.
How come? The first thing that qemuBuildPRDefInfoProps() does is it checks 'priv->prPid != (pid_t) -1'. The fact that this pid is not -1 means that there is a pr-helper process already running.
if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, cleanup: virJSONValueFree(secobjProps); virJSONValueFree(encobjProps); + virJSONValueFree(prmgrProps); qemuDomainSecretDiskDestroy(disk); VIR_FREE(devstr); VIR_FREE(drivestr); @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + qemuDestroyPRDefObject(vm, prAlias, prPath);
oh, I see, you're mixing the way TLS object tear down occurred and how other objects happen. If you're going to mimic TLS, then change qemuBuildPRDefInfoProps to be something like qemuDomainAddPRDefObject which would do the EnterMonitorAsync, Props mgmt, AddObject, and ExitMonitor.
That way qemuDestroyPRDefObject changes to qemuDomainDelPRDefObject which would handle the failure path.
Then I believe you don't have to pass around prAlias and prPath since they'd be "obtainable".
Actually, I'm mimicing qemuBuildSecretInfoProps().
goto cleanup; }
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 33e0ad30c..3a914b6c6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
This subsequent hunk feels like it could have gone in for the previous patch. Or at the very least some function that would unlink the socket path for each of the sockets in ndisks that were created. Then that single unlink API gets exposed.
Well, the only socket we can unlink is for the managed pr-helper. Otherwise we might be unlinking a socket that is still active. And since I need to use those static functions only in this patch I'm exposing them here.
-static void -qemuProcessKillPRDaemons(virDomainObjPtr vm) +void +qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force)
The only time force == true is when socketPath == NULL... and that's only in the shutdown path. When socketPath != NULL, force == false, and we're going to unplug the socket, right?
Yes.
Perhaps this would be cleaner if only @socketPath was in play. If NULL, then walk the ndisks looking for paths that you'll unlink first before killing the daemon.
I actually think having a separate unlink the socket would be just fine. Does it really matter if it's the "last" one to stop the helper daemon?
I think it does, because if we did not stop it I can see people complaining immediately. We would be leaving a SUID process around for longer than needed. Which increases the attack surface (the process has a socket which is writable to UID:GID of the corresponding qemu process and can do RAWIO operations). Michal

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
On 03/02/2018 06:55 PM, John Ferlan wrote:
On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Surprisingly, nothing special is happening here. If we are the first to use the managed helper then spawn it. If not, we're almost done.
But wasn't there a very special reason to start it between fork and exec?
No. If you are referring to previous patch, the very special reason applies to calling qemuProcessSetupOnePRDaemonHook() which places the qemu-pr-helper process (well, at this stage it's still just our own fork) into the namespace of qemu process.
Does that still hold true? That is why can we start the daemon after exec now, but we couldn't before in the previous patch?
It feels quite "disjoint" to have the unplug in a subsequent patch too. Considering them both in one just seems "better", but it's not a deal breaker.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 38 +++++++++++++++++++++----- src/qemu/qemu_process.h | 7 +++++ 3 files changed, 110 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f28006e3c..2ebb68fbc 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, }
+static int +qemuBuildPRDefInfoProps(virDomainObjPtr vm,
qemuBuild? Why not qemuDomainAdd?
Dunno. Other functions have the same prefix, e.g. qemuBuildSecretInfoProps().
OK - I see... Still qemuBuildSecretInfoProps is in qemu_command - there are not qemuBuild API's in qemu_hotplug.
+ virDomainDiskDefPtr disk, + virJSONValuePtr *prmgrProps, + const char **prAlias, + const char **prPath) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainStorageSourcePrivatePtr srcPriv; + virJSONValuePtr props = NULL; + int ret = -1; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src);
ohh, umm, in qemuDomainAttachDiskGeneric there's a :
srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); if (srcPriv) { secinfo = srcPriv->secinfo; encinfo = srcPriv->encinfo; }
Which makes light dawn that it "was" possible for srcPriv == NULL and the "thought" that the subsequent deref is going to fail rather spectacularly. See commit '8056721cb' (and a couple of others).
Although it seems patch 4 and your change to qemuDomainPrepareDiskSource to call qemuDomainStorageSourcePrivateNew instead of having it called in qemuDomainSecretStorageSourcePrepare would seem to ensure that disk source privateData is always allocated now - meaning a number of other places where srcPriv is/was checked are no longer necessary.
Yes. There are such places.
And Peter also made changes in the domain private code that you'll have to merge with. Seeing this here and thinking about the recent history of this code caused me to "elaborate".
Maybe that one change needs to be "extracted" out to make things obvious. Not required, but just thinking and typing thoughts.
Okay, I can try that. Also try removing those unnecessary checks, but as I was proven many times, touching this part of libvirt code always ends bad with me.
We're in the same boat. Usually what happens is something I take as a given during review process bites me later on when something else I was depending on is changed as part of the review, but I neglect to remember to go back to the subsequent patches that may rely on the earlier assumption. Someone usually finds those pretty quick though.
+ + *prmgrProps = NULL; + + if (priv->prPid != (pid_t) -1 || + !srcPriv->prd || + !srcPriv->prd->alias) + return 0; + + if (virJSONValueObjectCreate(&props, + "s:path", srcPriv->prd->path, + NULL) < 0) + goto cleanup; +> + if (qemuProcessSetupOnePRDaemon(vm, disk) < 0) + goto cleanup;> + + *prAlias = srcPriv->prd->alias; + *prPath = srcPriv->prd->path; + *prmgrProps = props; + props = NULL; + ret = 0; + cleanup: + virJSONValueFree(props); + return ret; +} + + +static void +qemuDestroyPRDefObject(virDomainObjPtr vm,
qemuDestroy? Why not qemuDomainDel?
It's counterpart. qemuBuildPRDef.. qemuDestroyPRDef..
+ const char *alias, + const char *path) +{ + if (!alias) + return;
BTW: This is where I'd expect to remove the object and then...
+ + qemuProcessKillPRDaemons(vm, path, false);
Just unlink the path... See some thoughts below...
+} + + /** * qemuDomainAttachDiskGeneric: * @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, char *devstr = NULL; char *drivestr = NULL; char *drivealias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; bool driveAdded = false; bool secobjAdded = false; bool encobjAdded = false; + bool prmgrAdded = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virJSONValuePtr secobjProps = NULL; virJSONValuePtr encobjProps = NULL; + virJSONValuePtr prmgrProps = NULL; qemuDomainStorageSourcePrivatePtr srcPriv; qemuDomainSecretInfoPtr secinfo = NULL; qemuDomainSecretInfoPtr encinfo = NULL; @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, disk->info.alias) < 0) goto error;
+ if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0) + goto error; + if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps))) goto error;
@@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, encobjAdded = true; }
+ if (prmgrProps) { + rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias, + prmgrProps); + prmgrProps = NULL; /* qemuMonitorAddObject consumes */ + if (rv < 0) + goto exit_monitor; + prmgrAdded = true; + } +
So for one of the managed modes (as noted much earlier) we could be creating a object with a duplicated alias - does that work? I thought id= has to be unique.
How come? The first thing that qemuBuildPRDefInfoProps() does is it checks 'priv->prPid != (pid_t) -1'. The fact that this pid is not -1 means that there is a pr-helper process already running.
Not sure why I'm under the impression that each object needs a unique id value. Whether two objects could actually be created because of some other check - well I guess not all the dots are connecting for me yet.
if (qemuMonitorAddDrive(priv->mon, drivestr) < 0) goto exit_monitor; driveAdded = true; @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, cleanup: virJSONValueFree(secobjProps); virJSONValueFree(encobjProps); + virJSONValueFree(prmgrProps); qemuDomainSecretDiskDestroy(disk); VIR_FREE(devstr); VIR_FREE(drivestr); @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias)); if (encobjAdded) ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias)); + if (prmgrAdded) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -2; virErrorRestore(&orig_err); @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver, error: qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src); ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true)); + qemuDestroyPRDefObject(vm, prAlias, prPath);
oh, I see, you're mixing the way TLS object tear down occurred and how other objects happen. If you're going to mimic TLS, then change qemuBuildPRDefInfoProps to be something like qemuDomainAddPRDefObject which would do the EnterMonitorAsync, Props mgmt, AddObject, and ExitMonitor.
That way qemuDestroyPRDefObject changes to qemuDomainDelPRDefObject which would handle the failure path.
Then I believe you don't have to pass around prAlias and prPath since they'd be "obtainable".
Actually, I'm mimicing qemuBuildSecretInfoProps().
And teardown of that isn't in the "error" label other than of course the TLS objects (including those secinfo things). And that's because the decision was to build those objects wholly/separately with an Enter/Exit Monitor before we get into the guts of the Attach code. I think you may miss an error path by putting it here, but I forget exactly what I was thinking about when I wrote the comment.
goto cleanup; }
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 33e0ad30c..3a914b6c6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, }
This subsequent hunk feels like it could have gone in for the previous patch. Or at the very least some function that would unlink the socket path for each of the sockets in ndisks that were created. Then that single unlink API gets exposed.
Well, the only socket we can unlink is for the managed pr-helper. Otherwise we might be unlinking a socket that is still active. And since I need to use those static functions only in this patch I'm exposing them here.
Yeah - this is definitely a part I'm not understanding - socket mgmt with the daemon.
-static void -qemuProcessKillPRDaemons(virDomainObjPtr vm) +void +qemuProcessKillPRDaemons(virDomainObjPtr vm, + const char *socketPath, + bool force)
The only time force == true is when socketPath == NULL... and that's only in the shutdown path. When socketPath != NULL, force == false, and we're going to unplug the socket, right?
Yes.
Perhaps this would be cleaner if only @socketPath was in play. If NULL, then walk the ndisks looking for paths that you'll unlink first before killing the daemon.
I actually think having a separate unlink the socket would be just fine. Does it really matter if it's the "last" one to stop the helper daemon?
I think it does, because if we did not stop it I can see people complaining immediately. We would be leaving a SUID process around for longer than needed. Which increases the attack surface (the process has a socket which is writable to UID:GID of the corresponding qemu process and can do RAWIO operations).
Oh, well then, yes a SUID process shouldn't stick around if not necessary... Of course you know QE will come up with a way to start/stop incessantly and break something. John

Just line in previous commit, if we are the last ones that are using the pr-manager delete it and also kill the pr-helper process. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 19 +++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_hotplug.c | 22 ++++++++++++++++++++++ 3 files changed, 44 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index f7da62dba..acfa1d88c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11650,6 +11650,25 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, } +size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def) +{ + size_t i; + size_t nmanaged = 0; + + for (i = 0; i < def->ndisks; i++) { + virDomainDiskDefPtr disk = def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + } + + return nmanaged; +} + + static int qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, virDomainDiskDefPtr disk) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 418b47153..c2f67f379 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -1007,6 +1007,9 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); +size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def); + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2ebb68fbc..d8460cdb4 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -400,6 +400,13 @@ qemuDestroyPRDefObject(virDomainObjPtr vm, } +static bool +qemuDomainDiskNeedRemovePR(virDomainObjPtr vm) +{ + return qemuDomainGetPRDManagedCount(vm->def) == 1; +} + + /** * qemuDomainAttachDiskGeneric: * @@ -3812,6 +3819,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, char *drivestr; char *objAlias = NULL; char *encAlias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL; VIR_DEBUG("Removing disk %s from domain %p %s", disk->info.alias, vm, vm->def->name); @@ -3849,6 +3858,14 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + if (qemuDomainDiskNeedRemovePR(vm)) { + qemuDomainStorageSourcePrivatePtr srcPriv; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prAlias = srcPriv->prd->alias; + prPath = srcPriv->prd->path; + } + qemuDomainObjEnterMonitor(driver, vm); qemuMonitorDriveDel(priv->mon, drivestr); @@ -3864,6 +3881,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, encAlias)); VIR_FREE(encAlias); + if (prAlias) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); + if (disk->src->haveTLS) ignore_value(qemuMonitorDelObject(priv->mon, disk->src->tlsAlias)); @@ -3882,6 +3902,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + qemuDestroyPRDefObject(vm, prAlias, prPath); + qemuDomainReleaseDeviceAddress(vm, &disk->info, src); if (qemuSecurityRestoreDiskLabel(driver, vm, disk) < 0) -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
Just line in previous commit, if we are the last ones that are using the pr-manager delete it and also kill the pr-helper process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 19 +++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_hotplug.c | 22 ++++++++++++++++++++++ 3 files changed, 44 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index f7da62dba..acfa1d88c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11650,6 +11650,25 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, }
+size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def) +{ + size_t i; + size_t nmanaged = 0; + + for (i = 0; i < def->ndisks; i++) { + virDomainDiskDefPtr disk = def->disks[i]; + + if (!virStoragePRDefIsManaged(disk->src->pr)) + continue; + + nmanaged++; + } + + return nmanaged; +} + +
The above feels like it could have gone in much earlier and could have been useful for the qemuProcessKillPRDaemons changes.
static int qemuDomainPrepareDiskPRD(qemuDomainObjPrivatePtr priv, virDomainDiskDefPtr disk) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 418b47153..c2f67f379 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -1007,6 +1007,9 @@ qemuDomainCheckMigrationCapabilities(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob);
+size_t +qemuDomainGetPRDManagedCount(const virDomainDef *def); + int qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2ebb68fbc..d8460cdb4 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -400,6 +400,13 @@ qemuDestroyPRDefObject(virDomainObjPtr vm, }
+static bool +qemuDomainDiskNeedRemovePR(virDomainObjPtr vm) +{ + return qemuDomainGetPRDManagedCount(vm->def) == 1; +} + +> /** * qemuDomainAttachDiskGeneric: * @@ -3812,6 +3819,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, char *drivestr; char *objAlias = NULL; char *encAlias = NULL; + const char *prAlias = NULL; + const char *prPath = NULL;
VIR_DEBUG("Removing disk %s from domain %p %s", disk->info.alias, vm, vm->def->name); @@ -3849,6 +3858,14 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } }
+ if (qemuDomainDiskNeedRemovePR(vm)) { + qemuDomainStorageSourcePrivatePtr srcPriv; + + srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src); + prAlias = srcPriv->prd->alias; + prPath = srcPriv->prd->path; + } +
As noted in previous patch - should we really care? It's mostly a thrashing concern, but here you don't really even remove the object if this isn't the last one using PR.
qemuDomainObjEnterMonitor(driver, vm);
qemuMonitorDriveDel(priv->mon, drivestr); @@ -3864,6 +3881,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, ignore_value(qemuMonitorDelObject(priv->mon, encAlias)); VIR_FREE(encAlias);
+ if (prAlias) + ignore_value(qemuMonitorDelObject(priv->mon, prAlias)); +
Since you only set prAlias when you'll be needing to remove the PR, thus if there was more than one lun w/ PR, you wouldn't delete the object.
if (disk->src->haveTLS) ignore_value(qemuMonitorDelObject(priv->mon, disk->src->tlsAlias));
@@ -3882,6 +3902,8 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } }
+ qemuDestroyPRDefObject(vm, prAlias, prPath); +
If the two patches were combined it'd be more easier to see, but this would also not unlink the socket path... If there were 2 luns using PR and just 1 was removed, we wouldn't delete the object and prAlias would still be NULL meaning qemuDestroyPRDefObject returns immediately and nothing really happens. Again, the whole socket path and managed vs. unmanaged is not very clear at least to me. John
qemuDomainReleaseDeviceAddress(vm, &disk->info, src);
if (qemuSecurityRestoreDiskLabel(driver, vm, disk) < 0)

The capability tracks if qemu has pr-manager-helper object. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index d0005c71d..21fbecbf6 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1695,6 +1695,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "sclplmconsole", QEMU_CAPS_DEVICE_SCLPLMCONSOLE }, { "isa-serial", QEMU_CAPS_DEVICE_ISA_SERIAL }, { "pl011", QEMU_CAPS_DEVICE_PL011 }, + { "pr-manager-helper", QEMU_CAPS_PR_MANAGER_HELPER }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBalloon[] = { -- 2.16.1

On 02/21/2018 01:11 PM, Michal Privoznik wrote:
The capability tracks if qemu has pr-manager-helper object.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 1 + 1 file changed, 1 insertion(+)
As noted earlier - this should just be merged with patch 3. John
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index d0005c71d..21fbecbf6 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1695,6 +1695,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "sclplmconsole", QEMU_CAPS_DEVICE_SCLPLMCONSOLE }, { "isa-serial", QEMU_CAPS_DEVICE_ISA_SERIAL }, { "pl011", QEMU_CAPS_DEVICE_PL011 }, + { "pr-manager-helper", QEMU_CAPS_PR_MANAGER_HELPER }, };
static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBalloon[] = {
participants (4)
-
John Ferlan
-
Ján Tomko
-
Michal Privoznik
-
Peter Krempa