[libvirt] [PATCH v3 0/3] Alter qemu live/cold device attach algorithm

v2: https://www.redhat.com/archives/libvir-list/2018-July/msg00361.html Differences to v2: Patch1: NEW - As a result of code review the suggestion was to utilize the virDomainDefCompatibleDevice in order to make the check more generic to include <disk>'s which were also afflicted with the same problem that I was trying to solve with the former patch1 just for <hostdev>'s. However, this led me down into the abyss of more changes since <disk>'s have multiple <address type='drive'...> target bus types (IDE and SCSI). That means we need to have a mechanism to pass the target bus along so that a SCSI drive address doesn't inadvertently match an IDE drive address. All that is complicated by the way virDomainDefHasDeviceAddress iterates through all the device lists. Whether there are more similar devices I'm assuming will fall out of code review. Patch2: This moves the virDomainDefHasDeviceAddress into the more common config checking virDomainDefCompatibleDevice method, but now needs to also account for the disk bus issue. This theoretically could be combined with Patch1, but keeping them separate I would hope makes for simpler code review. I could also move code out of virDomainDefHasDeviceAddressIterator into patch2, but if just felt better in patch1. Patch3: No changes were made (amazingly so). In the end quite a bit more complicated John Ferlan (3): conf: Add @target_bus to virDomainDefHasDeviceAddress qemu: Check for existing address when cold attach device qemu: Use the correct vm def on cold attach src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++--- src/conf/domain_conf.h | 3 ++- src/qemu/qemu_driver.c | 59 ++++++++++++++++-------------------------- 3 files changed, 75 insertions(+), 41 deletions(-) -- 2.17.1

Add the @target_bus argument which will allow a caller to pass the virDomainDiskBus onto which the @info (<address>) would be placed. This will allow logic to provide the bus for cold plugged devices to determine whether the about to be added device <address> is already present on the @bus. Just passing the @info isn't sufficient since, for example, ADDRESS_TYPE_DRIVE is used for both SCSI and IDE <disk>'s as well as 'scsi' and 'scsi_host' <hostdev>'s. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/conf/domain_conf.c | 32 ++++++++++++++++++++++++++++---- src/conf/domain_conf.h | 3 ++- src/qemu/qemu_driver.c | 2 +- 3 files changed, 31 insertions(+), 6 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 7396616eda..82df8012af 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3688,13 +3688,31 @@ virDomainDeviceGetInfo(virDomainDeviceDefPtr device) } +struct virDomainDefHasDeviceAddressIteratorData { + int target_bus; /* virDomainDiskBus or -1 */ + virDomainDeviceInfoPtr info; +}; + static int virDomainDefHasDeviceAddressIterator(virDomainDefPtr def ATTRIBUTE_UNUSED, - virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED, + virDomainDeviceDefPtr dev, virDomainDeviceInfoPtr info, void *opaque) { - virDomainDeviceInfoPtr needle = opaque; + struct virDomainDefHasDeviceAddressIteratorData *data = opaque; + int target_bus = data->target_bus; + virDomainDeviceInfoPtr needle = data->info; + + /* If the target_bus of the about to be cold plugged device needs + * to be checked and the currently to be matched device is a disk, + * then compare it's target bus against the new device. If they don't + * match, then no need to compare. For disks this ensures addresses + * using drive won't erroneously match if one is IDE and another is SCSI. + * Likewise, for SCSI hostdev's this ensures the new hostdev doesn't + * erroneously match an IDE for the address comparison. */ + if (target_bus != -1 && dev->type == VIR_DOMAIN_DEVICE_DISK && + dev->data.disk->bus != target_bus) + return 0; /* break iteration if the info was found */ if (virDomainDeviceInfoAddressIsEqual(info, needle)) @@ -3933,12 +3951,18 @@ virDomainDeviceInfoIterate(virDomainDefPtr def, bool virDomainDefHasDeviceAddress(virDomainDefPtr def, + int target_bus, virDomainDeviceInfoPtr info) { + struct virDomainDefHasDeviceAddressIteratorData data = { + .target_bus = target_bus, + .info = info, + }; + if (virDomainDeviceInfoIterateInternal(def, virDomainDefHasDeviceAddressIterator, true, - info) < 0) + &data) < 0) return true; return false; @@ -17508,7 +17532,7 @@ virDomainMemoryInsert(virDomainDefPtr def, int id = def->nmems; if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainDefHasDeviceAddress(def, &mem->info)) { + virDomainDefHasDeviceAddress(def, -1, &mem->info)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("Domain already contains a device with the same " "address")); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0f10e242fd..82231161c6 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2912,8 +2912,9 @@ int virDomainDeviceInfoIterate(virDomainDefPtr def, void *opaque); bool virDomainDefHasDeviceAddress(virDomainDefPtr def, + int target_bus, virDomainDeviceInfoPtr info) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_RETURN_CHECK; void virDomainDefFree(virDomainDefPtr vm); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8fae46370e..5f91d463ae 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8082,7 +8082,7 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, case VIR_DOMAIN_DEVICE_RNG: if (dev->data.rng->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainDefHasDeviceAddress(vmdef, &dev->data.rng->info)) { + virDomainDefHasDeviceAddress(vmdef, -1, &dev->data.rng->info)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("a device with the same address already exists ")); return -1; -- 2.17.1

Now that virDomainDefCompatibleDevice accepts an action, let's use that to generically determine whether a to be cold plugged device with a defined <address> would conflict with any existing address for the domain config. This replaces the RNG check in qemuDomainAttachDeviceConfig. This type of check would need to be made before a device is inserted into its domain device list (disk, rng, hostdev, etc.). This type of check cannot be done during the post parse processing because the check would conflict with itself as the device would be placed onto the its device list prior to post parse. By comparison this is similar to the validation phase checks in virDomainDefCheckDuplicateDriveAddresses that occur during define/startup processing, but are not run during config attach of a live guest. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/conf/domain_conf.c | 22 ++++++++++++++++++++++ src/qemu/qemu_driver.c | 7 ------- 2 files changed, 22 insertions(+), 7 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 82df8012af..fd5bc10c82 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -28270,11 +28270,25 @@ virDomainDefCompatibleDevice(virDomainDefPtr def, virDomainDeviceAction action, bool live) { + int target_bus = -1; virDomainCompatibleDeviceData data = { .newInfo = virDomainDeviceGetInfo(dev), .oldInfo = NULL, }; + if (dev->type == VIR_DOMAIN_DEVICE_DISK) { + target_bus = dev->data.disk->bus; + } else if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV && + dev->data.hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) { + virDomainHostdevSubsysPtr subsys = &dev->data.hostdev->source.subsys; + + /* If using 'scsi' or 'scsi_host', then the device can be placed + * on the same bus as a <disk>, so account for that */ + if (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI || + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + target_bus = VIR_DOMAIN_DISK_BUS_SCSI; + } + if (oldDev) data.oldInfo = virDomainDeviceGetInfo(oldDev); @@ -28288,6 +28302,14 @@ virDomainDefCompatibleDevice(virDomainDefPtr def, return -1; } + if (action == VIR_DOMAIN_DEVICE_ACTION_ATTACH && !live && data.newInfo && + data.newInfo->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(def, target_bus, data.newInfo)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists")); + return -1; + } + if (!virDomainDefHasUSB(def) && def->os.type != VIR_DOMAIN_OSTYPE_EXE && virDomainDeviceIsUSB(dev)) { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 5f91d463ae..7d519c0714 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8081,13 +8081,6 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, break; case VIR_DOMAIN_DEVICE_RNG: - if (dev->data.rng->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainDefHasDeviceAddress(vmdef, -1, &dev->data.rng->info)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("a device with the same address already exists ")); - return -1; - } - if (VIR_APPEND_ELEMENT(vmdef->rngs, vmdef->nrngs, dev->data.rng) < 0) return -1; dev->data.rng = NULL; -- 2.17.1

https://bugzilla.redhat.com/show_bug.cgi?id=1559867 When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in. Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses. In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address. Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and performing cleanup for both options as necessary. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7d519c0714..3dbd5c62d2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,49 +8488,43 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; - dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup; - if (virDomainDefCompatibleDevice(vmdef, dev, NULL, + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, false) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL, + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, true) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8553,9 +8548,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps); -- 2.17.1

On 07/16/2018 11:14 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and performing cleanup for both options as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7d519c0714..3dbd5c62d2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,49 +8488,43 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup;
- dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup;
- if (virDomainDefCompatibleDevice(vmdef, dev, NULL, + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, false) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; }
if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL, + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, true) < 0) goto cleanup;
- if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8553,9 +8548,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps);
I'm failing to see why the other patches are necessary. I'm unable to reproduce the bug after I applied only this single patch. It also makes sense, because the source of error is that when parsing device XML wrong vmdef was passed. Therefore, when postParse callback tried to fill device address it was looking at wrong data (thinking there's no such drive address) and returned the next free address which was the same all the time. ACK to this patch. Michal

On 07/25/2018 06:40 AM, Michal Privoznik wrote:
On 07/16/2018 11:14 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and performing cleanup for both options as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7d519c0714..3dbd5c62d2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,49 +8488,43 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup;
- dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup;
- if (virDomainDefCompatibleDevice(vmdef, dev, NULL, + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, false) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; }
if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL, + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL, VIR_DOMAIN_DEVICE_ACTION_ATTACH, true) < 0) goto cleanup;
- if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8553,9 +8548,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps);
I'm failing to see why the other patches are necessary. I'm unable to reproduce the bug after I applied only this single patch. It also makes sense, because the source of error is that when parsing device XML wrong vmdef was passed. Therefore, when postParse callback tried to fill device address it was looking at wrong data (thinking there's no such drive address) and returned the next free address which was the same all the time.
ACK to this patch.
Michal
OK - fair enough. It's been a few weeks, but I thought I had gone through some testing with only this patch applied and attempting the attach of a <hostdev> without an <address> defined and was getting the same unit#, but now of course with just this applied I cannot. So much for short term memory. Thanks for the review - I'll drop 1 and 2 and just push 3 - John

On Wed, Jul 25, 2018 at 08:39:26AM -0400, John Ferlan wrote:
On 07/25/2018 06:40 AM, Michal Privoznik wrote:
On 07/16/2018 11:14 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
This effectively reduces AttachDevice(AFFECT_LIVE | AFFECT_CONFIG) to two subsequent AttachDevice calls, thus possibly attaching two different devices, making the AFFECT_CONFIG option pointless. E.g. when hotplugging a network interface, it might end up both on a different PCI slot and with a different MAC address, which I'm afraid might break some use cases.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
If suitablity for both live and persistent definition is a problem, the address allocation code should take both domain definitions into account and generate a single address for both device copies.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and performing cleanup for both options as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
As I said in my review of v2 I still consider this a waste of energy and we might possibly end up having to revert this change because it breaks someone's use case. Jano

On 07/25/2018 09:51 AM, Ján Tomko wrote:
On Wed, Jul 25, 2018 at 08:39:26AM -0400, John Ferlan wrote:
On 07/25/2018 06:40 AM, Michal Privoznik wrote:
On 07/16/2018 11:14 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
This effectively reduces AttachDevice(AFFECT_LIVE | AFFECT_CONFIG) to two subsequent AttachDevice calls, thus possibly attaching two different devices, making the AFFECT_CONFIG option pointless.
Ironically, the v2 response you gave was: " IMO this is a step into the right direction - rather than tuning up the device to be compatible with the live definition and hoping it will work in the persistent definition is just naive. " Still what I did doesn't change the AttachDevice calls. Prior to my changes and after my changes there is a qemuDomainAttachDeviceConfig for CONFIG and a qemuDomainAttachDeviceLive for LIVE. What the change does is separate the virDomainDeviceDefParse calls such that the "config" one is made with the persistent def rather than the live def for both. Thus if some previous change updates persistent, then the next change to just config doesn't miss that and perhaps duplicate something not provided - in this case the <address... unit=0> was duplicated which results in subsequent start failure. So looking at the old and new code side by side - how exactly is new code pointless? What subtle point am I missing?
E.g. when hotplugging a network interface, it might end up both on a different PCI slot and with a different MAC address, which I'm afraid might break some use cases.
An example would have greatly helped... So, I assume you mean this: # cat n1.xml <interface type='network'> <source network='default'/> <model type='virtio'/> </interface> # virsh attach-device $dom n1.xml --live --config Device attached successfully # virsh dumpxml $dom | grep interface -A 10 ... <interface type='network'> <mac address='52:54:00:8a:bb:ea'/> <source network='default' bridge='virbr0'/> <target dev='vnet1'/> <model type='virtio'/> <alias name='net1'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/> </interface> ... # virsh dumpxml $dom --inactive | grep interface -A 10 ... <interface type='network'> <mac address='52:54:00:be:29:3c'/> <source network='default'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/> </interface> ... whereas, previously the same MAC address would have been generated for both live and config based on the current live def. Is that it? One other consideration. Prior to my changes. Assume the same n1.xml, then attach-device --config only, check out the resulting address using slot=0xd. Now use the --config --live - for the config guest you'll see that some MAC gets added at slot=0x0e while the same MAC gets added to slot=0x0d for the live guest. So I would contend this is no different than my changes other than with the changes the next guest wouldn't have the duplicated MAC from the live guest so the MAC being at a different address shouldn't matter. So the downside to libvirt generating a different MAC for live and config ends up being is what? The guest will boot the next time, but with a different MAC for the network device. If though a consumer doesn't provide one, how much of a problem is that compared to not being able to boot the guest because the <address ... unit=#> conflict for disks. This is certainly one of those "competing interests" type problems for generated data. I really don't know the "perfect answer" other than telling customers to provide the specific things they want to avoid these conflicts. One could argue that if one wants a predictable MAC, then they should provide it. Similar argument for the <address>. The one difference between the two for me being it's not possible to boot the machine with address conflicts; whereas, a change in an unprovided MAC is kind of a WYSIWYG type problem.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
If suitablity for both live and persistent definition is a problem, the address allocation code should take both domain definitions into account and generate a single address for both device copies.
Could be a real challenge to do both - the algorithms aren't exactly "simple" now, especially when one has to take into account SCSI disks and SCSI hostdevs "share" the same bus. Not sure it'll work, but perhaps a different option - don't assign an address to a device that's being cold plugged while the domain is active. Although there's enough different types of addresses and devices that I'm not sure that's feasible for each type.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and performing cleanup for both options as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
As I said in my review of v2 I still consider this a waste of energy and we might possibly end up having to revert this change because it breaks someone's use case.
Jano
For reference, the v2 review: https://www.redhat.com/archives/libvir-list/2018-June/msg01012.html And there were differing opinions related to whether it was a waste of energy. In the long run, IDC if this patch is reverted and the bz is closed as impossible or won't fix or whatever magic wording needs to be used if that's what you prefer/desire. John

ping. Tks, John On 07/16/2018 05:14 PM, John Ferlan wrote:
v2: https://www.redhat.com/archives/libvir-list/2018-July/msg00361.html
Differences to v2:
Patch1: NEW - As a result of code review the suggestion was to utilize the virDomainDefCompatibleDevice in order to make the check more generic to include <disk>'s which were also afflicted with the same problem that I was trying to solve with the former patch1 just for <hostdev>'s. However, this led me down into the abyss of more changes since <disk>'s have multiple <address type='drive'...> target bus types (IDE and SCSI). That means we need to have a mechanism to pass the target bus along so that a SCSI drive address doesn't inadvertently match an IDE drive address. All that is complicated by the way virDomainDefHasDeviceAddress iterates through all the device lists. Whether there are more similar devices I'm assuming will fall out of code review.
Patch2: This moves the virDomainDefHasDeviceAddress into the more common config checking virDomainDefCompatibleDevice method, but now needs to also account for the disk bus issue.
This theoretically could be combined with Patch1, but keeping them separate I would hope makes for simpler code review. I could also move code out of virDomainDefHasDeviceAddressIterator into patch2, but if just felt better in patch1.
Patch3: No changes were made (amazingly so).
In the end quite a bit more complicated
John Ferlan (3): conf: Add @target_bus to virDomainDefHasDeviceAddress qemu: Check for existing address when cold attach device qemu: Use the correct vm def on cold attach
src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++--- src/conf/domain_conf.h | 3 ++- src/qemu/qemu_driver.c | 59 ++++++++++++++++-------------------------- 3 files changed, 75 insertions(+), 41 deletions(-)
participants (3)
-
John Ferlan
-
Ján Tomko
-
Michal Privoznik