[libvirt] [PATCH v2 0/2] Alter qemu live/cold device attach algorithm

v1: https://www.redhat.com/archives/libvir-list/2018-June/msg00465.html The first patch is essentially a repeat of the v1 patch, but with a more correct commit message. During testing if the same address was supplied for the --config it wasn't checked. The second patch rearranges qemuDomainAttachDeviceLiveAndConfig to perform CONFIG and LIVE actions separately since the generated @devConf or @devLive could be different depending on whether the incoming XML has an <address> or not and whether the corresponding domain parse and post parse algorithms need to generate some sort of address for the device. John Ferlan (2): qemu: Check for existing hostdev address for cold attach device qemu: Use the correct vm def on cold attach src/qemu/qemu_driver.c | 58 +++++++++++++++++++++++++------------------------- 1 file changed, 29 insertions(+), 29 deletions(-) -- 2.14.4

Add a check during qemuDomainAttachDeviceConfig whether the new/incoming <hostdev ...> device would have an existing <address> already and if so fail the attach. This can happen if two hostdev's are added supplying the same address or if the new hostdev address could possibly be a duplicate of an existing SCSI <disk>. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0fb806fcd..ae8e0e898a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8015,6 +8015,12 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, _("device is already in the domain configuration")); return -1; } + if (dev->data.hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(vmdef, dev->data.hostdev->info)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists ")); + return -1; + } if (virDomainHostdevInsert(vmdef, hostdev)) return -1; dev->data.hostdev = NULL; -- 2.14.4

On Tue, Jun 12, 2018 at 10:32:06AM -0400, John Ferlan wrote:
Add a check during qemuDomainAttachDeviceConfig whether the new/incoming <hostdev ...> device would have an existing <address> already and if so fail the attach. This can happen if two hostdev's are added supplying the same address or if the new hostdev address could possibly be a duplicate of an existing SCSI <disk>.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0fb806fcd..ae8e0e898a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8015,6 +8015,12 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, _("device is already in the domain configuration")); return -1; } + if (dev->data.hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(vmdef, dev->data.hostdev->info)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists ")); + return -1; + }
This check feels out of place here. We should be checking for that in postParse. Do we have the same problem on domain startup? Jano
if (virDomainHostdevInsert(vmdef, hostdev)) return -1; dev->data.hostdev = NULL; -- 2.14.4
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 06/13/2018 04:15 AM, Ján Tomko wrote:
On Tue, Jun 12, 2018 at 10:32:06AM -0400, John Ferlan wrote:
Add a check during qemuDomainAttachDeviceConfig whether the new/incoming <hostdev ...> device would have an existing <address> already and if so fail the attach. This can happen if two hostdev's are added supplying the same address or if the new hostdev address could possibly be a duplicate of an existing SCSI <disk>.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0fb806fcd..ae8e0e898a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8015,6 +8015,12 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, _("device is already in the domain configuration")); return -1; } + if (dev->data.hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(vmdef, dev->data.hostdev->info)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists ")); + return -1; + }
This check feels out of place here. We should be checking for that in postParse. Do we have the same problem on domain startup?
I've avoided new post parse checks due to the domain disappearing phenomena. But no, I don't believe the same problem exists there. How would you suggest reproducing that? If one virsh edit's their domain adding two hostdev's with the same <address>, then virDomainDefValidateInternal catches that with the call to virDomainDefCheckDuplicateDriveAddresses. With this patch if one calls attach-device --config with a duplicated <address>, then this check will catch that. This is similar to VIR_DOMAIN_DEVICE_RNG processing in the same code. John
Jano
if (virDomainHostdevInsert(vmdef, hostdev)) return -1; dev->data.hostdev = NULL; -- 2.14.4
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 06/13/2018 07:16 AM, John Ferlan wrote:
On 06/13/2018 04:15 AM, Ján Tomko wrote:
Add a check during qemuDomainAttachDeviceConfig whether the new/incoming <hostdev ...> device would have an existing <address> already and if so fail the attach. This can happen if two hostdev's are added supplying the same address or if the new hostdev address could possibly be a duplicate of an existing SCSI <disk>.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0fb806fcd..ae8e0e898a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8015,6 +8015,12 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, _("device is already in the domain configuration")); return -1; } + if (dev->data.hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(vmdef, dev->data.hostdev->info)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists ")); + return -1; + } This check feels out of place here. We should be checking for that in
On Tue, Jun 12, 2018 at 10:32:06AM -0400, John Ferlan wrote: postParse. Do we have the same problem on domain startup?
I've avoided new post parse checks due to the domain disappearing phenomena.
The difference with this is that any existing domain with this error would have failed to start qemu in previous versions, so it surely would have been fixed. And assignment of addresses to PCI devices that are "coldplugged" (ie those that are added individually with AttachDevice, but that are only added to the persistent config, *not* to the running guest, ie "CONFIG") is handled by a manually added call to virDomainDefPostParse() that is called directly from qemuDomainAttachDeviceConfig: qemuDomainAttachDeviceConfig virDomainDefPostParse qemuDomainDefAssignAddresses (aka assignAddressCallback) qemuDomainAssignAddresses Is there a reason why SCSI addresses aren't assigned by the same function. If we did that then at least everyone would have the same set of problems :-P) On the other hand, for hotplugged PCI devices ("LIVE"), both assigning addresses to devices with no supplied address, and checking for duplicate addresses is handled by qemuDomainEnsurePCIAddress (which calls virDomainPCIAddressEnsureAddr. No, I did not name either of them) - that is called directly from qemuDomainAttach${Blah}Device in qemu_hotplug.c. So in both cases, nothing associated with *PCI* device addresses is done during the *Device* post parse callback, although in the case of CONFIG, addresses are assigned/checked by a "POST-postparse" manual call to the *Domain* PostParse callback. Sigh. I was hoping to arrive at some nugget of good information that would point to the best solution. Instead I've just pointed out how convoluted everything is, and given myself more ideas of other possible bugs. Maybe if I look at it again in the morning with a fresh brain...
But no, I don't believe the same problem exists there. How would you suggest reproducing that?
If one virsh edit's their domain adding two hostdev's with the same <address>, then virDomainDefValidateInternal catches that with the call to virDomainDefCheckDuplicateDriveAddresses. With this patch if one calls attach-device --config with a duplicated <address>, then this check will catch that.
This is similar to VIR_DOMAIN_DEVICE_RNG processing in the same code.
John
Jano
if (virDomainHostdevInsert(vmdef, hostdev)) return -1; dev->data.hostdev = NULL; -- 2.14.4
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 06/13/2018 09:42 PM, Laine Stump wrote:
On 06/13/2018 07:16 AM, John Ferlan wrote:
On 06/13/2018 04:15 AM, Ján Tomko wrote:
Add a check during qemuDomainAttachDeviceConfig whether the new/incoming <hostdev ...> device would have an existing <address> already and if so fail the attach. This can happen if two hostdev's are added supplying the same address or if the new hostdev address could possibly be a duplicate of an existing SCSI <disk>.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0fb806fcd..ae8e0e898a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8015,6 +8015,12 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef, _("device is already in the domain configuration")); return -1; } + if (dev->data.hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDefHasDeviceAddress(vmdef, dev->data.hostdev->info)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("a device with the same address already exists ")); + return -1; + } This check feels out of place here. We should be checking for that in
On Tue, Jun 12, 2018 at 10:32:06AM -0400, John Ferlan wrote: postParse. Do we have the same problem on domain startup?
I've avoided new post parse checks due to the domain disappearing phenomena.
The difference with this is that any existing domain with this error would have failed to start qemu in previous versions, so it surely would have been fixed. And assignment of addresses to PCI devices that are "coldplugged" (ie those that are added individually with AttachDevice, but that are only added to the persistent config, *not* to the running guest, ie "CONFIG") is handled by a manually added call to virDomainDefPostParse() that is called directly from qemuDomainAttachDeviceConfig:
qemuDomainAttachDeviceConfig virDomainDefPostParse qemuDomainDefAssignAddresses (aka assignAddressCallback) qemuDomainAssignAddresses
Is there a reason why SCSI addresses aren't assigned by the same function. If we did that then at least everyone would have the same set of problems :-P)
Not 100% sure of any reason other than perhaps the "relationship" between SCSI hostdev and SCSI disk. The generated address for a <disk> is based on the def->dst (see virDiskNameToIndex). A dst of "sda" would equate to controller=0 and unit=0. A <hostdev> will "find" a spot, but defaults to c=0 & u=0 (virDomainHostdevAssignAddress). There is some collision detection (virDomainDriveAddressIsUsedBy{Disk|Hostdev} and virDomainDefCheckDuplicateDriveAddresses), but the timing of when those checks can be called and when the above check is made is the issue. During parse virDomainHostdevAssignAddress is called if an address is not provided. That calls virDomainControllerSCSINextUnit to ensure no duplicate address is generated by checking both existing SCSI disk and hostdev's. If an address is provided, the virDomainHostdevDefPostParse can only check existing SCSI disks since virDomainHostdevInsert called during qemuDomainAttachDeviceConfig would have put that address on the hostdev list and calling virDomainSCSIDriveAddressIsUsed during post parse instead of just virDomainDriveAddressIsUsedByDisk would result in a conflict with itself. So prior to inserting into the list calling virDomainDefHasDeviceAddress (just like VIR_DOMAIN_DEVICE_RNG does in the same method) ensures the about to be added hostdev doesn't conflict with any existing hostdev because the parse code didn't make that check and checking for that conflict in post parse would cause And yes, if a duplicate was successfully inserted without the check, then startup would have a failure since the validate path would call virDomainDefCheckDuplicateDriveAddresses. If the "decision" is that we can allow the start to fail when someone provides duplicate addresses, then that's fine I can drop this patch. Although perhaps a comment in qemuDomainAttachDeviceConfig would help the next person (or even myself, because I'm sure to forget and see this again).
On the other hand, for hotplugged PCI devices ("LIVE"), both assigning addresses to devices with no supplied address, and checking for duplicate addresses is handled by qemuDomainEnsurePCIAddress (which calls virDomainPCIAddressEnsureAddr. No, I did not name either of them) - that is called directly from qemuDomainAttach${Blah}Device in qemu_hotplug.c.
So in both cases, nothing associated with *PCI* device addresses is done during the *Device* post parse callback, although in the case of CONFIG, addresses are assigned/checked by a "POST-postparse" manual call to the *Domain* PostParse callback.
Sigh. I was hoping to arrive at some nugget of good information that would point to the best solution. Instead I've just pointed out how convoluted everything is, and given myself more ideas of other possible bugs. Maybe if I look at it again in the morning with a fresh brain...
Address generation and checking just feels like one of those bolted onto the side type things when it was realized that we had to do something. PCI took one approach and SCSI a different approach. John
But no, I don't believe the same problem exists there. How would you suggest reproducing that?
If one virsh edit's their domain adding two hostdev's with the same <address>, then virDomainDefValidateInternal catches that with the call to virDomainDefCheckDuplicateDriveAddresses. With this patch if one calls attach-device --config with a duplicated <address>, then this check will catch that.
This is similar to VIR_DOMAIN_DEVICE_RNG processing in the same code.
John
Jano
if (virDomainHostdevInsert(vmdef, hostdev)) return -1; dev->data.hostdev = NULL; -- 2.14.4
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

https://bugzilla.redhat.com/show_bug.cgi?id=1559867 When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in. Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses. In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address. Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and peforming cleanup for both options as necessary. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ae8e0e898a..494141af4a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,45 +8488,39 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; - dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup; - if (virDomainDefCompatibleDevice(vmdef, dev, NULL) < 0) + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL) < 0) + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) goto cleanup; - if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL) < 0) + goto cleanup; + + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8549,9 +8544,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps); -- 2.14.4

On 06/12/2018 10:32 AM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and peforming cleanup for both options as necessary.
So what happens if someone requests only LIVE for the hotplug of one device, and then both LIVE and CONFIG for another? Does the 2nd device get a different address in the running guest than it has in the persistent config?
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ae8e0e898a..494141af4a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,45 +8488,39 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup;
- dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup;
- if (virDomainDefCompatibleDevice(vmdef, dev, NULL) < 0) + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; }
if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL) < 0) + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) goto cleanup;
- if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL) < 0) + goto cleanup; + + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8549,9 +8544,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps);

On 06/12/2018 06:06 PM, Laine Stump wrote:
On 06/12/2018 10:32 AM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and peforming cleanup for both options as necessary.
So what happens if someone requests only LIVE for the hotplug of one device, and then both LIVE and CONFIG for another? Does the 2nd device get a different address in the running guest than it has in the persistent config?
Hmmm... good question. For the live, then live+config case: my test live domain has: ... <adapter name='scsi_host5'/> ... <address type='drive' controller='0' bus='0' target='0' unit='0'/> ... <adapter name='scsi_host6'/> ... <address type='drive' controller='0' bus='0' target='0' unit='1'/> config domain has: ... <adapter name='scsi_host6'/> ... <address type='drive' controller='0' bus='0' target='0' unit='0'/> ... So, yes, different, but then again, unsupplied. We have a chicken and egg problem I think... Still if someone added scsi_host5 to the config afterwards it'd get unit='1'. Do we guarantee anything if someone doesn't supply the address? The virsh man page says: "Note: using of partial device definition XML files may lead to unexpected results as some fields may be autogenerated and thus match devices other than expected." WYSIWYG. John
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ae8e0e898a..494141af4a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8473,7 +8473,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, { virDomainDefPtr vmdef = NULL; virQEMUDriverConfigPtr cfg = NULL; - virDomainDeviceDefPtr dev = NULL, dev_copy = NULL; + virDomainDeviceDefPtr devConf = NULL; + virDomainDeviceDefPtr devLive = NULL; int ret = -1; virCapsPtr caps = NULL; unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE | @@ -8487,45 +8488,39 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm, if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup;
- dev = dev_copy = virDomainDeviceDefParse(xml, vm->def, - caps, driver->xmlopt, - parse_flags); - if (dev == NULL) - goto cleanup; - - if (virDomainDeviceValidateAliasForHotplug(vm, dev, flags) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_AFFECT_CONFIG && - flags & VIR_DOMAIN_AFFECT_LIVE) { - /* If we are affecting both CONFIG and LIVE - * create a deep copy of device as adding - * to CONFIG takes one instance. - */ - dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps, driver->xmlopt); - if (!dev_copy) - goto cleanup; - } - + /* The config and live post processing address auto-generation algorithms + * rely on the correct vm->def or vm->newDef being passed, so call the + * device parse based on which definition is in use */ if (flags & VIR_DOMAIN_AFFECT_CONFIG) { - /* Make a copy for updated domain. */ vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt); if (!vmdef) goto cleanup;
- if (virDomainDefCompatibleDevice(vmdef, dev, NULL) < 0) + if (!(devConf = virDomainDeviceDefParse(xml, vmdef, caps, + driver->xmlopt, parse_flags))) + goto cleanup; + + if (virDomainDefCompatibleDevice(vmdef, devConf, NULL) < 0) goto cleanup; - if ((ret = qemuDomainAttachDeviceConfig(vmdef, dev, caps, + + if ((ret = qemuDomainAttachDeviceConfig(vmdef, devConf, caps, parse_flags, driver->xmlopt)) < 0) goto cleanup; }
if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (virDomainDefCompatibleDevice(vm->def, dev_copy, NULL) < 0) + if (!(devLive = virDomainDeviceDefParse(xml, vm->def, caps, + driver->xmlopt, parse_flags))) goto cleanup;
- if ((ret = qemuDomainAttachDeviceLive(vm, dev_copy, driver)) < 0) + if (virDomainDeviceValidateAliasForHotplug(vm, devLive, flags) < 0) + goto cleanup; + + if (virDomainDefCompatibleDevice(vm->def, devLive, NULL) < 0) + goto cleanup; + + if ((ret = qemuDomainAttachDeviceLive(vm, devLive, driver)) < 0) goto cleanup; /* * update domain status forcibly because the domain status may be @@ -8549,9 +8544,8 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
cleanup: virDomainDefFree(vmdef); - if (dev != dev_copy) - virDomainDeviceDefFree(dev_copy); - virDomainDeviceDefFree(dev); + virDomainDeviceDefFree(devConf); + virDomainDeviceDefFree(devLive); virObjectUnref(cfg); virObjectUnref(caps);

On Tue, Jun 12, 2018 at 06:06:02PM -0400, Laine Stump wrote:
On 06/12/2018 10:32 AM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and peforming cleanup for both options as necessary.
IMO this is a step into the right direction - rather than tuning up the device to be compatible with the live definition and hoping it will work in the persistent definition is just naive. But I feel like I should read the whole parser and post-parser to confidently review this change. Also, even though it's 2018 we still generate the MAC address for interfaces in the parser, so this would result in two different interfaces being added into the definitions.
So what happens if someone requests only LIVE for the hotplug of one device, and then both LIVE and CONFIG for another? Does the 2nd device get a different address in the running guest than it has in the persistent config?
For that case, we should be considering both domain definitions when generating the addreess - at least for PCI addresses, we rely on calling virDomainDefPostParse in qemuDomainAttachDeviceConfig (this in turn calls the per-domain address assignment callback), which seems an overkill if we only added one device. Maybe we need a way to assign the address to just one device that would take both definitions into consideration? That would also let us get rid of that virDomainDefPostParse call. Or we could focus our energy elsehwere. Jano
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 29 deletions(-)

On 06/13/2018 07:15 AM, Ján Tomko wrote:
On Tue, Jun 12, 2018 at 06:06:02PM -0400, Laine Stump wrote:
On 06/12/2018 10:32 AM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1559867
When attaching a device to the domain we need to be sure to use the correct domain definition (vm->def or vm->newDef) when calling virDomainDeviceDefParse because the post parse processing algorithms that may assign an address for the device will use whatever domain definition was passed in.
Additionally, some devices (SCSI hostdev and SCSI disk) use algorithms that rely on knowing what already exists of the other type when generating the new device's address. Using the wrong VM definition could result in duplicated addresses.
In the case of the bz, two hostdev's with no domain address provided were added to the running domain's config only. However, the parsing algorithm used the live domain in order to figure out the host device address resulting in the same address being used and a subsequent start failing due to duplicate address.
Fix this by separating the checks/code into CONFIG and LIVE processing using the correct definition for each block and peforming cleanup for both options as necessary.
IMO this is a step into the right direction - rather than tuning up the device to be compatible with the live definition and hoping it will work in the persistent definition is just naive.
But I feel like I should read the whole parser and post-parser to confidently review this change. Also, even though it's 2018 we still generate the MAC address for interfaces in the parser, so this would result in two different interfaces being added into the definitions.
As noted in my response to Laine - the virsh documentation states: "Note: using of partial device definition XML files may lead to unexpected results as some fields may be autogenerated and thus match devices other than expected." So, IOW, WYSIWYG. If a consumer *chooses* to update LIVE for one device and LIVE and CONFIG for another, then can we "assume" that they know what they are doing and that they don't want the first device in the "next" guest. To a degree we are merely doing what we're told. Of course we could shift gears here and require the address to be provided, but for someone that might be an unwelcome change in philosophy. I don't think avoiding address assignment for attach-device --config would be a good idea especially considering the complex relationship with the disk attachment by name that could then conflict with a hostdev.
So what happens if someone requests only LIVE for the hotplug of one device, and then both LIVE and CONFIG for another? Does the 2nd device get a different address in the running guest than it has in the persistent config?
For that case, we should be considering both domain definitions when generating the addreess - at least for PCI addresses, we rely on calling virDomainDefPostParse in qemuDomainAttachDeviceConfig (this in turn calls the per-domain address assignment callback), which seems an overkill if we only added one device. Maybe we need a way to assign the address to just one device that would take both definitions into consideration? That would also let us get rid of that virDomainDefPostParse call.
While possible to consider both, it could be impractical. For some devices there's lots of considerations to be made - for others it's a simple does this already exist. In the long run it's all about understanding each device's particular conflict resolution and capacity comparison issues.
Or we could focus our energy elsehwere.
I like this option... Fix the one offs when/if they show up and move on. If someone, some day has the desire to write patches that will revamp device address assignment, good luck! John
Jano
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 52 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 29 deletions(-)
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Wed, Jun 13, 2018 at 08:26:40AM -0400, John Ferlan wrote:
Or we could focus our energy elsehwere.
I like this option... Fix the one offs when/if they show up and move on. If someone, some day has the desire to write patches that will revamp device address assignment, good luck!
I meant not bothering with fixing this bug either. Jano

On 06/13/2018 08:36 AM, Ján Tomko wrote:
On Wed, Jun 13, 2018 at 08:26:40AM -0400, John Ferlan wrote:
Or we could focus our energy elsehwere.
I like this option... Fix the one offs when/if they show up and move on. If someone, some day has the desire to write patches that will revamp device address assignment, good luck!
I meant not bothering with fixing this bug either.
Jano
That's an option as well, IDC really. It didn't take that much time in order to propose the initial solution or the followup. Probably spent more energy and time typing and considering things based on review questions or comments. If you feel compelled to close the bz, then by all means do that. Another viable solution is proposed here. John

On 06/13/2018 09:00 AM, John Ferlan wrote:
On 06/13/2018 08:36 AM, Ján Tomko wrote:
On Wed, Jun 13, 2018 at 08:26:40AM -0400, John Ferlan wrote:
Or we could focus our energy elsehwere.
I like this option... Fix the one offs when/if they show up and move on. If someone, some day has the desire to write patches that will revamp device address assignment, good luck!
I meant not bothering with fixing this bug either.
Jano That's an option as well, IDC really. It didn't take that much time in order to propose the initial solution or the followup. Probably spent more energy and time typing and considering things based on review questions or comments. If you feel compelled to close the bz, then by all means do that. Another viable solution is proposed here.
Nah, I think fixing the current BZ is a reasonable thing to do - current behavior is clearly wrong, but we don't need to worry about the case I mentioned until someone else complains (it's been like that for a *very* long time, and I think I may be the only one who noticed). My apologies for just dive bombing in with a new corner case without actually reviewing your code - it was the end of the day and I was tired, but also remembered running across that problem a long time ago (and deciding it wasn't worth the effort to fix then either) and just wanted to make other people aware of it before I forgot again :-P. I'll go back and actually review the patches now.

On 06/12/2018 10:32 AM, John Ferlan wrote:
v1: https://www.redhat.com/archives/libvir-list/2018-June/msg00465.html
The first patch is essentially a repeat of the v1 patch, but with a more correct commit message. During testing if the same address was supplied for the --config it wasn't checked.
The second patch rearranges qemuDomainAttachDeviceLiveAndConfig to perform CONFIG and LIVE actions separately since the generated @devConf or @devLive could be different depending on whether the incoming XML has an <address> or not and whether the corresponding domain parse and post parse algorithms need to generate some sort of address for the device.
John Ferlan (2): qemu: Check for existing hostdev address for cold attach device qemu: Use the correct vm def on cold attach
src/qemu/qemu_driver.c | 58 +++++++++++++++++++++++++------------------------- 1 file changed, 29 insertions(+), 29 deletions(-)
ping? Can we come to a consensus either way on this? There's 2 that feel fixing the underlying problem is worth it (myself and Laine) and 1 that would not bother fixing the bug (Jano). Tks - John

ping^2? No other opinions? Tks, John On 06/20/2018 07:14 PM, John Ferlan wrote:
On 06/12/2018 10:32 AM, John Ferlan wrote:
v1: https://www.redhat.com/archives/libvir-list/2018-June/msg00465.html
The first patch is essentially a repeat of the v1 patch, but with a more correct commit message. During testing if the same address was supplied for the --config it wasn't checked.
The second patch rearranges qemuDomainAttachDeviceLiveAndConfig to perform CONFIG and LIVE actions separately since the generated @devConf or @devLive could be different depending on whether the incoming XML has an <address> or not and whether the corresponding domain parse and post parse algorithms need to generate some sort of address for the device.
John Ferlan (2): qemu: Check for existing hostdev address for cold attach device qemu: Use the correct vm def on cold attach
src/qemu/qemu_driver.c | 58 +++++++++++++++++++++++++------------------------- 1 file changed, 29 insertions(+), 29 deletions(-)
ping?
Can we come to a consensus either way on this? There's 2 that feel fixing the underlying problem is worth it (myself and Laine) and 1 that would not bother fixing the bug (Jano).
Tks -
John
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 06/26/2018 12:36 PM, John Ferlan wrote:
ping^2?
No other opinions?
Tks,
John
On 06/20/2018 07:14 PM, John Ferlan wrote:
On 06/12/2018 10:32 AM, John Ferlan wrote:
v1: https://www.redhat.com/archives/libvir-list/2018-June/msg00465.html
The first patch is essentially a repeat of the v1 patch, but with a more correct commit message. During testing if the same address was supplied for the --config it wasn't checked.
The second patch rearranges qemuDomainAttachDeviceLiveAndConfig to perform CONFIG and LIVE actions separately since the generated @devConf or @devLive could be different depending on whether the incoming XML has an <address> or not and whether the corresponding domain parse and post parse algorithms need to generate some sort of address for the device.
John Ferlan (2): qemu: Check for existing hostdev address for cold attach device qemu: Use the correct vm def on cold attach
src/qemu/qemu_driver.c | 58 +++++++++++++++++++++++++------------------------- 1 file changed, 29 insertions(+), 29 deletions(-)
ping?
Can we come to a consensus either way on this? There's 2 that feel fixing the underlying problem is worth it (myself and Laine) and 1 that would not bother fixing the bug (Jano).
I think it's worth fixing too. Michal
participants (4)
-
John Ferlan
-
Ján Tomko
-
Laine Stump
-
Michal Privoznik