[libvirt] [PATCH] hotplug: Fix libvirtd crash on qemu-attached guest

https://bugzilla.redhat.com/show_bug.cgi?id=1141621 Using qemu-attach to attach to a qemu created process and then attempting to virsh detach-interface caused a libvirtd crash since the assumption was that device aliases were in place and that assumption is not necessarily true for the qemu-attach environment. Add some more verbiage to the virsh man page. Signed-off-by: John Ferlan <jferlan@redhat.com> --- Notes: * The bz also mentions a failure for the hotplug/virsh attach-interface, but that has been resolved by commit id 'ba7468db' * For the bz, the attempt to attach-interface/hotplug still is not successful, since device aliases are not set for the PCI controller. * I did try adding a call to qemuAssignDeviceAliases() during the qemu- attach code; however, that failed as well (perhaps for a different reason) as the attach to PCI slot 3 function 0 was not available for rtl8139 since it was being used by e1000 * If it's desired to add the alias call in that's fine, then that could be added in qemuDomainQemuAttach prior to the qemuDomainAssignAddresses call similar to qemuConnectDomainXMLToNative. src/qemu/qemu_hotplug.c | 3 ++- tools/virsh.pod | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 7bc19cd..b7514ce 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3520,7 +3520,8 @@ qemuDomainDetachNetDevice(virConnectPtr conn, qemuDomainMarkDeviceForRemoval(vm, &detach->info); qemuDomainObjEnterMonitor(driver, vm); - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE) && + detach->info.alias) { if (qemuMonitorDelDevice(priv->mon, detach->info.alias) < 0) { qemuDomainObjExitMonitor(driver, vm); virDomainAuditNet(vm, detach, NULL, "detach", false); diff --git a/tools/virsh.pod b/tools/virsh.pod index 9919f92..947adaf 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -3686,8 +3686,9 @@ using the UNIX driver. Ideally the process will also have had the Not all functions of libvirt are expected to work reliably after attaching to an externally launched QEMU process. There may be -issues with the guest ABI changing upon migration, and hotunplug -may not work. +issues with the guest ABI changing upon migration and device hotplug +or hotunplug may not work. The attached environment should be considered +primarily read-only. =item B<qemu-monitor-command> I<domain> { [I<--hmp>] | [I<--pretty>] } I<command>... -- 1.9.3

On 09/22/2014 02:49 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1141621
Using qemu-attach to attach to a qemu created process and then attempting to virsh detach-interface caused a libvirtd crash since the assumption was that device aliases were in place and that assumption is not necessarily true for the qemu-attach environment.
Hmm. I'm wondering whether it is better to patch qemu-attach to populate alias information at attach time rather than having to chase all the places that will later fail when the assumption is broken. While your patch is certainly more appealing for being smaller, it makes me wonder if it is the only such place bitten by the assumptions.
Add some more verbiage to the virsh man page.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- Notes: * The bz also mentions a failure for the hotplug/virsh attach-interface, but that has been resolved by commit id 'ba7468db' * For the bz, the attempt to attach-interface/hotplug still is not successful, since device aliases are not set for the PCI controller. * I did try adding a call to qemuAssignDeviceAliases() during the qemu- attach code; however, that failed as well (perhaps for a different reason) as the attach to PCI slot 3 function 0 was not available for rtl8139 since it was being used by e1000
That is, I'm wondering if this approach should be used after all, and we have yet another problem to solve first.
* If it's desired to add the alias call in that's fine, then that could be added in qemuDomainQemuAttach prior to the qemuDomainAssignAddresses call similar to qemuConnectDomainXMLToNative.
src/qemu/qemu_hotplug.c | 3 ++- tools/virsh.pod | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 7bc19cd..b7514ce 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3520,7 +3520,8 @@ qemuDomainDetachNetDevice(virConnectPtr conn, qemuDomainMarkDeviceForRemoval(vm, &detach->info);
qemuDomainObjEnterMonitor(driver, vm); - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE) && + detach->info.alias) { if (qemuMonitorDelDevice(priv->mon, detach->info.alias) < 0) { qemuDomainObjExitMonitor(driver, vm); virDomainAuditNet(vm, detach, NULL, "detach", false); diff --git a/tools/virsh.pod b/tools/virsh.pod index 9919f92..947adaf 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -3686,8 +3686,9 @@ using the UNIX driver. Ideally the process will also have had the
Not all functions of libvirt are expected to work reliably after attaching to an externally launched QEMU process. There may be -issues with the guest ABI changing upon migration, and hotunplug -may not work. +issues with the guest ABI changing upon migration and device hotplug +or hotunplug may not work. The attached environment should be considered +primarily read-only.
This doc hunk is fine to push now, whether or not we have consensus on the code side. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 09/22/2014 05:23 PM, Eric Blake wrote:
On 09/22/2014 02:49 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1141621
Using qemu-attach to attach to a qemu created process and then attempting to virsh detach-interface caused a libvirtd crash since the assumption was that device aliases were in place and that assumption is not necessarily true for the qemu-attach environment.
Hmm. I'm wondering whether it is better to patch qemu-attach to populate alias information at attach time rather than having to chase all the places that will later fail when the assumption is broken. While your patch is certainly more appealing for being smaller, it makes me wonder if it is the only such place bitten by the assumptions.
I took this approach mainly because it seemed that generating aliases in the qemu-attach path was explicitly not done for a reason while it was done for the qemuConnectDomainXMLToNative and qemuDomainQemuAttach paths. For each of the other attach paths, there are the separate calls to the specific qemuAssignDevice*Alias functions. So it does seem that this qemu-attach path is the only current one without the aliases set. If there is some path in the future that doesn't set the alias properly, then at the very least we don't have a libvirtd crash at least for this path. FWIW: If I modify the patch as follows: - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE) && - detach->info.alias) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + if (!detach->info.alias) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + ("device alias not found: cannot detach net " + "device not properly attached")); + qemuDomainObjExitMonitor(driver, vm); + goto cleanup; + } then rerun the sequence of steps, I get: error: Failed to detach interface error: operation failed: device alias not found: cannot detach net device not properly attached If I then add qemuAssignDeviceAliases() as described below and rerun, the error changes to: error: Failed to detach interface error: operation failed: detaching net0 device failed: Device 'net0' not found Since the original start didn't use the "-netdev/-device" syntax that would be expected if the DRIVE capability was set. So no worse than we were before. Should I just post the additional patch calling qemuAssignDeviceAliases and go from there?
Add some more verbiage to the virsh man page.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- Notes: * The bz also mentions a failure for the hotplug/virsh attach-interface, but that has been resolved by commit id 'ba7468db' * For the bz, the attempt to attach-interface/hotplug still is not successful, since device aliases are not set for the PCI controller. * I did try adding a call to qemuAssignDeviceAliases() during the qemu- attach code; however, that failed as well (perhaps for a different reason) as the attach to PCI slot 3 function 0 was not available for rtl8139 since it was being used by e1000
That is, I'm wondering if this approach should be used after all, and we have yet another problem to solve first.
Using virsh/libvirt, I can define/start a guest without a network and add one... So perhaps the qemu-attach path is "unique" with respect to how qemu defines things and perhaps there's something "hidden"/unknown causing e1000 to grab PCI slot 3. I didn't dig into that. John
* If it's desired to add the alias call in that's fine, then that could be added in qemuDomainQemuAttach prior to the qemuDomainAssignAddresses call similar to qemuConnectDomainXMLToNative.
src/qemu/qemu_hotplug.c | 3 ++- tools/virsh.pod | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 7bc19cd..b7514ce 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3520,7 +3520,8 @@ qemuDomainDetachNetDevice(virConnectPtr conn, qemuDomainMarkDeviceForRemoval(vm, &detach->info);
qemuDomainObjEnterMonitor(driver, vm); - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE) && + detach->info.alias) { if (qemuMonitorDelDevice(priv->mon, detach->info.alias) < 0) { qemuDomainObjExitMonitor(driver, vm); virDomainAuditNet(vm, detach, NULL, "detach", false); diff --git a/tools/virsh.pod b/tools/virsh.pod index 9919f92..947adaf 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -3686,8 +3686,9 @@ using the UNIX driver. Ideally the process will also have had the
Not all functions of libvirt are expected to work reliably after attaching to an externally launched QEMU process. There may be -issues with the guest ABI changing upon migration, and hotunplug -may not work. +issues with the guest ABI changing upon migration and device hotplug +or hotunplug may not work. The attached environment should be considered +primarily read-only.
This doc hunk is fine to push now, whether or not we have consensus on the code side.

On Wed, Sep 24, 2014 at 06:19:14PM -0400, John Ferlan wrote:
On 09/22/2014 05:23 PM, Eric Blake wrote:
On 09/22/2014 02:49 PM, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1141621
Using qemu-attach to attach to a qemu created process and then attempting to virsh detach-interface caused a libvirtd crash since the assumption was that device aliases were in place and that assumption is not necessarily true for the qemu-attach environment.
Hmm. I'm wondering whether it is better to patch qemu-attach to populate alias information at attach time rather than having to chase all the places that will later fail when the assumption is broken. While your patch is certainly more appealing for being smaller, it makes me wonder if it is the only such place bitten by the assumptions.
I took this approach mainly because it seemed that generating aliases in the qemu-attach path was explicitly not done for a reason while it was done for the qemuConnectDomainXMLToNative and qemuDomainQemuAttach paths. For each of the other attach paths, there are the separate calls to the specific qemuAssignDevice*Alias functions. So it does seem that this qemu-attach path is the only current one without the aliases set.
The difficulty in assigning aliases is that the QEMU cli args may be in a format we don't expect. That said, when attaching we already split the args into two groups - those we can parse into something we understand and those we just leave as passthrough args. I guess we could make setting an alias be a prerequisite for the former group.
If there is some path in the future that doesn't set the alias properly, then at the very least we don't have a libvirtd crash at least for this path.
I think it is a good idea to be paranoid about missing alias values, though I wouldn't be surprised if there are many, many cases where a missing alias could cause us trouble.
FWIW: If I modify the patch as follows:
- if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE) && - detach->info.alias) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + if (!detach->info.alias) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + ("device alias not found: cannot detach net " + "device not properly attached")); + qemuDomainObjExitMonitor(driver, vm); + goto cleanup; + }
then rerun the sequence of steps, I get:
error: Failed to detach interface error: operation failed: device alias not found: cannot detach net device not properly attached
If I then add qemuAssignDeviceAliases() as described below and rerun, the error changes to:
error: Failed to detach interface error: operation failed: detaching net0 device failed: Device 'net0' not found
Since the original start didn't use the "-netdev/-device" syntax that would be expected if the DRIVE capability was set. So no worse than we were before.
Should I just post the additional patch calling qemuAssignDeviceAliases and go from there?
So you're testing with a setup where the user had used legacy -net setup. It is reasonable to say we will never expect that to succeeed at hotunplug, so from that point of view it doesn't matter whether we try assigning aliases or just catch the missing alias. If, however, we look at it from the POV of a user who *has* used the correct -netdev + -device syntax with qemu-attach, then it is reasonable to expect that hotunplug should succeeed. So for that scenario, calling the function qemuAssignDeviceAliases is the only approach that will work. So I guess this is the approach we should take, since it makes the good scenarios work and is no worse for the bad scenarios. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
participants (3)
-
Daniel P. Berrange
-
Eric Blake
-
John Ferlan