[libvirt] ambiguous ret of qemuDomainDetachVirtioDiskDevice

static int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr detach) { ....... rc = qemuDomainWaitForDeviceRemoval(vm); if (rc == 0 || rc == 1) ret = qemuDomainRemoveDiskDevice(driver, vm, detach); else ret = 0; /*the return value of 2 is dismissed here, which refers to ETIMEOUT.*/ ........ } ------------------------------------ If it timeouts when qemu tries to del the device, the return value would be modified from 2 to 0 in function qemuDomainDetachVirtioDiskDevice(), which means that, the users would be misleaded that the device has been deleted, however, the device maybe probably failed to be detached after timeout and still in use. That is to say, the function qemuDomainDetachVirtioDiskDevice()'s return value is ambiguous when it's 0, maybe successful, or timeout. Will it be better to pass ETIMEOUT to user? or any other advises? for example, let users themselves dumpxml the guest to check whether the device has been actually detached or not? -- Oscar oscar.zhangbo@huawei.com

On Tue, Jul 28, 2015 at 04:25:13PM +0800, zhang bo wrote:
static int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr detach) { .......
rc = qemuDomainWaitForDeviceRemoval(vm); if (rc == 0 || rc == 1) ret = qemuDomainRemoveDiskDevice(driver, vm, detach); else ret = 0; /*the return value of 2 is dismissed here, which refers to ETIMEOUT.*/ ........ }
------------------------------------
If it timeouts when qemu tries to del the device, the return value would be modified from 2 to 0 in function qemuDomainDetachVirtioDiskDevice(), which means that, the users would be misleaded that the device has been deleted, however, the device maybe probably failed to be detached after timeout and still in use.
This is intentional and documented: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFla... Unplugging a disk requires guest cooperation, so the best we can do is ask qemu to unplug it and wait for a while.
That is to say, the function qemuDomainDetachVirtioDiskDevice()'s return value is ambiguous when it's 0, maybe successful, or timeout. Will it be better to pass ETIMEOUT to user? or any other advises? for example, let users themselves dumpxml the guest to check whether the device has been actually detached or not?
Either dump the XML, or wait for the VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event, as the API documentation suggests. Jan

On 2015/7/28 16:33, Ján Tomko wrote:
On Tue, Jul 28, 2015 at 04:25:13PM +0800, zhang bo wrote:
static int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr detach) { .......
rc = qemuDomainWaitForDeviceRemoval(vm); if (rc == 0 || rc == 1) ret = qemuDomainRemoveDiskDevice(driver, vm, detach); else ret = 0; /*the return value of 2 is dismissed here, which refers to ETIMEOUT.*/ ........ }
------------------------------------
If it timeouts when qemu tries to del the device, the return value would be modified from 2 to 0 in function qemuDomainDetachVirtioDiskDevice(), which means that, the users would be misleaded that the device has been deleted, however, the device maybe probably failed to be detached after timeout and still in use.
This is intentional and documented: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFla...
Unplugging a disk requires guest cooperation, so the best we can do is ask qemu to unplug it and wait for a while.
That is to say, the function qemuDomainDetachVirtioDiskDevice()'s return value is ambiguous when it's 0, maybe successful, or timeout. Will it be better to pass ETIMEOUT to user? or any other advises? for example, let users themselves dumpxml the guest to check whether the device has been actually detached or not?
Either dump the XML, or wait for the VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event, as the API documentation suggests.
Jan
It seems to have fixed the problem by dumping the XML or wait for the DEVICE_REMOVED event. However, it seems to make nova or other apps to do more checking work, they need to dump XML or wait the event even if the device has already been successfully removed. which is unnecessary. I think, it's better to return ETIMEOUT and let nova to dumpxml or wait the event at this situation, rather than always doing that job. It maybe a better design, what's your opinion? -- Oscar oscar.zhangbo@huawei.com

On 2015/7/30 17:41, zhang bo wrote:
On 2015/7/28 16:33, Ján Tomko wrote:
On Tue, Jul 28, 2015 at 04:25:13PM +0800, zhang bo wrote:
static int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr detach) { .......
rc = qemuDomainWaitForDeviceRemoval(vm); if (rc == 0 || rc == 1) ret = qemuDomainRemoveDiskDevice(driver, vm, detach); else ret = 0; /*the return value of 2 is dismissed here, which refers to ETIMEOUT.*/ ........ }
------------------------------------
If it timeouts when qemu tries to del the device, the return value would be modified from 2 to 0 in function qemuDomainDetachVirtioDiskDevice(), which means that, the users would be misleaded that the device has been deleted, however, the device maybe probably failed to be detached after timeout and still in use.
This is intentional and documented: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFla...
Unplugging a disk requires guest cooperation, so the best we can do is ask qemu to unplug it and wait for a while.
That is to say, the function qemuDomainDetachVirtioDiskDevice()'s return value is ambiguous when it's 0, maybe successful, or timeout. Will it be better to pass ETIMEOUT to user? or any other advises? for example, let users themselves dumpxml the guest to check whether the device has been actually detached or not?
Either dump the XML, or wait for the VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event, as the API documentation suggests.
Jan
It seems to have fixed the problem by dumping the XML or wait for the DEVICE_REMOVED event. However, it seems to make nova or other apps to do more checking work, they need to dump XML or wait the event even if the device has already been successfully removed. which is unnecessary.
I think, it's better to return ETIMEOUT and let nova to dumpxml or wait the event at this situation, rather than always doing that job.
It maybe a better design, what's your opinion?
After thinking twice, it's an async job, thus returning 0 is acceptable, right? -- Oscar oscar.zhangbo@huawei.com

On Fri, Jul 31, 2015 at 08:12:01AM +0800, zhang bo wrote:
On 2015/7/30 17:41, zhang bo wrote:
On 2015/7/28 16:33, Ján Tomko wrote:
On Tue, Jul 28, 2015 at 04:25:13PM +0800, zhang bo wrote:
static int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr detach) { .......
rc = qemuDomainWaitForDeviceRemoval(vm); if (rc == 0 || rc == 1) ret = qemuDomainRemoveDiskDevice(driver, vm, detach); else ret = 0; /*the return value of 2 is dismissed here, which refers to ETIMEOUT.*/ ........ }
------------------------------------
If it timeouts when qemu tries to del the device, the return value would be modified from 2 to 0 in function qemuDomainDetachVirtioDiskDevice(), which means that, the users would be misleaded that the device has been deleted, however, the device maybe probably failed to be detached after timeout and still in use.
This is intentional and documented: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFla...
Unplugging a disk requires guest cooperation, so the best we can do is ask qemu to unplug it and wait for a while.
That is to say, the function qemuDomainDetachVirtioDiskDevice()'s return value is ambiguous when it's 0, maybe successful, or timeout. Will it be better to pass ETIMEOUT to user? or any other advises? for example, let users themselves dumpxml the guest to check whether the device has been actually detached or not?
Either dump the XML, or wait for the VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event, as the API documentation suggests.
Jan
It seems to have fixed the problem by dumping the XML or wait for the DEVICE_REMOVED event. However, it seems to make nova or other apps to do more checking work, they need to dump XML or wait the event even if the device has already been successfully removed. which is unnecessary.
I think, it's better to return ETIMEOUT and let nova to dumpxml or wait the event at this situation, rather than always doing that job.
The app would have to listen to the event before issuing the API - otherwise the event might arrive after the client app processes the ETIMEOUT return, but before it registers the event. I think in this case it's simpler to process the event regardless of the return value.
It maybe a better design, what's your opinion?
After thinking twice, it's an async job, thus returning 0 is acceptable, right?
Yes, that was the intention of the patch adding waiting for the event. But for the most cases, where the guest unplugs the device under 5 seconds, the API is synchronnous. Before that, the API returned 0 regardless of whether the device was unplugged or not, so this did not make matters any worse. Jan
participants (2)
-
Ján Tomko
-
zhang bo