
On Wed, Dec 17, 2014 at 01:58:28PM +0100, Ján Tomko wrote:
On 12/17/2014 01:21 PM, Daniel P. Berrange wrote:
On Wed, Dec 17, 2014 at 01:18:41PM +0100, Michal Privoznik wrote:
On 16.12.2014 17:41, Ján Tomko wrote:
The domain might disappear during the time in monitor when the virDomainObjPtr is unlocked, so the caller needs to check if it's still alive.
Introduce a helper function for the callers that want to report an error in this case. --- src/qemu/qemu_domain.c | 17 +++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ 2 files changed, 20 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 8dd427a..d074429 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1606,6 +1606,23 @@ void qemuDomainObjExitMonitor(virQEMUDriverPtr driver, qemuDomainObjExitMonitorInternal(driver, obj); }
+/* obj must NOT be locked before calling + * + * Same as qemuDomainObjExitMonitor, but reports an error and + * returns -1 if the domain is no longer alive after exiting the monitor
Can you state explicitly that this function needs to be called whenever the domain object or vm->def, or privdata (any runtime info held within vm in general) is used after ExitMonitor()?
Right, I can add this to qemu/THREADS.txt as well.
Are there any cases where that is not true ? eg could we just avoid introducing this new function, by making the existing funtion check liveliness every time. That would avoid the class of bugs you outline above entirely.
I was worried about backports, but new code checking the return value would fail to compile with the void function in older libvirt.
We should add ATTRIBUTE_RETURNCHECK to force all callers to check the return status too
There are functions that only call EndJob and return after exiting the monitor. But if the domain crashed before the API even returned, can we really claim success?
There are some APIs where we do if (running) do x else do y even with those though, I think we're justified in reporting an error if the domain stopped at any point in the 'do x' code path - after all we're already doing to be doing that if it happens while we're issuing a mointor API call. So this endjob check is only extending that window of opportunity for errors by a couple more instructions. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|