On Wed, Dec 17, 2014 at 01:58:28PM +0100, Ján Tomko wrote:
On 12/17/2014 01:21 PM, Daniel P. Berrange wrote:
> On Wed, Dec 17, 2014 at 01:18:41PM +0100, Michal Privoznik wrote:
>> On 16.12.2014 17:41, Ján Tomko wrote:
>>> The domain might disappear during the time in monitor when
>>> the virDomainObjPtr is unlocked, so the caller needs to check
>>> if it's still alive.
>>>
>>> Introduce a helper function for the callers that want to report
>>> an error in this case.
>>> ---
>>> src/qemu/qemu_domain.c | 17 +++++++++++++++++
>>> src/qemu/qemu_domain.h | 3 +++
>>> 2 files changed, 20 insertions(+)
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index 8dd427a..d074429 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -1606,6 +1606,23 @@ void qemuDomainObjExitMonitor(virQEMUDriverPtr
driver,
>>> qemuDomainObjExitMonitorInternal(driver, obj);
>>> }
>>>
>>> +/* obj must NOT be locked before calling
>>> + *
>>> + * Same as qemuDomainObjExitMonitor, but reports an error and
>>> + * returns -1 if the domain is no longer alive after exiting the monitor
>>
>> Can you state explicitly that this function needs to be called whenever the
>> domain object or vm->def, or privdata (any runtime info held within vm in
>> general) is used after ExitMonitor()?
>
Right, I can add this to qemu/THREADS.txt as well.
> Are there any cases where that is not true ? eg could we just avoid
> introducing this new function, by making the existing funtion check
> liveliness every time. That would avoid the class of bugs you outline
> above entirely.
I was worried about backports, but new code checking the return value would
fail to compile with the void function in older libvirt.
We should add ATTRIBUTE_RETURNCHECK to force all callers to check
the return status too
There are functions that only call EndJob and return after exiting
the
monitor. But if the domain crashed before the API even returned, can we really
claim success?
There are some APIs where we do
if (running)
do x
else
do y
even with those though, I think we're justified in reporting an error
if the domain stopped at any point in the 'do x' code path - after all
we're already doing to be doing that if it happens while we're issuing
a mointor API call. So this endjob check is only extending that window
of opportunity for errors by a couple more instructions.
Regards,
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|