Daniel P. Berrange wrote:
On Thu, Jan 17, 2013 at 10:29:35PM -0700, Jim Fehlig wrote:
> Jim Fehlig wrote:
>
>> I've been investigating some races in the libxl driver and would like to
>> get comments on some potential solutions.
>>
>> The first race is in the fd/timeout event handling code, which maps
>> libxl's osevent interface to libvirt's event loop interface. This
>> mapping opens the possibility for libvirt's event loop to invoke
>> event callbacks after libxl has already deregistered the event,
>> potentially accessing an event object that has already been freed.
>>
>> One solution to this race I've found successful is reference counting the
>> objects associated with the events. When libxl registers an event, an
>> object encapsulating the event is created and it's reference count is set
>> to 1. When the event is injected into libvirt's event loop, another
>> reference is taken on the object. When libxl deregisters the event, it's
>> reference count is decremented. Once the event is removed from libvirt's
>> event loop, the final reference is decremented and the object is disposed.
>>
>>
> While rebasing this solution to use danpb's recent virObjectLockable
> change, I revisited using only a lock in the libxlDomainObject,
> acquiring the lock in the registration, deregistration, modify, and
> callback functions. I recall some problems with this approach before,
> but now find it to be sufficient. Perhaps a misplaced lock drove me to
> the unneeded complexity of this double lock nonsense...
>
>
>> This approach ensures the object is not disposed until it is removed
>> from libvirt's event loop *and* libxl had explicitly deregistered the event.
>> The notion of an event being 'disabled' found in libvirt's event loop
impl
>> also had to be added to the libxl event object, to ensure the driver doesn't
>> call into libxl for a previously deregistered event.
>>
>> The second race is between destroying a vm (i.e., calling
>> privateDataFreeFunc, which frees the libxl ctx) and deregistration/cleanup
>> of all events that have been registered by libxl.
>>
>> One solution for this race is to convert libxlDomainObjPrivate to a
>> virObject and increment its reference for each fd and timeout registration.
>>
>>
> Only change here is using the new virObjectLockable.
>
I'm surprised that you need to have a separate lock for libxlDomainObjPrivate.
The lifetime of that object is tied to the lifetime of the virDomainObjPtr,
which already has a lock. Isn't it sufficient to do locking on the latter,
whenever the libxlDomainObjPrivate need protecting ?
Yeah, I thought it would be sufficient too, but fd/timeout events can
happen during vm operations where the virDomainObj is locked. E.g.
during vm creation, the virDomainObj is locked but fd/timeout events
occur as libxl reads/writes to xenstore, sets watches, reads a
kernel/initrd from host, etc. It seems cleaner to have a lock in the
libxlDomainObjPrivate for accessing libxl's event registrations.
Regards,
Jim