On Mon, Jan 27, 2014 at 06:39:36PM -0700, Jim Fehlig wrote:
[Adding libvirt list...]
Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD
> flexibility"):
>
>> BTW, I only see the crash when the save/restore script is running. I
>> stopped the other scripts and domains, running only save/restore on a
>> single domain, and see the crash rather quickly (within 10 iterations).
>>
>
> I'll look at the libvirt code, but:
>
> With a recurring timeout, how can you ever know it's cancelled?
> There might be threads out there, which don't hold any locks, which
> are in the process of executing a callback for a timeout. That might
> be arbitrarily delayed from the point of view of the rest of the program.
>
> E.g.:
>
>       Thread A                            Thread B
>
>     invoke some libxl operation
> X     do some libxl stuff
> X     register timeout (libxl)
> XV      record timeout info
> X     do some more libxl stuff
>       ...
> X     do some more libxl stuff
> X     deregister timeout (libxl internal)
> X       converted to request immediate timeout
> XV      record new timeout info
> X       release libvirt event loop lock
>                                           entering libvirt event loop
>                                        V    observe timeout is immediate
>                                        V      need to do callback
>                                                 call libxl driver
>
>     entering libvirt event loop
>  V    observe timeout is immediate
>  V      need to do callback
>           call libxl driver
>             call libxl
> X             libxl sees timeout is live
> X             libxl does libxl stuff
>           libxl driver deregisters
>  V          record lack of timeout
>           free driver's timeout struct
>                                                 call libxl
>                                       X           libxl sees timeout is dead
>                                       X           libxl does nothing
>                                               libxl driver deregisters
>                                        V        CRASH due to deregistering
>                                        V        already-deregistered timeout
>
> If this is how things are, then I think there is no sane way to use
> libvirt's timeouts (!)
>
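
For readers following along, the interleaving above can be replayed with a
toy model. This is only a sketch under stated assumptions: TimeoutTable and
replay_interleaving are hypothetical names, not libvirt or libxl code, and
the table simply treats a second deregistration of the same timeout as a
fatal error, which is the failure the diagram ends on.

```python
class TimeoutTable:
    """Toy model of an event loop's timeout table (hypothetical)."""

    def __init__(self):
        self._live = {}
        self._next_id = 1

    def register(self, callback):
        timer_id = self._next_id
        self._next_id += 1
        self._live[timer_id] = callback
        return timer_id

    def deregister(self, timer_id):
        # Deregistering twice is treated as fatal, like the CRASH step above.
        if timer_id not in self._live:
            raise RuntimeError(f"timeout {timer_id} already deregistered")
        del self._live[timer_id]


def replay_interleaving():
    table = TimeoutTable()
    timer_id = table.register(lambda: None)
    # Assumption: both threads observed the immediate timeout before
    # either one acted on it, so both go on to deregister it.
    observed_by = ["A", "B"]
    outcome = {}
    for thread in observed_by:
        try:
            table.deregister(timer_id)
            outcome[thread] = "deregistered"
        except RuntimeError as err:
            outcome[thread] = f"crash: {err}"
    return outcome
```

Calling replay_interleaving() reports a clean deregistration for thread A
and the double-deregistration error for thread B. The model only reaches
that state if two threads can both be inside the event loop at once.
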
Looking at the libvirt code again, it seems a single thread services the
event loop. See virNetServerRun() in src/rpc/virnetserver.c. Indeed, I
see the same thread ID in all the timer and fd callbacks. One of the
libvirt core devs can correct me if I'm wrong.

Yes, you are correct. The threading model for libvirtd is that the
process thread leader executes the event loop, dealing with timer
and file descriptor I/O callbacks. There are also 'n' worker threads
which exclusively handle public API calls from libvirt clients. In other
words, all your timer callbacks will run in one thread, which also means
you want your timer callbacks to be fast to execute. If you have any
very slow code, you'll want to spawn a temporary worker thread from
the timer callback to do the slow work.
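
As a rough illustration of that last point, here is a minimal sketch of a
timer callback that hands slow work to a temporary thread. It assumes a
toy single-threaded loop: EventLoop, timer_callback, slow_work and demo are
hypothetical names for this example and are not libvirt APIs. The slow work
is simulated by waiting on an event, so the example stays deterministic.

```python
import threading


class EventLoop:
    """Hypothetical single-threaded loop; a stand-in, not libvirt's code."""

    def __init__(self):
        self._due = []

    def add_timeout(self, callback):
        self._due.append(callback)

    def run_once(self):
        # Every due callback runs here, on the thread that drives the loop.
        due, self._due = self._due, []
        for callback in due:
            callback()


def slow_work(release, results):
    # Stand-in for a slow operation: blocks until the demo releases it.
    release.wait()
    results["worker_thread"] = threading.get_ident()


def timer_callback(release, results, workers):
    # Keep the callback short: note where it ran, hand off, return.
    results["callback_thread"] = threading.get_ident()
    worker = threading.Thread(target=slow_work, args=(release, results))
    worker.start()
    workers.append(worker)


def demo():
    release = threading.Event()
    results, workers = {}, []
    loop = EventLoop()
    loop.add_timeout(lambda: timer_callback(release, results, workers))
    loop.run_once()
    # The callback has already returned while the slow work is still pending.
    results["slow_work_finished_before_return"] = "worker_thread" in results
    release.set()
    for worker in workers:
        worker.join()
    results["loop_thread"] = threading.get_ident()
    return results
```

In demo(), the callback runs on the thread driving the loop and returns
before the slow work completes, while the slow work itself runs on a
different thread. Anything the worker shares with the loop thread still
needs its own locking, which this sketch leaves out.
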
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|