On Thu, Jan 15, 2009 at 03:12:11PM +0000, Daniel P. Berrange wrote:
On Thu, Jan 15, 2009 at 04:03:16PM +0100, Daniel Veillard wrote:
> On Tue, Jan 13, 2009 at 05:41:48PM +0000, Daniel P. Berrange wrote:
> > This patch re-writes the code for dispatching RPC calls in the
> > remote driver to allow use from multiple threads. Only one thread
> > is allowed to send/recv on the socket at a time though. If another
> > thread comes along it will put itself on a queue and go to sleep.
> > The first thread may actually get around to transmitting the 2nd
> > thread's request while it is waiting for its own reply. It may
> > even get the 2nd thread's reply, if its own RPC call is being really
> > slow. So when a thread wakes up from sleeping, it has to check
> > whether its own RPC call has already been processed. Likewise when
> > a thread owning the socket finishes with its own work, it may have
> > to pass the buck to another thread. The upshot of this is that
> > we have multiple RPC calls executing in parallel, and requests+replies
> > are no longer guaranteed to be FIFO on the wire if talking to a new
> > enough server.
> >
> > This refactoring required use of a self-pipe/poll trick for sync
> > between threads, but fortunately gnulib now provides this on Windows
> > too, so there's no compatibility problem there.
>
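The dispatch scheme described in the patch summary can be sketched roughly
as follows. This is only an illustration in Python, not the remote driver
code: the Client/Call classes, the 8-byte (serial, value) wire format and
the doubling echo server are all assumptions made up for the example, and a
socketpair stands in for the self-pipe so that select() can watch it on any
platform. The head of the waiting list owns the socket; other callers queue
themselves, poke the owner and sleep until they are either done or have
been passed the buck.

```python
import select
import socket
import struct
import threading

MSG = struct.Struct("!II")  # hypothetical wire format: (serial, value)

def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

class Call:
    def __init__(self, serial, value, lock):
        self.serial, self.value = serial, value
        self.sent = self.done = self.has_buck = False
        self.reply = None
        self.cond = threading.Condition(lock)

class Client:
    def __init__(self, sock):
        self.sock = sock
        self.lock = threading.Lock()
        self.waiting = []      # head of the list owns the socket
        self.next_serial = 0
        # Wakeup channel standing in for the self-pipe.
        self.wake_r, self.wake_w = socket.socketpair()

    def call(self, value):
        with self.lock:
            me = Call(self.next_serial, value, self.lock)
            self.next_serial += 1
            self.waiting.append(me)
            if len(self.waiting) > 1:
                # Another thread owns the socket: poke it, then sleep.
                self.wake_w.send(b"x")
                while not (me.done or me.has_buck):
                    me.cond.wait()
                if me.done:        # the owner already handled our call
                    return me.reply
            self._io_loop(me)      # we own the socket, or were passed the buck
            return me.reply

    def _io_loop(self, me):
        # Entered with self.lock held; dropped only while blocked in select().
        while not me.done:
            for c in self.waiting:
                if not c.sent:
                    self.sock.sendall(MSG.pack(c.serial, c.value))
                    c.sent = True
            self.lock.release()
            try:
                ready, _, _ = select.select([self.sock, self.wake_r], [], [])
            finally:
                self.lock.acquire()
            if self.wake_r in ready:
                self.wake_r.recv(64)   # drain; new calls are sent next pass
            if self.sock in ready:
                serial, reply = MSG.unpack(recv_exact(self.sock, MSG.size))
                for c in self.waiting:
                    if c.serial == serial:
                        c.reply, c.done = reply, True
                        if c is not me:    # someone else's reply: wake them
                            self.waiting.remove(c)
                            c.cond.notify()
                        break
        self.waiting.remove(me)
        if self.waiting:               # pass the buck to the next sleeper
            self.waiting[0].has_buck = True
            self.waiting[0].cond.notify()

def echo_server(sock):
    # Stand-in server: answers each request with the value doubled.
    try:
        while True:
            serial, value = MSG.unpack(recv_exact(sock, MSG.size))
            sock.sendall(MSG.pack(serial, value * 2))
    except (ConnectionError, OSError):
        pass

def demo(n=8):
    a, b = socket.socketpair()
    threading.Thread(target=echo_server, args=(b,), daemon=True).start()
    client = Client(a)
    results = [None] * n
    def worker(i):
        results[i] = client.call(i)
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling demo() starts several threads that share one Client; each gets back
the reply matching its own serial, whichever thread happened to read it off
the socket.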
> The new code is actually a bit easier to read than the old one I
> think, but I didn't grasp all the details I must admit.
> The only worry I have with the "pass the buck" scheme is the
> piling up of recursive calls. I don't think there is any risk with
> the normal libvirt APIs as they are all 'terminal calls' in a sense,
> but I'm wondering what's happening say in conjunction with a high
> flow of events back to a client, the client doing calls as a result
> of the events etc ... Seems we are safe because no direct call from
> within the library is triggered by the reception of an event.
We shouldn't get recursion here. There are two scenarios:

 1. A thread is in the call() function when an event arrives
     -> The event is put on a queue, and a 0 second timer is
        activated in the app's event loop
     -> Once call() finishes, the 0 second timer fires, and the
        event is dispatched to the app.

 2. An event arrives when no one is in the call() function.
     -> The event is dispatched to the app straightaway

Now, the app's callback which receives the event may in turn
make libvirt calls. This won't cause any recursion because the
two scenarios above both guarantee that the event callback is
not run from within the context of a call() command.

When processing queued events we are also careful to handle our
data structures such that a different thread can still safely
make calls / receive & queue more events.

There's always a possible impl bug in there, but I believe the general
structure / design is correctly coping with the necessary scenarios.
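The two scenarios above can be illustrated with a small single-threaded
sketch. Everything here is hypothetical and written in Python purely for
illustration: EventLoop is a toy stand-in for the app's event loop,
Connection.call() fakes an RPC during which events may arrive, and none of
the names correspond to real libvirt functions. The thread-safety aspect
mentioned above is not modelled.

```python
from collections import deque

class EventLoop:
    """Toy stand-in for the application's event loop."""
    def __init__(self):
        self.timers = deque()

    def add_timeout(self, delay, fn):
        # Only 0 second timers are modelled in this sketch.
        self.timers.append(fn)

    def run_pending(self):
        while self.timers:
            self.timers.popleft()()

class Connection:
    def __init__(self, loop, on_event):
        self.loop = loop
        self.on_event = on_event
        self.in_call = False
        self.pending = deque()
        self.timer_armed = False

    def call(self, name, incoming_events=()):
        # The callback must never run from inside call().
        assert not self.in_call, "call() re-entered from an event callback"
        self.in_call = True
        try:
            for ev in incoming_events:   # events arriving before our reply
                self.event_arrived(ev)
            return "reply:" + name
        finally:
            self.in_call = False

    def event_arrived(self, ev):
        if self.in_call:
            # Scenario 1: queue the event and arm a 0 second timer.
            self.pending.append(ev)
            if not self.timer_armed:
                self.timer_armed = True
                self.loop.add_timeout(0, self._flush)
        else:
            # Scenario 2: nobody is in call(), dispatch straight away.
            self.on_event(self, ev)

    def _flush(self):
        # Detach the queue first, so callbacks that make calls can safely
        # queue more events and re-arm the timer.
        self.timer_armed = False
        events, self.pending = self.pending, deque()
        for ev in events:
            self.on_event(self, ev)

def demo_events():
    log = []

    def on_event(conn, ev):
        log.append(ev)
        if ev == "started":
            # The app's callback makes a call; another event arrives during it.
            conn.call("get_info", incoming_events=["stopped"])

    loop = EventLoop()
    conn = Connection(loop, on_event)
    conn.call("create", incoming_events=["started"])
    queued_only = (log == [])    # nothing dispatched from inside call()
    loop.run_pending()
    conn.event_arrived("idle")
    return queued_only, log
```

In demo_events() the callback for "started" itself makes a call, yet the
assertion in call() never trips, because the callback only ever runs from
the timer or from direct dispatch.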
Okay, thanks. I agree that the best thing at this point is to push it to
get broader testing.
Daniel
--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel(a)veillard.com | Rpmfind RPM search engine  http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/