New subject: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility

Monday, 27 January 2014

[Adding libvirt list...]

Ian Jackson wrote:
...
 Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork:
SIGCHLD flexibility"):

> BTW, I only see the crash when the save/restore script is running.  I
> stopped the other scripts and domains, running only save/restore on a
> single domain, and see the crash rather quickly (within 10 iterations).
>     

 I'll look at the libvirt code, but:

 With a recurring timeout, how can you ever know it's cancelled ?
 There might be threads out there, which don't hold any locks, which
 are in the process of executing a callback for a timeout.  That might
 be arbitrarily delayed from the pov of the rest of the program.

 E.g.:

  Thread A                                             Thread B

    invoke some libxl operation
 X    do some libxl stuff
 X    register timeout (libxl)
 XV     record timeout info
 X    do some more libxl stuff
      ...
 X    do some more libxl stuff
 X    deregister timeout (libxl internal)
 X     converted to request immediate timeout
 XV     record new timeout info
 X      release libvirt event loop lock
                                             entering libvirt event loop
                                        V     observe timeout is immediate
                                        V      need to do callback
                                                call libxl driver

       entering libvirt event loop
  V     observe timeout is immediate
  V      need to do callback
          call libxl driver
            call libxl
   X          libxl sees timeout is live
   X          libxl does libxl stuff
          libxl driver deregisters
  V         record lack of timeout
          free driver's timeout struct
                                                call libxl
                                       X          libxl sees timeout is dead
                                       X          libxl does nothing
                                              libxl driver deregisters
                                        V       CRASH due to deregistering
                                        V        already-deregistered timeout

 If this is how things are, then I think there is no sane way to use
 libvirt's timeouts (!)

Looking at the libvirt code again, it seems a single thread services the
event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
see the same thread ID in all the timer and fd callbacks. One of the
libvirt core devs can correct me if I'm wrong.

Regards,
Jim

...
 In principle I guess the driver could keep its per-timeout structs
 around forever and remember whether they've been deregistered or not.
 Each one would have to have a lock in it.

 But if you think about it, if you have 10 threads all running the
 event loop and you set a timeout to zero, doesn't that mean that every
 thread's event loop should do the timeout callback as fast as it can ?
 That could be a lot of wasted effort.

 The best solution would appear to be to provide a non-recurring
 callback.

> I'm not so thrilled with the timeout handling code in the libvirt libxl
> driver.  The driver maintains a list of active timeouts because IIRC,
> there were cases when the driver received timeout deregistrations when
> calling libxl_ctx_free, at which point some of the associated structures
> were freed.  The idea was to call libxl_osevent_occurred_timeout on any
> active timeouts before freeing libxlDomainObjPrivate and its contents.
>     

 libxl does deregister fd callbacks in libxl_ctx_free.

 But libxl doesn't currently "deregister" any timeouts in
 libxl_ctx_free; indeed it would be a bit daft for it to do so as at
 libxl_ctx_free there are no aos running so there would be nothing to
 time out.

 But there is a difficulty with timeouts which libxl has set to occur
 immediately but which have not yet actually had the callback.  The the
 application cannot call libxl_ctx_free with such timeouts outstanding,
 because that would imply later calling back into libxl with a stale
 ctx.

 (Looking at the code I see that the "nexi" are never actually freed.
 Bah.)

 Thanks,
 Ian.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility