Re: [libvirt] [PATCH] replace last instances of close()

10 Nov 2010

On 11/10/2010 05:41 AM, Daniel P. Berrange wrote:
...
On Tue, Nov 09, 2010 at 07:43:23PM -0500, Stefan Berger wrote:
...
I am replacing the last instances of close() I found with VIR_CLOSE() /
VIR_FORCE_CLOSE respectively.
The first patch of virsh I missed out on previously.
The 2nd patch I had left out intentionally to look at it more carefully:
The 'closed' variable could be easily removed since it wasn't used
anywhere else. The possible race condition that could result from the
filedescriptor being closed and not set to -1 (and possibly let us write
into 'something' totally different if the fd was allocated by another
thread) seems to be prevented by the qemuMonitorLock() already placed
around the code that reads from or writes to the fd. So the change of
this code as shown in the patch should not have any side-effects.
Signed-off-by: Stefan Berger<stefanb@us.ibm.com>

--- libvirt-acl.orig/src/qemu/qemu_monitor.c
+++ libvirt-acl/src/qemu/qemu_monitor.c
@@ -73,8 +73,6 @@ struct _qemuMonitor {
/* If the monitor EOF callback is currently active (stops more
commands being run) */
      unsigned eofcb: 1;
-    /* If the monitor is in process of shutting down */
-    unsigned closed: 1;
unsigned json: 1;
  };
@@ -692,17 +690,11 @@ void qemuMonitorClose(qemuMonitorPtr mon
      VIR_DEBUG("mon=%p", mon);
qemuMonitorLock(mon);
-    if (!mon->closed) {
+
+    if (mon->fd>= 0) {
          if (mon->watch)
              virEventRemoveHandle(mon->watch);
-        if (mon->fd != -1)
-            close(mon->fd);
-        /* NB: ordinarily one might immediately set mon->watch to -1
-         * and mon->fd to -1, but there may be a callback active
-         * that is still relying on these fields being valid. So
-         * we merely close them, but not clear their values and
-         * use this explicit 'closed' flag to track this state */
-        mon->closed = 1;
+        VIR_FORCE_CLOSE(mon->fd);
      }
Err, the comment you deleted here explains why this change is not
safe.
If I looked at the code correctly then this here is the callback:

static void
qemuMonitorIO(int watch, int fd, int events, void *opaque) {
     qemuMonitorPtr mon = opaque;
     int quit = 0, failed = 0;

     qemuMonitorLock(mon);
     qemuMonitorRef(mon);
#if DEBUG_IO
     VIR_DEBUG("Monitor %p I/O on watch %d fd %d events %d", mon, watch, f
#endif

     if (mon->fd != fd || mon->watch != watch) {
         VIR_ERROR(_("event from unexpected fd %d!=%d / watch %d!=%d"), mo
         failed = 1;
     } else {
[...]


It grabs the lock and then subsequently, with the lock held, calls into 
qemuMonitorIOWrite and qemuMonitorIORead (not shown, this is done in the 
[...]), which then access the mon->fd. Now the above VIR_FORCE_CLOSE() 
function is also being called with the lock held, thus serializes the 
access to the file descriptor (mon->fd). When the 'if (mon->fd != fd)' 
is evaluate (while the lock is held) after the fd was closed and now set 
to -1, then we won't read from a non-open file descriptor anymore, which 
would otherwise occur in the else branch. Otherwise, if fd->mon != -1, 
it would do the reads and writes as intended and the above close could 
not happen while it is doing that.

When I made the above change I first introduced another lock 
'mon->monFDLock' to serialize the access to the fd, but then I saw that 
the qemuMonitorLock() was already doing that as well, so I ended up 
removing it again entirely. Well, to me this was convincing so that the 
above changes seemed 'safe'. Actually, I think the previous code could 
write and read to a file descriptor that doesn't communicate with the 
monitor anymore but may belong to another thread now (since the mon->fd 
wasn't set to -1, but kept as-is). So, to me this is actually an 
improvement.

Regards,
    Stefan
...
Regards,
Daniel