On 11/10/2010 07:43 AM, Stefan Berger wrote:
On 11/10/2010 05:41 AM, Daniel P. Berrange wrote:
> On Tue, Nov 09, 2010 at 07:43:23PM -0500, Stefan Berger wrote:
>> I am replacing the last instances of close() I found with VIR_CLOSE() /
>> VIR_FORCE_CLOSE respectively.
>>
>> The first patch of virsh I missed out on previously.
>>
>> The 2nd patch I had left out intentionally to look at it more
>> carefully:
>> The 'closed' variable could be easily removed since it wasn't used
>> anywhere else. The possible race condition that could result from the
>> filedescriptor being closed and not set to -1 (and possibly let us
>> write
>> into 'something' totally different if the fd was allocated by another
>> thread) seems to be prevented by the qemuMonitorLock() already placed
>> around the code that reads from or writes to the fd. So the change of
>> this code as shown in the patch should not have any side-effects.
>>
>> Signed-off-by: Stefan Berger<stefanb(a)us.ibm.com>
>> --- libvirt-acl.orig/src/qemu/qemu_monitor.c
>> +++ libvirt-acl/src/qemu/qemu_monitor.c
>> @@ -73,8 +73,6 @@ struct _qemuMonitor {
>>
>> /* If the monitor EOF callback is currently active (stops more
>> commands being run) */
>> unsigned eofcb: 1;
>> - /* If the monitor is in process of shutting down */
>> - unsigned closed: 1;
>>
>> unsigned json: 1;
>> };
>> @@ -692,17 +690,11 @@ void qemuMonitorClose(qemuMonitorPtr mon
>> VIR_DEBUG("mon=%p", mon);
>>
>> qemuMonitorLock(mon);
>> - if (!mon->closed) {
>> +
>> + if (mon->fd>= 0) {
>> if (mon->watch)
>> virEventRemoveHandle(mon->watch);
>> - if (mon->fd != -1)
>> - close(mon->fd);
>> - /* NB: ordinarily one might immediately set mon->watch to -1
>> - * and mon->fd to -1, but there may be a callback active
>> - * that is still relying on these fields being valid. So
>> - * we merely close them, but not clear their values and
>> - * use this explicit 'closed' flag to track this state */
>> - mon->closed = 1;
>> + VIR_FORCE_CLOSE(mon->fd);
>> }
> Err, the comment you deleted here explains why this change is not
> safe.
If I looked at the code correctly then this here is the callback:
static void
qemuMonitorIO(int watch, int fd, int events, void *opaque) {
qemuMonitorPtr mon = opaque;
int quit = 0, failed = 0;
qemuMonitorLock(mon);
qemuMonitorRef(mon);
#if DEBUG_IO
VIR_DEBUG("Monitor %p I/O on watch %d fd %d events %d", mon, watch, f
#endif
if (mon->fd != fd || mon->watch != watch) {
VIR_ERROR(_("event from unexpected fd %d!=%d / watch %d!=%d"), mo
failed = 1;
} else {
[...]
It grabs the lock and then subsequently, with the lock held, calls
into qemuMonitorIOWrite and qemuMonitorIORead (not shown, this is done
in the [...]), which then access the mon->fd. Now the above
VIR_FORCE_CLOSE() function is also being called with the lock held,
thus serializes the access to the file descriptor (mon->fd). When the
'if (mon->fd != fd)' is evaluate (while the lock is held) after the fd
was closed and now set to -1, then we won't read from a non-open file
descriptor anymore, which would otherwise occur in the else branch.
Otherwise, if fd->mon != -1, it would do the reads and writes as
intended and the above close could not happen while it is doing that.
When I made the above change I first introduced another lock
'mon->monFDLock' to serialize the access to the fd, but then I saw
that the qemuMonitorLock() was already doing that as well, so I ended
up removing it again entirely. Well, to me this was convincing so that
the above changes seemed 'safe'. Actually, I think the previous code
could write and read to a file descriptor that doesn't communicate
with the monitor anymore but may belong to another thread now (since
the mon->fd wasn't set to -1, but kept as-is). So, to me this is
actually an improvement.
Regards,
Stefan
Did I convince you?
Regards,
Stefan