Re: [libvirt] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

Wednesday, 25 July 2012

On 07/25/2012 03:43 PM, Eric Blake wrote:
...
 On 07/23/2012 07:08 AM, Corey Bryant wrote:
> When qemu_open is passed a filename of the "/dev/fdset/nnn"
> format (where nnn is the fdset ID), an fd with matching access
> mode flags will be searched for within the specified monitor
> fd set.  If the fd is found, a dup of the fd will be returned
> from qemu_open.
>
> Each fd set has a reference count.  The purpose of the reference
> count is to determine if an fd set contains file descriptors that
> have open dup() references that have not yet been closed.  It is
> incremented on qemu_open and decremented on qemu_close.  It is
> not until the refcount is zero that file desriptors in an fd set
> can be closed.  If an fd set has dup() references open, then we
> must keep the other fds in the fd set open in case a reopen
> of the file occurs that requires an fd with a different access
> mode.
>

> +++ b/monitor.c
> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool
in_use)
>       }
>   }
>
> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
> +{
> +    mon_fdset_t *mon_fdset;
> +
> +    if (!mon) {
> +        return;
> +    }

 Am I reading this code right by stating that 'if there is no monitor, we
 don't increment the refcount'?  How does a monitor reattach affect
 things?  Or am I missing something fundamental about the cases when
 'mon==NULL' will exist?

Yes you're reading this correctly.

I'm pretty sure that mon will only be NULL if QEMU is started without a 
monitor.

If QEMU has a monitor, and libvirt disconnects it's connection to the 
qemu monitor, then I believe mon will remain non-NULL.

I'll plan on testing this out to verify though.  (I'm out most of this 
week and will be back full time starting next Tues.)

...
> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int
flags)
> +{
> +    mon_fdset_t *mon_fdset;
> +    mon_fdset_fd_t *mon_fdset_fd;
> +    int mon_fd_flags;
> +
> +    if (!mon) {
> +        errno = ENOENT;
> +        return -1;
> +    }
> +
> +    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> +        if (mon_fdset->id != fdset_id) {
> +            continue;
> +        }
> +        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
> +            if (mon_fdset_fd->removed) {
> +                continue;
> +            }
> +
> +            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
> +            if (mon_fd_flags == -1) {
> +                return -1;

 This says we fail on the first fcntl() failure, instead of trying other
 fds in the set.  Granted, an fcntl() failure is probably the sign of a
 bigger bug (such as closing an fd at the wrong point in time), so I
 guess trying to go on doesn't make much sense once we already know we
 are hosed.

I think I'll stick with it the way it is.  If fcntl() fails we might 
have a tainted fd set so I think we should fail.

...
> +            }
> +
> +            switch (flags & O_ACCMODE) {
> +            case O_RDWR:
> +                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
> +                    return mon_fdset_fd->fd;
> +                }
> +                break;
> +            case O_RDONLY:
> +                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
> +                    return mon_fdset_fd->fd;
> +                }
> +                break;

 Do we want to allow the case where the caller asked for O_RDONLY, but
 the set only has O_RDWR?  After all, the caller is getting a compatible
 subset of what the set offers.

I don't see a problem with it.

...
> +            case O_WRONLY:
> +                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
> +                    return mon_fdset_fd->fd;
> +                }
> +                break;

 Likewise, should we allow a caller asking for O_WRONLY when the set
 provides only O_RDWR?

I don't see a problem with it.

...
>
> +/*
> + * Dups an fd and sets the flags
> + */
> +static int qemu_dup(int fd, int flags)
> +{
> +    int i;
> +    int ret;
> +    int serrno;
> +    int dup_flags;
> +    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
> +                          O_NONBLOCK, 0 };
> +
> +    if (flags & O_CLOEXEC) {
> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);

 F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
 OS yet; you probably need some #ifdef and/or configure guards.

Ok

...
> +        if (ret == -1 && errno == EINVAL) {
> +            ret = dup(fd);
> +            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {

 You _can't_ call F_SETFL with O_CLOEXEC.  O_CLOEXEC only causes open()
 to set FD_CLOEXEC; thereafter, including in the case of this dup, what
 you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
 qemu_set_cloexec, not fcntl_setfl).

I know, this is a mistake.  I'm planning to replace fcntl_setfl() here 
with qemu_set_cloexec().

...
> +                goto fail;
> +            }
> +        }
> +    } else {
> +        ret = dup(fd);
> +    }
> +
> +    if (ret == -1) {
> +        goto fail;
> +    }
> +
> +    dup_flags = fcntl(ret, F_GETFL);
> +    if (dup_flags == -1) {
> +        goto fail;
> +    }
> +
> +    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
> +        errno = EINVAL;
> +        goto fail;
> +    }
> +
> +    /* Set/unset flags that we can with fcntl */
> +    i = 0;
> +    while (setfl_flags[i] != 0) {
> +        if (flags & setfl_flags[i]) {
> +            dup_flags = (dup_flags | setfl_flags[i]);
> +        } else {
> +            dup_flags = (dup_flags & ~setfl_flags[i]);
> +        }
> +        i++;
> +    }

 Rather than looping one bit at a time, you should do this as a mask
 operation.

I agree.

...
> +
> +    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
> +        goto fail;
> +    }
> +
> +    /* Truncate the file in the cases that open() would truncate it */
> +    if (flags & O_TRUNC ||
> +            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
> +        if (ftruncate(ret, 0) == -1) {
> +            goto fail;
> +        }
> +    }
> +
> +    qemu_set_cloexec(ret);

 If we're going to blindly set FD_CLOEXEC at the end of the day, rather
 than try to honor O_CLOEXEC, then why not simplify the beginning of this
 function: 
This call to qemu_set_cloexec() was a mistake.  I'm planning on removing it.

...

      ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
      if (ret == -1 && errno == EINVAL) {
          ret = dup(fd);
          if (ret != -1) {
              qemu_set_cloexec(ret);
          }
      }
      if (ret == -1) {
          goto fail;
      }

I'll plan on sticking with the existing code in the beginning of this 
function with the modifications mentioned above.

-- 
Regards,
Corey

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets