Re: [libvirt] [Qemu-devel] [PATCH v8 7/7] block: Enable qemu_open/close to work with fd sets

11 Aug 2012


      On 08/10/2012 02:16 AM, Eric Blake wrote:
...
On 08/09/2012 08:10 PM, Corey Bryant wrote:
...
When qemu_open is passed a filename of the "/dev/fdset/nnn"
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.
Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
s/desriptors/descriptors/
Thanks
...
...
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.
...
+int monitor_fdset_get_fd(int64_t fdset_id, int flags)
+{
+    MonFdset *mon_fdset;
+    MonFdsetFd *mon_fdset_fd;
+    int mon_fd_flags;
+
+    QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            if (mon_fdset_fd->removed) {
+                continue;
+            }
Is this right?  According to the commit message, the whole point of
leaving an fd in the set is to allow the fd to be reused as a dup()
target for as long as the fdset is alive, even if the monitor no longer
cares about the existence of the fd.  But this will always skip over an
fd marked for removal.  Maybe this function needs a flag to say whether
this is an initial open driven by an explicit user string (in which
case, honor the removed flag - if the user removed the O_RDWR fd and
then tries a drive_add with the same fdset, the drive_add should fail
because from the user's perspective, there is no O_RDWR fd in the set);
vs. an internal usage due to a reopen (use an fd even if removed is
true, because we may be toggling between O_RDWR and O_RDONLY multiple
times long after the monitor has already removed the fdset, based on
actions that were not drive by an explicit /dev/fdset name.)
Hmm..  something needs to change here, either the commit message or the 
code.

For security purposes, I'm thinking that an fd should no longer be 
available for opening after libvirt issues a remove-fd command, with no 
exceptions.  That allows libvirt to have complete control over usage of 
an fd.  Note that other fds in the fd set could still be used on a 
reopen, assuming remove-fd hasn't been called for them.

Does that work for you?  In that case, the code will remain as-is.

Apologies if I've sent conflicting messages on this in the past.
...
...
+
+            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
+            if (mon_fd_flags == -1) {
+                return -1;
+            }
+
+            if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
+                return mon_fdset_fd->fd;
+            }
I still wonder if a request for O_RDONLY should be satisfied by an
existing O_RDWR fd, especially in light of the fact that libvirt would
rather pass in only one RDWR fd but qemu block opening currently opens
twice during probing.  But if it turns out to be a problem in practice,
and if libvirt can't really manage to pass two fds into the set, we can
hack that in later.  Meanwhile, I'm okay with this first round patch
requiring an exact match.
I see your point but I think allowing subset access mode matches could 
allow for a client to get lazy and cause security implications (client 
only adds RW fd's and QEMU only needs R access in some cases).  So I'll 
keep this as is.
...
...
@@ -87,6 +151,39 @@ int qemu_open(const char *name, int flags, ...)
      int ret;
      int mode = 0;
+#ifndef _WIN32
+    const char *fdset_id_str;
+
+    /* Attempt dup of fd from fd set */
+    if (strstart(name, "/dev/fdset/", &fdset_id_str)) {
+        int64_t fdset_id;
+        int fd, dupfd;
+
+        fdset_id = qemu_parse_fdset(fdset_id_str);
+        if (fdset_id == -1) {
+            errno = EINVAL;
+            return -1;
+        }
+
+        fd = monitor_fdset_get_fd(fdset_id, flags);
+        if (fd == -1) {
+            return -1;
+        }
+
+        dupfd = qemu_dup(fd, flags);
+        if (fd == -1) {
+            return -1;
+        }
+
+        ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
+        if (ret == -1) {
+            return -1;
Leaks dupfd (admittedly only on a corner-case failure, but still worth
addressing).
Thanks, I'll fix this.

-- 
Regards,
Corey