On 06/15/2012 12:16 PM, Corey Bryant wrote:
> I think you need to honor flags so that the end use of the fd
will be as
> if qemu had directly opened the file, rather than just doing a blind dup
> with a resulting fd that is in a different state than the caller
> expected. I can think of at least the following cases (there may be
> more):
I was thinking libvirt would handle all the flag settings on open
(obviously since that's how I coded it). I think you're right with this
approach though as QEMU will re-open the same file various times with
different flags.
How will libvirt know whether qemu wanted to open a file with O_TRUNCATE
vs. reusing an entire file, or with O_NONBLOCK or not, and so forth? I
think there are just too many qmeu_open() calls with different flags to
expect libvirt to know how to pre-open all possible fds in such a manner
that /dev/fd/nnn will be a valid replacement for what qemu would have
done, in the cases where qemu can instead correct flags itself.
There are some flags that I don't think we'll be able to change. For
example: O_RDONLY, O_WRONLY, O_RDWR. I assume libvirt would open all
files O_RDWR.
I agree that this can't be changed, so this is one case where libvirt
will have to know what the file will used for. But it is also a case
where qemu can at least check whether the fd passed in has sufficient
permissions (fcntl(F_GETFL)) for what qemu wanted to use, and should
error out here rather than silently succeed here then have a weird EBADF
failure down the road when the dup'd fd is finally used. You are right
that libvirt should always be safe in passing in an O_RDWR fd for
whatever qemu wants, although that may represent security holes (there's
reasons for O_RDONLY); and in cases where libvirt is instead passing in
one end of a pipe, the fd will necessarily be O_RDONLY or O_WRONLY
(since you can't have an O_RDWR pipe).
> Oh, and are we using MSG_CMSG_CLOEXEC where possible (and where
not
> possible, falling back to fcntl(F_GETFD/F_SETFD) to set FD_CLOEXEC) on
> all fds received by 'getfd' and 'pass-fd'? I can't think of any
reason
> why 'migrate fd:name' would need to be inheritable, and in the case of
> /dev/fd/ parsing, while the dup() result may need to be inheritable, the
> original that we are dup'ing from should most certainly be cloexec.
It doesn't look like we use MSG_CMSG_CLOEXEC anywhere in QEMU. I don't
think we can modify getfd at this point (compatibility) but we could
update pass-fd to set it. It may make more sense to set it with
fcntl(FD_CLOEXEC) in qmp_pass_fd().
That's not atomic. If we don't care about atomicity (for example, if we
can guarantee that no other thread in qemu can possibly fork/exec in
between the monitor thread receiving an fd then converting it to
cloexec, based on how we hold a mutex), then that's fine. OR, if we
make sure _all_ fork() calls sanitize themselves, by closing all fds in
the getfd/pass-fd list prior to calling exec(), then we don't have to
even worry about cloexec (but then you have to worry about locking the
getfd name list, since locking a list to iterate it is not an async-safe
action and probably can't be done between fork() and exec()).
Otherwise, using MSG_CMSG_CLOEXEC is the only safe way to avoid leaking
unintended fds into another thread's child process.
>
> if (flags & O_NONBLOCK)
> use fcntl(F_GETFL/F_SETFL) to set O_NONBLOCK
> else
> use fcntl(F_GETFL/F_SETFL) to clear O_NONBLOCK
>
> or maybe we document that callers of pass-fd must always pass fds with
> O_NONBLOCK clear instead of clearing it ourselves. Or maybe we make
> sure part of the process of tying name with fd in the lookup list of
> named fds is determining the current O_NONBLOCK state in case future
> qemu_open() need it in the opposite state.
Just documenting it seems error-prone. Why not just set/clear it based
on the flag passed to qemu_open?
Yep, back to my argument that making libvirt have to know every context
that qemu will want to open, and what flags it would be using in those
contexts, is painful compared to having qemu just do the right thing in
the first place, or gracefully erroring out right away at the point of
the qemu_open(/dev/fd/nnn).
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org