[libvirt] FD passing for chardevs and chardev backend multiplexing

Historically libvirt has connected stdout & stderr from QEMU directly to a plain file (/var/log/libvirt/qemu/$GUESTNAME.log). This has worked well enough in general, but is susceptible to a guest denial of service if the guest can cause QEMU to spew messages to stderr. There are enough places in QEMU and SPICE that still print to stderr that this isn't very hard to achieve.
So in libvirt 1.3.0 we introduce a new daemon 'virtlogd' which is used for handling log file writing. When this is used, QEMU's stdout/stderr will be connected to an anonymous pipe file descriptor, the other end of which is held by the virtlogd daemon. The virtlogd daemon will only permit a fixed file size to be created before rotating the log file, so we no longer have the possibility of unbounded disk usage, which is nice.
I'm now looking to extend the use of 'virtlogd' to also handle character devices. OpenStack has historically configured the primary serial port to log to a file in order to capture kernel boot up messages and later report them to the user via its API. This serial port file backend is of course susceptible to the same disk space denial of service. We don't really want to push file rotation logic into QEMU because that would involve giving QEMU permission to create / rename files, which is undesirable from a security POV. We also prefer a solution that ideally works with existing QEMU builds.
So for this my plan is to stop using the QEMU 'file' backend for char devs and instead pass across a pre-opened file descriptor, connected to virtlogd. There is no "officially documented" way to pass in a file descriptor to QEMU chardevs, but since QEMU uses qemu_open(), we can make use of the fdset feature to achieve this.
eg, consider fd 33 is the write end of a pipe file descriptor I can (in theory) do
-add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
Now in practice this doesn't work, because qmp_chardev_open_file() passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() call will fail when using the pipe FD passed in via fdsets.
After more investigation I found it *is* possible to use a socketpair and a pipe backend though...
-add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2
..because for reasons I don't understand, if QEMU can't open $PATH.in and $PATH.out, it'll fall back to just opening $PATH in read-write mode. AFAICT, this is pretty useless with pipes since they are unidirectional, but it works nicely with socketpairs, where virtlogd has one end of the socketpair and QEMU gets passed the other via fdset.
I can easily check this works for historical QEMU versions back to when fdset support was added to chardevs, but I'm wondering if the QEMU maintainers consider this usage acceptable over the long term, and if so, should we explicitly document it as supported?
If not, should we introduce a more explicit syntax for passing in a pre-opened FD for chardevs? eg
-add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2
Or just make -chardev file,id=charserial0,path=/dev/fdset/2 actually work?
Or something else?
OpenStack has a further requirement to allow use of the serial port as an interactive console at the same time that it is logging to a file, which is something QEMU can't support at all currently. This essentially means being able to have multiple chardev backends all connected to the same serial frontend - specifically we would need a TCP backend and a file backend concurrently. Again this could be implemented in QEMU, but we'd prefer something that works with existing QEMU.
This is not too difficult to achieve with virtlogd really. Instead of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX socket listening to the virtlogd daemon.
This is portable with existing QEMU versions, but the obvious downside with this is extra copies in the interactive console path. So might it be worth exploring the possibility of a chardev multiplexor in QEMU? We would still pass in a pre-opened socketpair to QEMU for the logging side of things, but would leave the TCP/UNIX socket listening up to QEMU still.
eg should we make something like this work:
-add-fd set=2,fd=33
-chardev pipe,id=charserial0file,path=/dev/fdset/2
-chardev socket,id=charserial0tcp,host=127.0.0.1,port=9999,telnet,server,nowait
-chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1
-serial isa-serial,chardev=charserial0,id=serial0
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
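For readers who want to try the socketpair approach from the message above, here is a minimal sketch of the management side in Python. It is an illustration only, not libvirt or virtlogd code: the helper name build_chardev_args and its defaults are hypothetical, and the only QEMU options used are the ones quoted in the message.

```python
import os
import socket


def build_chardev_args(fd, fdset_id=2, chardev_id="charserial0"):
    """Return QEMU arguments that hand `fd` to a 'pipe' chardev via an fdset.

    Hypothetical helper: only the option strings come from the thread.
    """
    return [
        "-add-fd", f"set={fdset_id},fd={fd}",
        "-chardev", f"pipe,id={chardev_id},path=/dev/fdset/{fdset_id}",
    ]


# One end stays with the logging daemon, the other is meant for QEMU.
logger_end, qemu_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Python creates descriptors close-on-exec; clear that so a child process
# can inherit the descriptor under the same number.
os.set_inheritable(qemu_end.fileno(), True)

args = build_chardev_args(qemu_end.fileno())
print(" ".join(args))

# Anything written on the QEMU side is readable on the logger side.
qemu_end.sendall(b"serial output\n")
received = logger_end.recv(64)
```

To actually launch a guest, the arguments would be appended to the QEMU command line and the descriptor kept open across exec, for example with subprocess.Popen(..., pass_fds=[qemu_end.fileno()]). The QEMU binary name and the rest of the command line are left out here because they depend on the host.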

On 12/08/2015 07:59 AM, Daniel P. Berrange wrote:
So for this my plan is to stop using the QEMU 'file' backend for char devs and instead pass across a pre-opened file descriptor, connected to virtlogd. There is no "officially documented" way to pass in a file descriptor to QEMU chardevs, but since QEMU uses qemu_open(), we can make use of the fdset feature to achieve this.
eg, consider fd 33 is the write end of a pipe file descriptor I can (in theory) do
-add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
Now in practice this doesn't work, because qmp_chardev_open_file() passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() call will fail when using the pipe FD passed in via fdsets.
Is it just the O_TRUNC that is failing? If so, there is a recent patch to add an 'append':true flag that switches O_TRUNC off in favor of O_APPEND: https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00762.html Or is it that the pipe is one-way, but chardev insists on O_RDWR and fails because it is not two-way?
After more investigation I found it *is* possible to use a socketpair and a pipe backend though...
-add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2
Yes, a socketpair is bi-directional, so it supports O_RDWR opening.
..because for reasons I don't understand, if QEMU can't open $PATH.in and $PATH.out, it'll fall back to just opening $PATH in read-write mode. AFAICT, this is pretty useless with pipes since they are unidirectional, but it works nicely with socketpairs, where virtlogd has one end of the socketpair and QEMU gets passed the other via fdset.
Is it something where we'd want to support two pipes, and open /dev/fdset/2 tied to char.in and /dev/fdset/3 tied to char.out, where uni-directional pipes are again okay?
I can easily check this works for historical QEMU versions back to when fdset support was added to chardevs, but I'm wondering if the QEMU maintainers consider this usage acceptable over the long term, and if so, should we explicitly document it as supported?
It seems like a bi-directional socketpair as the single endpoint for a chardev is useful enough to support and document, but I'm not the maintainer to give final say-so.
If not, should we introduce a more explicit syntax for passing in a pre-opened FD for chardevs ? eg
-add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2
Difference to the line you tried above:
-add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
is 'fd' instead of 'file'. But if we're going to add a new protocol, do we even need to go through the "/dev/fdset/..." name, or can we just pass the fd number directly?
Or just make -chardev file,id=charserial0,path=/dev/fdset/2 actually work ?
I'd lean more to this case - the whole point of fdsets was that we don't have to add multiple fd protocols; that everyone that understood file syntax and uses qemu_open() magically gained fd support.
Or something else ?
OpenStack has a further requirement to allow use of the serial port as an interactive console, at the same time that it is logging to a file which is something QEMU can't support at all currently. This essentially means being able to have multiple chardev backends all connected to the same serial frontend - specifically we would need a TCP backend and a file backend concurrently. Again this could be implemented in QEMU, but we'd prefer something that works with existing QEMU.
This is not too difficult to achieve with virtlogd really. Instead of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX socket listening to the virtlogd daemon.
This is portable with existing QEMU versions, but the obvious downside with this is extra copies in the interactive console path. So might it be worth exploring the possibility of a chardev multiplexor in QEMU? We would still pass in a pre-opened socketpair to QEMU for the logging side of things, but would leave the TCP/UNIX socket listening up to QEMU still.
eg should we make something like this work:
-add-fd set=2,fd=33 -chardev pipe,id=charserial0file,path=/dev/fdset/2 -chardev socket,id=charserial0tcp,host=127.0.0.1,port=9999,telnet,server,nowait -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1
wouldn't muxB be charserial0tcp (not charserial1)?
-serial isa-serial,chardev=charserial0,id=serial0
But the idea of a multiplex protocol that has multiple data sinks (guest output copied to all sinks) and a single source (at most one source can provide input to the guest) makes sense on the surface.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Dec 08, 2015 at 10:04:55AM -0700, Eric Blake wrote:
On 12/08/2015 07:59 AM, Daniel P. Berrange wrote:
So for this my plan is to stop using the QEMU 'file' backend for char devs and instead pass across a pre-opened file descriptor, connected to virtlogd. There is no "officially documented" way to pass in a file descriptor to QEMU chardevs, but since QEMU uses qemu_open(), we can make use of the fdset feature to achieve this.
eg, consider fd 33 is the write end of a pipe file descriptor I can (in theory) do
-add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
Now in practice this doesn't work, because qmp_chardev_open_file() passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() call will fail when using the pipe FD passed in via fdsets.
Is it just the O_TRUNC that is failing? If so, there is a recent patch to add an 'append':true flag that switches O_TRUNC off in favor of O_APPEND: https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00762.html
Yes, it is the ftruncate() call in qemu_dup_flags, called from qemu_open that is failing.
Or is it that the pipe is one-way, but chardev insists on O_RDWR and fails because it is not two-way?
The chardev file: backend wants an O_WRONLY file - in fact it won't accept an O_RDWR file, so we must use a pipe with it.
After more investigation I found it *is* possible to use a socketpair and a pipe backend though...
-add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2
Yes, a socketpair is bi-directional, so it supports O_RDWR opening.
Yep.
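As a side illustration of the access-mode point being discussed (not code from this thread), the sketch below asks the kernel which access mode each kind of descriptor carries. The helper name access_mode is hypothetical; the results are what I would expect on Linux, where a socketpair end reports O_RDWR and each pipe end reports a single direction.

```python
import fcntl
import os
import socket


def access_mode(fd):
    """Name the access mode (O_RDONLY / O_WRONLY / O_RDWR) of a descriptor."""
    mode = fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE
    names = {os.O_RDONLY: "O_RDONLY", os.O_WRONLY: "O_WRONLY", os.O_RDWR: "O_RDWR"}
    return names[mode]


read_end, write_end = os.pipe()
sock_a, sock_b = socket.socketpair()

print("pipe read end :", access_mode(read_end))
print("pipe write end:", access_mode(write_end))
print("socketpair end:", access_mode(sock_a.fileno()))
```

This is why a write-only backend pairs naturally with the write end of a pipe, while a backend that opens its path read-write needs something bidirectional such as a socketpair.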
..because for reasons I don't understand, if QEMU can't open $PATH.in and $PATH.out, it'll fall back to just opening $PATH in read-write mode. AFAICT, this is pretty useless with pipes since they are unidirectional, but it works nicely with socketpairs, where virtlogd has one end of the socketpair and QEMU gets passed the other via fdset.
Is it something where we'd want to support two pipes, and open /dev/fdset/2 tied to char.in and /dev/fdset/3 tied to char.out, where uni-directional pipes are again okay?
In theory we could do, but it would need us to special case the code, as just taking '/dev/fdset/2' and appending '.in' obviously doesn't work. I don't think this really matters though - using a socketpair is just fine.
I can easily check this works for historical QEMU versions back to when fdset support was added to chardevs, but I'm wondering if the QEMU maintainers consider this usage acceptable over the long term, and if so, should we explicitly document it as supported?
It seems like a bi-directional socketpair as the single endpoint for a chardev is useful enough to support and document, but I'm not the maintainer to give final say-so.
If not, should we introduce a more explicit syntax for passing in a pre-opened FD for chardevs ? eg
-add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2
Difference to the line you tried above:
-add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
is 'fd' instead of 'file'. But if we're going to add a new protocol, do we even need to go through the "/dev/fdset/..." name, or can we just pass the fd number directly?
Or just make -chardev file,id=charserial0,path=/dev/fdset/2 actually work ?
I'd lean more to this case - the whole point of fdsets was that we don't have to add multiple fd protocols; that everyone that understood file syntax and uses qemu_open() magically gained fd support.
Yeah, that is a good point about not inventing multiple fd protocols.
From that POV I'd be happy enough if we documented & supported that 'pipe' can be used with a socketpair, and 'file' can be used with a pipe (once append=true support is added).
eg should we make something like this work:
-add-fd set=2,fd=33 -chardev pipe,id=charserial0file,path=/dev/fdset/2 -chardev socket,id=charserial0tcp,host=127.0.0.1,port=9999,telnet,server,nowait -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1
wouldn't muxB be charserial0tcp (not charserial1)?
Yes, silly typo.
-serial isa-serial,chardev=charserial0,id=serial0
But the idea of a multiplex protocol that has multiple data sinks (guest output copied to all sinks) and a single source (at most one source can provide input to the guest) makes sense on the surface.
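To make the "multiple sinks, single source" idea concrete, here is a small Python sketch of the data flow. It is purely illustrative and is not a proposal for how QEMU would implement a multiplex chardev: the function fan_out and the socketpairs standing in for the log and console backends are assumptions made for the example.

```python
import socket


def fan_out(data, sinks):
    """Copy one chunk of guest output to every sink."""
    for sink in sinks:
        sink.sendall(data)


# Stand-ins for the two backends: a log sink and an interactive console.
log_side, log_reader = socket.socketpair()
console_side, console_client = socket.socketpair()

# Guest output is copied to all sinks...
fan_out(b"boot message\n", [log_side, console_side])
log_copy = log_reader.recv(64)
console_copy = console_client.recv(64)

# ...while only the console is treated as a source of input for the guest.
console_client.sendall(b"root\n")
guest_input = console_side.recv(64)

print(log_copy, console_copy, guest_input)
```

The log sink is never read from on the multiplexer side, which matches the "at most one source can provide input to the guest" constraint described above.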
Regards,
Daniel

On 09/12/2015 12:19, Daniel P. Berrange wrote:
Now in practice this doesn't work, because qmp_chardev_open_file() passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() call will fail when using the pipe FD passed in via fdsets.
Is it just the O_TRUNC that is failing? If so, there is a recent patch to add an 'append':true flag that switches O_TRUNC off in favor of O_APPEND: https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00762.html
Yes, it is the ftruncate() call in qemu_dup_flags, called from qemu_open that is failing.
It will require a little auditing, but I think ignoring EINVAL from that ftruncate() is fine.
Paolo
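The behaviour being discussed can be reproduced outside QEMU. The sketch below is an illustration in Python rather than the QEMU C code: the helper truncate_if_supported is hypothetical, and it assumes a Linux host, where ftruncate() on a pipe fails with EINVAL because a pipe is not a regular file.

```python
import errno
import os
import tempfile


def truncate_if_supported(fd):
    """Truncate fd to zero length; return False if it cannot be truncated."""
    try:
        os.ftruncate(fd, 0)
        return True
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # Not a regular file (e.g. a pipe), so there is nothing to truncate.
            return False
        raise


read_end, write_end = os.pipe()
with tempfile.TemporaryFile() as regular:
    regular.write(b"old contents")
    regular.flush()
    results = (
        truncate_if_supported(regular.fileno()),
        truncate_if_supported(write_end),
    )
print(results)
```

Any other errno is re-raised, so only the "not truncatable" case is silently tolerated.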
participants (3): Daniel P. Berrange, Eric Blake, Paolo Bonzini