On 05/17/2012 09:14 AM, Eric Blake wrote:
> On 05/17/2012 07:42 AM, Stefan Hajnoczi wrote:
>>>> The -open-hook-fd approach allows QEMU to support file descriptor passing
>>>> without changing -drive. It also supports snapshot_blkdev and other commands
>>> By the way, how will it support them?
>>
>> The problem with snapshot_blkdev is that closing a file and opening a
>> new file cannot be done by the QEMU process when an SELinux policy is in
>> place to prevent opening files.
>
> snapshot_blkdev can take an fd:name instead of a /path/to/file for the
> file to open, in which case libvirt can pass in the named fd _prior_ to
> the snapshot_blkdev using the 'getfd' monitor command.
>>
>> The -open-hook-fd approach works even when the QEMU process is not
>> allowed to open files since file descriptor passing over a UNIX domain
>> socket is used to open files on behalf of QEMU.
>
> The -open-hook-fd approach would indeed allow snapshot_blkdev to ask
> for the fd after the fact, but it's much more painful. Consider a case
> with a two-disk snapshot.
>
> With the fd:name approach, the sequence is:
>
> libvirt calls getfd:name1 over normal monitor
> qemu responds
> libvirt calls getfd:name2 over normal monitor
> qemu responds
> libvirt calls transaction around blockdev-snapshot-sync over normal
> monitor, using fd:name1 and fd:name2
> qemu responds
>
> But with -open-hook-fd, the approach would be:
>
> libvirt calls transaction
> qemu calls open(file1) over hook
> libvirt responds
> qemu calls open(file2) over hook
> libvirt responds
> qemu responds to the original transaction
>
> The 'transaction' operation is thus blocked by the time it takes to do
> two intermediate opens over a second channel, which kind of defeats the
> purpose of making the transaction take effect with minimal guest
> downtime.
How are you defining "guest down time"?
It's important to note that code running in QEMU does not equate to
guest-visible down time unless QEMU does an explicit vm_stop(), which is not
happening here.
Instead, a VCPU may become blocked *if* it attempts to acquire qemu_mutex while
QEMU is holding it.
If your concern is qemu_mutex being held while waiting for libvirt, it would be
fairly easy to implement a qemu_open_async() that allowed dropping back to the
main loop and then invoked a callback when the open completes.
It would be pretty trivial to convert qmp_transaction to use such a command.
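
To make that concrete, here is a minimal, self-contained sketch of the pattern
such a qemu_open_async() could follow. This is plain POSIX C, not QEMU code,
and every name in it is hypothetical: the blocking open() runs in a worker
thread, and the "main loop" only dispatches the completion callback, so nothing
like qemu_mutex needs to be held across the open itself.

/* Hypothetical qemu_open_async() pattern: blocking open() in a worker
 * thread, callback dispatched from the main loop.  Illustration only,
 * not QEMU code.  Build with: cc -pthread sketch.c */
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

typedef void (*OpenCallback)(int fd, void *opaque);

struct open_req {
    const char *path;
    int flags;
    OpenCallback cb;
    void *opaque;
    int notify_fd;          /* write end of a pipe back to the main loop */
};

static void *open_worker(void *arg)
{
    struct open_req *req = arg;
    int fd = open(req->path, req->flags);  /* may block; main loop keeps running */
    write(req->notify_fd, &fd, sizeof(fd));  /* hand the result back */
    return NULL;
}

/* Returns immediately; req->cb runs later from the main loop. */
static int qemu_open_async(struct open_req *req, int pipefd[2])
{
    pthread_t tid;
    if (pipe(pipefd) < 0)
        return -1;
    req->notify_fd = pipefd[1];
    return pthread_create(&tid, NULL, open_worker, req);
}

static void on_open_done(int fd, void *opaque)
{
    printf("open completed, fd=%d (%s)\n", fd, (const char *)opaque);
    if (fd >= 0)
        close(fd);
}

int main(void)
{
    int pipefd[2];
    struct open_req req = { "/etc/hostname", O_RDONLY, on_open_done,
                            "illustrative target", -1 };
    if (qemu_open_async(&req, pipefd) != 0)
        return 1;
    /* Stand-in for the QEMU main loop: poll until the open completes,
     * then dispatch the callback.  No lock is held across the open. */
    struct pollfd p = { .fd = pipefd[0], .events = POLLIN };
    while (poll(&p, 1, -1) <= 0)
        ;
    int fd;
    read(pipefd[0], &fd, sizeof(fd));
    req.cb(fd, req.opaque);
    return 0;
}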
But this is all speculative. There's no reason to believe that an RPC would
have a noticeable guest-visible latency unless you assume there's a lot of lock
contention.
I would strongly suspect that the bdrv_flush() is going to be a much greater
source of lock contention than the RPC would be. An RPC is only bound by
scheduler latency whereas synchronous disk I/O is bound by the speed of a
spinning platter.
> And libvirt code becomes a lot trickier, needing to deal with the fact
> that two channels are in use, and that the channel that issued the
> 'transaction' command must block while the other channel for handling
> hooks must be responsive.
All libvirt needs to do is listen on a socket and delegate access according to
a whitelist. Whatever is providing fds needs no knowledge of anything other
than what the guest is allowed to access, which shouldn't depend on the
currently executing command.
Regards,
Anthony Liguori
> I'm really disliking the hook-fd approach, when a better solution is to
> make use of 'getfd' in advance of any operation that will need to open
> new fds.