On Tue, Nov 09, 2010 at 03:17:23PM -0600, Adam Litke wrote:
I've been working with Anthony Liguori and Stefan Hajnoczi to enable data
streaming to copy-on-read disk images in qemu. This work is making its way
through peer review and I expect it to be upstream soon as part of the support
for the new QED disk image format.
I would like to enable these commands in libvirt in order to support at least
two compelling use cases:
1) Rapid deployment of domains:
Creating a new domain from a central repository of images can be time-consuming
since a local copy of the image must be made before the domain can be started.
With copy-on-read and streaming, up-front copy time is eliminated and the
domain can be started immediately. Streaming can run while the domain runs
to fully populate the disk image.
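As a rough illustration of the copy-on-read behaviour described above, here is a toy Python model. It is not QEMU or QED code; the class and method names are invented for this sketch. A read of a sector that is not yet local is fetched from the backing image and kept, and a background stream fills in whatever is still missing while the guest runs.

```python
class CopyOnReadImage:
    """Toy model of a copy-on-read image (hypothetical, not the QED format)."""

    def __init__(self, backing):
        self.backing = backing   # sector payloads held in the central repository
        self.local = {}          # sector index -> payload already copied locally

    def read(self, sector):
        # Guest read: fetch from the backing image on first access, then serve locally.
        if sector not in self.local:
            self.local[sector] = self.backing[sector]
        return self.local[sector]

    def stream_next(self):
        # Background streaming: copy the lowest-numbered sector that is still missing.
        for sector in range(len(self.backing)):
            if sector not in self.local:
                self.local[sector] = self.backing[sector]
                return sector
        return None              # nothing left to copy

    def fully_populated(self):
        return len(self.local) == len(self.backing)


image = CopyOnReadImage(backing=[b"a", b"b", b"c", b"d"])
first_read = image.read(2)               # the guest reads at once, with no up-front copy
while image.stream_next() is not None:   # streaming continues until nothing is missing
    pass
```

Once the loop finishes, every sector is local and the backing image is no longer needed, which is the property both use cases rely on.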
2) Post-copy live block migration:
A qemu-nbd server is started on the source host and serves the domain's block
device to the destination host. A QED image is created on the destination host
with backing to the nbd server. The domain is migrated as normal. When
migration completes, a stream command is executed to fully populate the
destination QED image. After streaming completes, the qemu-nbd server can
be shut down and the domain (including local storage) is fully independent of
the source host.
Qemu will support two streaming modes: full device and single sector. Full
device streaming is the easiest to use because one command will cause the whole
device to be streamed as fast as possible. Single sector mode can be used if
one wants to throttle streaming to reduce I/O pressure. In this mode, the user
issues individual commands to stream single sectors.
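A minimal sketch of how a caller might throttle single sector mode, assuming a 512-byte sector and a hypothetical `stream_sector` callable that stands in for the single sector command. Neither is specified in the proposal. The caller paces the commands itself by sleeping between them.

```python
import time

SECTOR_SIZE = 512  # assumption: bytes per sector


def stream_throttled(stream_sector, total_sectors, max_bytes_per_sec, sleep=time.sleep):
    """Issue one single-sector command at a time, pausing after each one.

    stream_sector(n) is a hypothetical stand-in for the single sector
    stream command; it is not a real qemu or libvirt call.
    """
    delay = SECTOR_SIZE / max_bytes_per_sec   # seconds to wait per sector
    for sector in range(total_sectors):
        stream_sector(sector)
        sleep(delay)
    return delay * total_sectors              # total time spent pausing


# Dry run: record the commands and pauses instead of doing real I/O or sleeping.
streamed = []
pauses = []
waited = stream_throttled(streamed.append, total_sectors=4,
                          max_bytes_per_sec=1024, sleep=pauses.append)
```

Passing `sleep` as a parameter keeps the pacing policy visible and lets the loop be exercised without actually waiting.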
To enable this support in libvirt, I propose the following API...
virDomainStreamDisk() initiates either a full device stream or a single sector
stream (depending on virDomainStreamDiskFlags). For a full device stream, it
returns either 0 or -1. For a single sector stream, it returns an offset that
can be used to continue streaming with a subsequent call to virDomainStreamDisk().
virDomainStreamDiskInfo() returns the status of a currently-running full device
stream (the device name, current streaming position, and total size).
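To make the return-value contract above concrete, here is a small Python mock. It is only a sketch: the flag values, the method names, the `advance()` helper, and the behaviour at the end of the device are assumptions, since the proposal does not spell them out.

```python
STREAM_FULL_DEVICE = 0   # hypothetical flag values, not defined in the proposal
STREAM_ONE_SECTOR = 1


class MockDomain:
    """Stand-in for a domain with one disk; models only the return values."""

    def __init__(self, device, total_sectors):
        self.device = device
        self.total = total_sectors
        self.position = 0

    def stream_disk(self, offset, flags):
        if flags == STREAM_FULL_DEVICE:
            return 0                      # started; progress is read via stream_disk_info()
        if flags == STREAM_ONE_SECTOR and 0 <= offset < self.total:
            self.position = max(self.position, offset + 1)
            return offset + 1             # offset to pass to the next call
        return -1                         # error (assumed for out-of-range offsets)

    def advance(self, sectors):
        # Simulates background progress of a full device stream.
        self.position = min(self.total, self.position + sectors)

    def stream_disk_info(self):
        # Device name, current streaming position, total size.
        return (self.device, self.position, self.total)


dom = MockDomain("vda", total_sectors=3)
offset = 0
while 0 <= offset < dom.total:            # single sector mode: the caller drives the loop
    offset = dom.stream_disk(offset, STREAM_ONE_SECTOR)
info = dom.stream_disk_info()
```

In this mock a full device stream returns immediately and the caller polls `stream_disk_info()`, whereas single sector mode makes progress only when the caller feeds the returned offset back in.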
Comments on this design would be greatly appreciated. Thanks!
I'm finding it hard to say whether these APIs are suitable or not
because I can't see what this actually maps to in terms of
implementation.
Do these calls need to be run before the QEMU process is started,
or after QEMU is already running ?
Does the path in the arg actually need to exist on disk before
streaming begins, or do these APIs create the image too ?
If we're streaming the whole disk, is there a way to cancel/abort
it early ?
What happens if qemu-nbd dies before streaming is complete ?
Who/what starts the qemu-nbd process ?
If you have a guest on host A and want to migrate to host B, we presumably
need to start qemu-nbd on host A, while the guest is still running on
host A. eg we end up with 2 processes having the same disk image open on
host A for a while.
How we'd wire qemu-nbd up into the security driver framework is of
particular concern here, because I'd think we'd want qemu-nbd to run
with the same privileges as the QEMU process, so that it's isolated from
all other QEMU processes on the host and can only access the one set of
disks for that VM.
Is there any restriction on what can be done while streaming is taking
place ? eg if I'm doing a whole disk stream, can I migrate the QEMU
guest to another host before streaming completes ?
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|