On 01/31/2012 01:53 PM, Adam Litke wrote:
On Tue, Jan 31, 2012 at 09:28:51AM -0700, Eric Blake wrote:
> Right now, the existing virDomainBlockPull API has a tough limitation -
> it is an all-or-none approach. In all my examples below, I'm starting
> from the following relationship, where '<-' means 'is a backing file
of':
>
> template <- intermediate <- current
>
> Meanwhile, qemu is adding support for a partial block pull
operation,
> still on the current image as the merge destination, but where you can
> now specify an optional argument to limit the pull to just the
> intermediate files and altering the current image to be backed by an
> ancestor file, as in:
>
> merge intermediate into current, creating:
> template <- current
>
> For 0.9.10, I'd like to add the following API:
>
> /**
> * virDomainBlockRebase:
> * @dom: pointer to domain object
> * @disk: path to the block device, or device shorthand
> * @base: new base image, or NULL for entire block pull
> * @bandwidth: (optional) specify copy bandwidth limit in Mbps
> * @flags: extra flags; not used yet, so callers should always pass 0
What is the format of the @base arg? My first thought would be a path, but what
if the desired image file is not directly known to libvirt?
Libvirt already has to know the absolute paths of all disks in the
backing file chain, in order to properly SELinux label them prior to
invoking qemu. So I'm envisioning the absolute path of the backing file
in the chain that will be preserved. That is, with:
touch 10M /path/to/template
qemu-img create -f qcow2 \
-o backing_file /path/to/template /path/to/intermediate
qemu-img create -f qcow2 \
-o backing_file /path/to/intermediate /path/to/current
followed by virDomainBlockRebase(dom, "vda", "/path/to/template", 0)
would result in /path/to/current referring to /path/to/template as its
primary backing file.
I also had an idea down below where, with the addition of a new flags
value, base could refer to a well-formed XML block rather than a single
file name, such that we could then populate that XML block with more
complex instructions; but I'm not proposing doing that extension in the
0.9.10 timeframe, so much as trying to argue that this API is extensible
and we won't need yet a third API for block pull if qemu ever allows
more complex merging scenarios.
> Given that Adam has a pending patch to support a
> VIR_DOMAIN_BLOCK_PULL_ASYNC flag, this same flag would have to be
> supported in virDomainBlockRebase.
That patch only applies to virDomainBlockJobCancel(). The blockJob initiators
(virDomainBlockPull and this new one) already use an async mode of operation
because the call simply starts the block job.
Ah, right. I'm getting slightly confused with all the patches that
still need review :)
virDomainBlockPull has always been asynchronous, so no flag is needed
there or in this new API.
> and backward merge of a non-current image (that is, undoing an
earlier
> snapshot, but by modifying the template rather than the current image):
>
> merge intermediate into base, creating:
> template <- current
Don't these raise some security concerns about modifying a potentially shared
intermediate image?
Yes, the management app has to be careful to not remove backing files
that are in use in other chains. But we already have a lock manager
setup, so of course, part of the libvirt work is integrating this work
so that no image is ever reverted if the lock manager says a file is in
use; libvirt can also check all registered storage pools and prevent a
rebase if the storage pool tracks files that serve as base images to
more than one other images, and prevent modifying base images in those
cases (there's probably still a lot of work to be done in libvirt to
make it bulletproof, but that's okay, since again, my ideas about
reverse merges are post-0.9.10 and would also require more work from qemu).
At any rate, you responded favorably to the first half of my email (the
proposal for what to implement in 0.9.10), even if you got scared by my
musings about possible future extensions at later releases. I'll take
that as a good sign that I have a) come up with a good API worth adding
now, and b) divided things appropriately into what is reasonable to do
now vs. what is complex enough to be worth delaying until we have more
experience with the use cases and ramifications of adding the complexity.
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org