[libvirt] RFC API proposal: virDomainBlockRebase

31 Jan 2012

Right now, the existing virDomainBlockPull API has a tough limitation -
it is an all-or-none approach.  In all my examples below, I'm starting
from the following relationship, where '<-' means 'is a backing file of':

template <- intermediate <- current

virDomainBlockPull can only convert things in a forward direction, with
the merge destination being the current image, resulting in:

merge template and intermediate into current, creating:
current

Meanwhile, qemu is adding support for a partial block pull operation,
still on the current image as the merge destination, but where you can
now specify an optional argument to limit the pull to just the
intermediate files and altering the current image to be backed by an
ancestor file, as in:

merge intermediate into current, creating:
template <- current
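
The difference between the full pull and the partial pull can be sketched with a toy model of the backing chain. This is only an illustration of the resulting relationships: struct chain and chain_pull() are hypothetical names, not libvirt or qemu interfaces, and no image data is actually copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHAIN 8

/* Toy model of a backing chain: images[0] is the oldest backing file,
 * images[len - 1] is the current (active) image. */
struct chain {
    const char *images[MAX_CHAIN];
    size_t len;
};

/* Model a pull into the current image.  With base == NULL every backing
 * file is merged (full pull).  Otherwise only the files between base and
 * the current image are merged, and base becomes the direct backing file.
 * Returns 0 on success, -1 if base is not a backing file in the chain. */
static int chain_pull(struct chain *c, const char *base)
{
    assert(c != NULL && c->len >= 1 && c->len <= MAX_CHAIN);
    const char *current = c->images[c->len - 1];

    if (base == NULL) {
        c->images[0] = current;
        c->len = 1;
        return 0;
    }
    /* Search only the backing files, not the current image itself. */
    for (size_t i = 0; i + 1 < c->len; i++) {
        if (strcmp(c->images[i], base) == 0) {
            c->images[i + 1] = current;
            c->len = i + 2;
            return 0;
        }
    }
    return -1;
}
```

Starting from template <- intermediate <- current, chain_pull(&c, "template") leaves template <- current, while chain_pull(&c, NULL) leaves only current.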

For 0.9.10, I'd like to add the following API:

/**
 * virDomainBlockRebase:
 * @dom: pointer to domain object
 * @disk: path to the block device, or device shorthand
 * @base: new base image, or NULL for entire block pull
 * @bandwidth: (optional) specify copy bandwidth limit in Mbps
 * @flags: extra flags; not used yet, so callers should always pass 0
 *
 * Populate a disk image with data from its backing image chain, and
 * set the new backing image to @base, where @base is the absolute
 * path of one of the backing images in the chain.  If @base is NULL,
 * then this operation is identical to virDomainBlockPull().  Once all
 * data from its backing image chain has been pulled, the disk no
 * longer depends on those intermediate backing images.  This function
 * pulls data for the entire device in the background.  Progress of the
 * operation can be checked with virDomainGetBlockJobInfo() and
 * the operation can be aborted with virDomainBlockJobAbort().  When
 * finished, an asynchronous event is raised to indicate the final
 * status.
 *
 * The @disk, @bandwidth, and @flags parameters are handled as in
 * virDomainBlockPull().
 *
 * Returns 0 if the operation has started, -1 on failure.
 */
int virDomainBlockRebase(virDomainPtr dom, const char *disk,
                         const char *base,
                         unsigned long bandwidth, unsigned int flags);

Given that Adam has a pending patch to support a
VIR_DOMAIN_BLOCK_PULL_ASYNC flag, this same flag would have to be
supported in virDomainBlockRebase.

I've also been chatting with Federico Simoncelli about how the above
operation would work for VDSM purposes in doing a live block move, while
preserving a common template base file:

start with:
vda: template <- current1

create a disk-only snapshot, with:
 tmpsnap = virDomainSnapshotCreateXML(dom,
 "<domainsnapshot>\n"
 "  <disks>\n"
 "    <disk name='vda'>\n"
 "      <source file='/path/to/current2'/>\n"
 "    </disk>\n"
 "  </disks>\n"
 "</domainsnapshot>", VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
where the xml calls out the destination file name, resulting in:
vda: template <- current1 <- current2

perform the block rebase, with:
 virDomainBlockRebase(dom, "vda", "/path/to/template", 0,
 VIR_DOMAIN_BLOCK_PULL_ASYNC)
then wait for the completion event (or poll the job status),
resulting in:
vda: template <- current2

delete the disk-only snapshot metadata as no longer useful, with:
 virDomainSnapshotDelete(tmpsnap,
 VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY)

At one point, I thought of creating a single libvirt API that performs
all of those steps in one call; but right now, I'm not proposing that,
because qemu has no way to undo a snapshot.  In other
words, without an undo operation, if the snapshot phase succeeds but the
block rebase phase fails, a single API would have to report failure even
though the domain was altered, while the ideal scenario is that
reporting failure means things were in the same state as before the API
started.
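
The failure mode can be shown with a small sketch. Everything here is a hypothetical stand-in (toy_domain, toy_snapshot, toy_rebase, toy_live_move); it only tracks the depth of the backing chain and assumes, as described above, that there is no way to undo the snapshot phase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy domain state: number of images in vda's chain,
 * e.g. 2 for "template <- current1". */
struct toy_domain {
    size_t chain_depth;
};

/* Snapshot phase: adds a new current image on top of the chain. */
static int toy_snapshot(struct toy_domain *d)
{
    d->chain_depth++;
    return 0;
}

/* Rebase phase: merges one intermediate image into the current image.
 * inject_failure simulates the block job failing before any merge. */
static int toy_rebase(struct toy_domain *d, bool inject_failure)
{
    if (inject_failure)
        return -1;
    d->chain_depth--;
    return 0;
}

/* The combined call.  There is deliberately no toy_snapshot_undo(),
 * so a failure in the second phase leaves the first phase applied. */
static int toy_live_move(struct toy_domain *d, bool inject_failure)
{
    assert(d != NULL);
    if (toy_snapshot(d) < 0)
        return -1;
    if (toy_rebase(d, inject_failure) < 0)
        return -1;   /* failure is reported, yet the snapshot remains */
    return 0;
}
```

When the rebase phase fails, toy_live_move() returns -1 but the chain is one image deeper than before the call, which is exactly the "reported failure, but state changed" outcome that argues against a single API for now.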

Beyond 0.9.10, there are some additional useful merge patterns that
might be worth exposing.  All of these operations are already possible
on offline images, using qemu-img; but none of them are possible on live
images using current qemu, which is why I'm thinking it is something for
another day.  I'm also hoping to someday enhance the set of
virStorageVol APIs to make backing file manipulation of offline images
easier.  At any rate, the additional merge operations are:

forward live merge with a non-current image as the merge destination, as in:

merge template into intermediate, creating:
intermediate <- current

backward merge of a current image (that is, undoing a current snapshot):

merge current into intermediate, creating:
template <- intermediate

and backward merge of a non-current image (that is, undoing an earlier
snapshot, but by modifying the template rather than the current image):

merge intermediate into template, creating:
template <- current
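
All three patterns remove one image from the chain; what differs is which neighbor receives its contents. A minimal sketch of that bookkeeping follows, using hypothetical names (struct chain, enum merge_dir, chain_merge) that are not part of any existing API and tracking only file names, not data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHAIN 8

/* images[0] is the oldest backing file, images[len - 1] the current image. */
struct chain {
    const char *images[MAX_CHAIN];
    size_t len;
};

enum merge_dir {
    MERGE_FORWARD,   /* source is merged into the image it backs */
    MERGE_BACKWARD   /* source is merged into its own backing file */
};

/* Merge images[src] into its neighbor and drop it from the chain.
 * Returns the name of the destination image, or NULL if the request
 * makes no sense for this chain. */
static const char *chain_merge(struct chain *c, size_t src, enum merge_dir dir)
{
    assert(c != NULL && c->len <= MAX_CHAIN);
    if (src >= c->len)
        return NULL;
    if (dir == MERGE_FORWARD && src + 1 >= c->len)
        return NULL;   /* the current image backs nothing */
    if (dir == MERGE_BACKWARD && src == 0)
        return NULL;   /* the oldest image has no backing file */

    const char *dest = c->images[dir == MERGE_FORWARD ? src + 1 : src - 1];

    /* Remove the source; every later image shifts down by one slot. */
    memmove(&c->images[src], &c->images[src + 1],
            (c->len - src - 1) * sizeof(c->images[0]));
    c->len--;
    return dest;
}
```

With the chain template <- intermediate <- current, chain_merge(&c, 0, MERGE_FORWARD) models the first pattern, chain_merge(&c, 2, MERGE_BACKWARD) the second, and chain_merge(&c, 1, MERGE_BACKWARD) the third.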

Backward merge of the current image seems like something easy to fit
into my proposed API (add a new flag, maybe called
VIR_DOMAIN_BLOCK_REBASE_BACKWARD).  Manipulations of anything that does
not involve the current image seem tougher, assuming qemu ever even
reaches the point where it exposes those operations on live volumes -
the user has to specify not one, but two backing file names.  But even
that could possibly be fit into my API, by adding a flag that states
that the const char *base argument is treated as an XML snippet
describing the full details of the merge, with the XML listing which
image is being merged to which destination, rather than as just the name
of the backing file becoming the new base of the current image.  Perhaps
something like:

virDomainBlockRebase(dom, block,
  "<rebase>\n"
  "  <source>/path/to/intermediate</source>\n"
  "  <dest>/path/to/template</dest>\n"
  "</rebase>", 0,
  VIR_DOMAIN_BLOCK_REBASE_XML|VIR_DOMAIN_BLOCK_REBASE_BACKWARD)

as a specification to take the contents of intermediate, merge those
backwards into template, while also adjusting the rest of the
backing file chain so that whatever used to be backed by intermediate is
now backed by template.  Or, if qemu ever gives us the ability to merge
non-current images, we may decide at that time that it is worth a new
API to expose those new complexities.

Another thing I have been thinking about is virDomainSnapshotDelete.
The above conversation talks about merging of a single disk, but a live
disk snapshot operation can create backing file chains for multiple
disks at once, all tracked by a snapshot.  Additionally, the current
code allows a snapshot delete of internal snapshots, but refuses to do
anything useful with an external snapshot, because there is currently no
way to specify if the snapshot is removed by merging the base into the
new current, or by undoing the current and merging it backwards into the
base.  Alas, virDomainSnapshotDelete doesn't take any arguments for how
to handle the situation, and use of a flag to make the decision would
limit all disks to be handled in the same manner.  So what I'm thinking
is that when a snapshot is created (or redefined, using redefinition as
the vehicle to add in the new XML), the snapshot XML itself can
record the preferred direction for undoing the snapshot; for example:

<domainsnapshot>
  <disks>
    <disk name='/path/to/old_vda'>
      <source file='/path/to/new_vda'/>
      <on_delete merge='forward'/>
    </disk>
    <disk name='/path/to/old_vdb'>
      <source file='/path/to/new_vdb'/>
      <on_delete merge='backward'/>
    </disk>
  </disks>
</domainsnapshot>

then when virDomainSnapshotDelete is called on that snapshot, old_vda
would be forward merged into new_vda, while new_vdb would be backward
merged into old_vdb. Again, that's food for thought for post-0.9.10, and
shouldn't get in the way of adding virDomainBlockRebase() now.
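
A rough sketch of how a per-disk direction could drive the delete follows. The on_delete element is only the idea floated above, and struct snap_disk, enum on_delete, and surviving_file() are hypothetical names used for illustration, not existing libvirt code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of <on_delete merge='...'/>. */
enum on_delete { ON_DELETE_FORWARD, ON_DELETE_BACKWARD };

struct snap_disk {
    const char *old_file;   /* disk name: image in use before the snapshot */
    const char *new_file;   /* <source file='...'/>: image the snapshot created */
    enum on_delete merge;   /* preferred direction recorded in the XML */
};

/* Returns the file the disk keeps using once the snapshot is deleted:
 * a forward merge folds old into new, a backward merge folds new into old. */
static const char *surviving_file(const struct snap_disk *d)
{
    assert(d != NULL);
    return d->merge == ON_DELETE_FORWARD ? d->new_file : d->old_file;
}
```

For the XML above, vda would end up on /path/to/new_vda and vdb on /path/to/old_vdb, each disk following its own recorded direction rather than a single flag applied to the whole snapshot.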

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
