Re: [libvirt] [PATCH v5 3/4] blockcommit: track job type in xml

29 Jul 2014


      On 07/29/2014 07:03 AM, Peter Krempa wrote:
...
On 07/29/14 06:20, Eric Blake wrote:
...
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element.  This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>.  Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit).  It also prepares to update persistent XML
to match changes made to live XML when a copy completes.
...
...
@@ -1025,6 +1026,9 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
     if (disk) {
         /* Have to generate two variants of the event for old vs. new
          * client callbacks */
+        if (type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT &&
+            disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT)
+            type = disk->mirrorJob;
Hmm, looks like this would be better of in the next patch but one of the
tests probably wouldn't work without this.
Maybe, but then the next commit doesn't look so deceptively simple, and
I _did_ mention it in the commit message :)
...
...
if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_PIVOT) {
+                if (vm->newDef) {
+                    int indx = virDomainDiskIndexByName(vm->newDef, disk->dst,
+                                                        false);
...
...
+                    if (copy) {
+                        virStorageSourceFree(persistDisk->src);
+                        persistDisk->src = copy;
+                    }
+                }
Won't this allow us to enable block copy on persistent domains?
It's part of the effort.  The other half is actually being able to
restart a blockcopy operation across qemu restarts (not available in
qemu 2.1, it requires a persistent bitmap; but might be available in
qemu 2.2).  The reason that active commit can be done on a persistent
domain is that you don't lose progress - if you spend 4 minutes getting
80% of the way, then the guest shuts down, then you restart qemu and
restart the commit operation, then you only need 1 more minute for the
remaining 20%.  That is, aborting a commit job is not expensive.  On the
other hand, aborting a copy job at 80% progress, and then starting a new
one, will end up copying all 100% of the file (5 more minutes, for a
total of 9 instead of 5), because without a persistent bitmap, qemu
loses track of how much of the copy has already been made.  So for now,
I'm still going to require transient domain for blockcopy.

That, and I _really_ need a new API that lets you target a <disk> XML
element as the copy destination (I ran out of time for 1.2.7, but 1.2.8
is looking good).  Plus we need a way to expose qemu's drive-backup
operation; the existing blockcopy code is crammed into
virDomainBlockRebase where we don't have good exposure for pre- vs.
post- snapshot; but a new virDomainBlockCopy() with an XML destination
and proper flags would be better for it.  For those less familiar with
the difference between the two qemu commands, here's a timeline:


 A                       B           C
+----------------------------------------------------------
 ^                       ^           ^
 request drive-mirror    |           |
                         sync ready
                                     request complete/pivot

the copy taken by drive-mirror is consistent at point C, which may be
several minutes after the user actually requested the copy.  So whatever
the guest saw at point A may be long gone by the time the copy is
actually finalized; planning for user point-in-time snapshots requires
forethought to actually get a mirror up and running long before the user
has a need for the snapshot.  However, with that planning, the time
between declaring the snapshot complete and actually having the snapshot
to work with is nearly instant.

 A                       B
+------------------------------------------
 ^                       ^
 request drive-backup    |
                         backup complete

the copy taken by drive-backup is not consistent until point B, but its
contents match what the guest saw at point A.  This is much nicer for
point-in-time semantics, although it does mean that the user has to wait
from the time they requested a backup until it is actually ready for
them to use.
...
ACK, although this patch an the next one are a bit borderline now that
libvirt is frozen for the next release.
I'll wait for a third-party opinion, then (maybe DV or danpb will weigh
in before DV tries to cut rc2 in the next 24 hours...)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org