The 15/03/13, Eric Blake wrote:
On 03/15/2013 06:17 AM, Nicolas Sebrecht wrote:
<...>
> Here is where we are in the workflow (step C) for what we are
> talking about:
>
> Step 1: create external snapshot for all VM disks (includes VM state).
> Step 2: do the backups manually while the VM is still running (original
> disks and memory state).
During this step, the qcow2 files created in step 1 grow in proportion
to the amount of changes made in the guest; obviously, the faster you
can complete it, the smaller the deltas will be, and the faster your
later merge steps will be. Since later merge steps have to be done while
the guest is halted, it's worth keeping the deltas small.
More on this thought below...
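
As a minimal sketch of step 1, assuming the snapshot is taken with
'virsh snapshot-create-as' and its --memspec/--diskspec options: the
helper below only builds the command line, it does not run it. The
function name, the domain name, the snapshot name and the paths are
hypothetical.

```python
def external_snapshot_cmd(domain, name, mem_file, disks):
    """Build (not run) a virsh command for an external snapshot with VM state.

    disks maps a disk target such as "vda" to the new qcow2 overlay path.
    """
    cmd = ["virsh", "snapshot-create-as", domain, name,
           "--memspec", f"file={mem_file},snapshot=external"]
    for target, overlay in disks.items():
        # One --diskspec per disk; the original image becomes the backing file.
        cmd += ["--diskspec", f"{target},snapshot=external,file={overlay}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical domain name and paths, for illustration only.
    print(" ".join(external_snapshot_cmd(
        "dom", "backup1", "/path/to/vmstate",
        {"vda": "/path/to/snap1.qcow2"})))
```

Returning an argument list rather than a string lets the caller review
the command, or hand it to subprocess.run() once it looks right.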
> Step 3: save and halt the vm state once backups are finished.
Step 3: virsh save && virsh destroy.
> Step 4: merge the snapshots (qcow2 disk wrappers) back to their
> backing file.
This step is done with raw qemu-img commands at the moment, and takes
time proportional to the size of the qcow2 data.
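
A sketch of what step 4 could look like, assuming 'qemu-img commit' is
the raw command used and that each disk has a single qcow2 overlay. The
helper name and paths are hypothetical; the commands are built, not
executed.

```python
def offline_commit_cmds(overlays):
    """Return one 'qemu-img commit' argv per qcow2 overlay.

    Meant to be run only while the guest is stopped: each command writes
    the overlay's data back into its backing file.
    """
    return [["qemu-img", "commit", "-f", "qcow2", path] for path in overlays]


if __name__ == "__main__":
    # Hypothetical overlay path, for illustration only.
    for argv in offline_commit_cmds(["/path/to/snap1.qcow2"]):
        print(" ".join(argv))
```

This only touches the image files; pointing the guest back at the
original disk is a separate step not shown here.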
> Step 5: start the VM.
Step 5: virsh restore.
...As mentioned above, the time taken in step 2 can affect how big the
delta is, and therefore how long step 4 lasts (while the guest is
offline). If your original disk is huge, and copying it to your backup
takes a long time, it may pay to take an iterative approach:
start with raw image:
    raw.img
create the external snapshot at the point you care about:
    raw.img <- snap1.qcow2
transfer raw.img and vmstate file to backup storage, taking as long as
needed (gigabytes of data, so on the order of minutes, during which the
qcow2 files can build up to megabytes in size):
    raw.img <- snap1.qcow2
create another external snapshot, but this time with --disk-only
--no-metadata (we don't plan on reverting to this point in time):
    raw.img <- snap1.qcow2 <- snap2.qcow2
use 'virsh blockcommit dom vda --base /path/to/raw --top /path/to/snap1
--wait --verbose'; this takes time for megabytes of storage, but not
gigabytes, so it is faster than the time to copy raw.img, which means
snap2.qcow2 will hold less delta data than snap1.qcow2:
    raw.img <- snap2.qcow2
now stop the guest, commit snap2.qcow2 into raw.img, and restart the guest
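
The online part of the sequence above (second snapshot, then
blockcommit) can be sketched as follows. This is an illustration under
assumptions: the helper name is hypothetical, the snapshot is assumed to
be created with 'virsh snapshot-create-as' and a --diskspec for the new
overlay, and the commands are only built, not run.

```python
def iterative_online_cmds(domain, disk, raw, snap1, snap2):
    """Build the two commands issued while the guest keeps running."""
    # Disk-only snapshot without libvirt metadata: new guest writes go
    # to snap2, and snap1 stops growing.
    snapshot = ["virsh", "snapshot-create-as", domain,
                "--disk-only", "--no-metadata",
                "--diskspec", f"{disk},snapshot=external,file={snap2}"]
    # Merge snap1 down into the raw image while the guest is live.
    commit = ["virsh", "blockcommit", domain, disk,
              "--base", raw, "--top", snap1, "--wait", "--verbose"]
    return [snapshot, commit]


if __name__ == "__main__":
    # Same placeholder paths as in the text, plus a hypothetical snap2 path.
    for argv in iterative_online_cmds("dom", "vda", "/path/to/raw",
                                      "/path/to/snap1", "/path/to/snap2"):
        print(" ".join(argv))
```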
By iterating, you reduce the size of the file that has to be committed
while the guest is offline, and may be able to achieve a noticeable
reduction in guest downtime.
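
To put rough numbers on that reasoning, here is a back-of-the-envelope
model. It is only a sketch under simplifying assumptions that are not
from the thread: the guest writes at a constant rate, every write grows
the overlay, and an online blockcommit runs at the same throughput as an
offline commit. All parameter names and figures are hypothetical.

```python
def offline_commit_seconds(raw_mib, copy_mib_s, write_mib_s, commit_mib_s):
    """Estimate the offline commit time without and with the extra iteration.

    Returns (single, iterated) in seconds.
    """
    copy_s = raw_mib / copy_mib_s        # time to back up raw.img
    snap1_mib = write_mib_s * copy_s     # delta accumulated in snap1.qcow2
    single = snap1_mib / commit_mib_s    # offline commit of snap1, no iteration
    # With the iteration, snap1 is committed online instead, which takes
    # as long as 'single'; meanwhile guest writes land in snap2.qcow2.
    snap2_mib = write_mib_s * single
    iterated = snap2_mib / commit_mib_s  # offline commit of snap2 only
    return single, iterated


if __name__ == "__main__":
    # 100 GiB image, 100 MiB/s backup copy, 1 MiB/s guest writes,
    # 64 MiB/s commit throughput (all made-up figures).
    print(offline_commit_seconds(102400, 100, 1, 64))
```

With these made-up figures the offline commit drops from 16 seconds to
0.25 seconds; in this model the gain is the ratio of commit throughput
to guest write rate.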
Looks like a very good optimization for recent enough libvirt. Thanks!
--
Nicolas Sebrecht