
On 15/03/13, Eric Blake wrote:
On 03/15/2013 06:17 AM, Nicolas Sebrecht wrote:
Here are the basic steps. This is still not that simple and there are tricky parts along the way.
Usual workflow (use case 2)
===========================
Step 1: create an external snapshot for all VM disks (includes VM state).
Step 2: do the backups manually while the VM is still running (original disks and memory state).
Step 3: save and halt the VM state once the backups are finished.
Step 4: merge the snapshots (qcow2 disk wrappers) back into their backing files.
Step 5: start the VM.
This involves guest downtime, which grows with how much state changed since the snapshot.
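For illustration, on a single-disk guest the steps would be roughly the following sketch (domain name, disk target and every path are placeholders, and re-pointing the domain at the merged image is one of the tricky parts I mentioned; the exact --memspec/--diskspec spelling may need adjusting to your libvirt version):

# step 1: external snapshot (disk + memory) taken while the guest runs
virsh snapshot-create-as dom backup --atomic \
    --memspec file=/var/lib/libvirt/save/dom.mem,snapshot=external \
    --diskspec vda,snapshot=external,file=/var/lib/libvirt/images/dom-backup.qcow2
# step 2: the original image is now a stable backing file; copy it and the memory file away
cp /var/lib/libvirt/images/dom.img /backups/
cp /var/lib/libvirt/save/dom.mem /backups/
# step 3: save and halt the guest once the copies are finished
virsh save dom /var/lib/libvirt/save/dom.state
# step 4: merge the qcow2 wrapper back into its backing file
qemu-img commit /var/lib/libvirt/images/dom-backup.qcow2
# step 5: re-point vda at the merged dom.img (one of the tricky parts,
#         e.g. with 'virsh save-image-edit'), drop the wrapper, then restart
virsh restore /var/lib/libvirt/save/dom.state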
Right.
Restarting from the backup (use case 1)
=======================================
Step A: shut down the running VM and move it out of the way.
Step B: restore the backing files and state file from the archives of step 2.
Step C: restore the VM. (Still not sure about that one, see below.)
I wish to provide a more detailed procedure in the future.
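Roughly, something like this sketch (all names and paths are placeholders; step C is the part I'm not sure about):

# step A: shut down the running VM and move its image out of the way
virsh shutdown dom
mv /var/lib/libvirt/images/dom.img /var/lib/libvirt/images/dom.img.broken
# step B: bring back the backing file (and memory state) archived at step 2
cp /backups/dom.img /var/lib/libvirt/images/dom.img
# step C: either boot straight from the restored disk...
virsh start dom
# ...or, if the memory file from step 1 can simply be fed to 'virsh restore'
# (that's what I still have to check), resume from it instead:
# virsh restore /backups/dom.mem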
With new enough libvirt and qemu, it is also possible to use 'virsh blockcopy' instead of snapshots as a backup mechanism, and THAT works with raw images without forcing your VM to use qcow2. But right now, it only works with transient guests (getting it to work for persistent guests requires a persistent bitmap feature that has been proposed for qemu 1.5, along with more libvirt work to take advantage of persistent bitmaps).
Fine. Sadly, my guests are not transient.
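(For the record, whether a given guest is persistent shows up in 'virsh dominfo':

virsh dominfo dom | grep Persistent
)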
Guests can be made temporarily transient. That is, the following sequence has absolutely minimal guest downtime, and can be done without any qcow2 files in the mix. For a guest with a single disk, there is ZERO downtime:
virsh dumpxml --security-info dom > dom.xml
virsh undefine dom
virsh blockcopy dom vda /path/to/backup --wait --verbose --finish
virsh define dom.xml
For a guest with multiple disks, the downtime can be sub-second, if you script things correctly (the downtime lasts for the duration between the suspend and resume, but the steps done in that time are all fast):
virsh dumpxml --security-info dom > dom.xml
virsh undefine dom
virsh blockcopy dom vda /path/to/backup-vda
virsh blockcopy dom vdb /path/to/backup-vdb
polling loop - check periodically until 'virsh blockjob dom vda' and
  'virsh blockjob dom vdb' both show 100% completion
virsh suspend dom
virsh blockjob dom vda --abort
virsh blockjob dom vdb --abort
virsh resume dom
virsh define dom.xml
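The polling loop can be as simple as this sketch (it just greps the progress line printed by 'virsh blockjob'; the exact output format may differ across libvirt versions):

# keep waiting until both block copy jobs report 100 % (mirroring phase reached)
while ! virsh blockjob dom vda | grep -q '100 %' || \
      ! virsh blockjob dom vdb | grep -q '100 %'; do
    sleep 5
done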
In other words, 'blockcopy' is my current preferred method of online guest backup, even though I'm still waiting for qemu improvements to make it even nicer.
As I understand the man page, blockcopy (without --shallow) creates a new disk file for a disk, merging all the current files if there is more than one. Unless --finish/--pivot is passed to blockcopy, or until --abort/--pivot/--async is passed to blockjob, the original disks (as they were before blockcopy started) and the new disk created by blockcopy are both mirrored. Only --pivot makes the guest switch to the new disk.

So with --finish or --abort, we get a backup of a running guest. Nice! Except, maybe, that the backup doesn't include the memory state. In order to include the memory state in the backup, I guess the pause/resume is inevitable:

virsh dumpxml --security-info dom > dom.xml
virsh undefine dom
virsh blockcopy dom vda /path/to/backup-vda
polling loop - check periodically until 'virsh blockjob dom vda' shows
  100% completion
virsh suspend dom
virsh save dom /path/to/memory-backup --running
virsh blockjob dom vda --abort
virsh resume dom
virsh define dom.xml

I'd say the man page is missing the information that these commands can be run against a running guest, even though the mirroring feature might imply it. I would also add a "sync" command just after the first command, as a safety measure to ensure the XML is kept on disk.

The main drawback I can see is that the hypervisor must have at least as much free disk space as the disks to back up... or have path/to/backups be a remote mount point.

Now I wonder: if I change my backup strategy and mount the remote host that stores the backups locally on the hypervisor (via NFS, iSCSI, sshfs, etc.), should I expect write performance degradation? I mean, does the running guest wait for the writes to both underlying mirrored disks (cache is set to none for the current disks)?

-- 
Nicolas Sebrecht