On 03/14/2013 06:29 AM, Nicolas Sebrecht wrote:
> The 13/03/13, Eric Blake wrote:
> > You might want to look into external snapshots as a more efficient way
> > of taking guest snapshots.
> I have guests with raw disks due to Windows performance issues. It would
> be very welcome to have minimal downtime, as some disks are quite large
> (terabytes) and the allowed downtime window is very short. Let's try
> external snapshots for guest "VM" while running:
Do be aware that an external snapshot means you are no longer using a
raw image - it forces you to use a qcow2 file that wraps your raw image.
With new enough libvirt and qemu, it is also possible to use 'virsh
blockcopy' instead of snapshots as a backup mechanism, and THAT works
with raw images without forcing your VM to use qcow2. But right now, it
only works with transient guests (getting it to work for persistent
guests requires a persistent bitmap feature that has been proposed for
qemu 1.5, along with more libvirt work to take advantage of persistent
bitmaps).
There's also a proposal on the qemu lists to add a block-backup job,
which I would then need to expose in libvirt; it has even nicer backup
semantics than blockcopy, and does not need a persistent bitmap.
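For the curious, the transient-guest dance looks roughly like this today
(a sketch only; the backup destination is hypothetical, and the disk
target is assumed to be vda):

# virsh dumpxml --inactive VM > /tmp/VM.xml  (keep the config around)
# virsh undefine VM                          (running guest becomes transient)
# virsh blockcopy VM vda /backup/VM-copy.img --wait --verbose --finish
# virsh define /tmp/VM.xml                   (make the guest persistent again)

Here, --wait blocks until the copy reaches the mirroring phase, and
--finish then ends the job, leaving /backup/VM-copy.img as a
point-in-time copy while the guest keeps running on the original image.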
> # cd virtuals/images
> # virsh
> virsh> snapshot-create-as VM snap1 "snap1 VM" --memspec
> file=VM.save,snapshot=external --diskspec vda,snapshot=external,file=VM-snap1.img
> Domain snapshot snap1 created
> virsh> exit
> # ls VM-snap1.img
> ls: cannot access VM-snap1.img: No such file or directory
Specify an absolute path, not a relative one.
> #
> Ouch!
> <investigating...>
> # ls /VM-snap1.img
> /VM-snap1.img
> # ls /VM.save
> /VM.save
> #
> Surprising! I would have expected the files to be stored in
> virtuals/images. That is not the point for now, though; let's continue.
Actually, it would probably be better if libvirtd errored out on
relative path names (relative to what? libvirtd runs in '/', and has no
idea what directory virsh was running in), and therefore virsh should be
nice and convert names to absolute before handing them to libvirtd.
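In the meantime, spelling the paths out absolutely gets you what you
wanted; assuming your images really live under /virtuals/images (a guess
on my part), the creation would look like:

virsh> snapshot-create-as VM snap1 "snap1 VM" --memspec
file=/virtuals/images/VM.save,snapshot=external --diskspec
vda,snapshot=external,file=/virtuals/images/VM-snap1.img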
> # virsh snapshot-list VM
>  Name                 Creation Time             State
> ------------------------------------------------------------
>  snap1                2013-03-14 12:20:01 +0100 running
> #
> USE CASE 1: restoring from backing file
> =======================================
> # virsh shutdown VM
> I can't find a snapshot-* command doing what I want (snapshot-revert
> expects to revert to a snapshot), so I'm trying restore.
Correct - we still don't have 'snapshot-revert' wired up in libvirt to
revert to an external snapshot - we have ideas on what needs to happen,
but it will take time to get that code into the code base. So for now,
you have to do that manually.
> # virsh restore /VM.save
> Domain restored from /VM.save
Hmm. This restored the memory state from the point at which the
snapshot was taken, but unless you were careful to check that the saved
state referred to the base file name and not the just-created qcow2
wrapper from when you took the snapshot, then your disks might be in a
state inconsistent with the memory you are loading. Not good. Also,
restoring from the base image means that you are invalidating the
contents of the qcow2 file for everything that took place after the
snapshot was taken.
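Checking is easy enough, at least: dump the XML that libvirt embedded in
the save image, and look at the <source file=.../> element of each disk
before deciding whether the restore is safe:

# virsh save-image-dumpxml /VM.save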
> # LANG=C virsh snapshot-list VM
>  Name                 Creation Time             State
> ------------------------------------------------------------
>  snap1                2013-03-14 12:20:01 +0100 running
> #
> As we might expect, the snapshot is still there.
> # virsh snapshot-delete VM snap1
> error: Failed to delete snapshot snap1
> error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
Yeah, again a known limitation. Once you change state behind libvirt's
back (because libvirt doesn't yet have snapshot-revert wired up to do
things properly), you generally have to 'virsh snapshot-delete
--metadata VM snap1' to tell libvirt to forget the snapshot existed, but
without trying to delete any files, since you did the file deletion
manually.
> #
> A bit annoying. Now, it seems that I have to manually delete the garbage.
> Actually, I tried, and I had to delete /VM.save, /VM-snap1.img, and
> /var/lib/libvirt/qemu/snapshot/VM/snap1.xml, then restart libvirtd (there
> is no snapshot-refresh).
You have to delete /VM.save and /VM-snap1.img yourself, but you should
have used 'virsh snapshot-delete --metadata' instead of mucking around
in /var/lib (that directory should not be managed manually).
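In other words, the cleanup for this use case boils down to:

# virsh snapshot-delete --metadata VM snap1
Domain snapshot snap1 deleted
# rm /VM.save /VM-snap1.img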
> USE CASE 2: the files are saved in another place, let's merge back the changes
> ==============================================================================
> The idea is to merge VM-snap1.img back into VM.raw with minimal downtime.
> I can't find a command for that, so let's try manually.
Here, qemu is at fault. They have not yet given us a command to do that
with minimal downtime. They HAVE given us 'virsh blockcommit', but it
is currently limited to reducing chains of length 3 or longer to chains
of at least 2. It IS possible to merge back into a single file while
the guest remains live, by using 'virsh blockpull', but that single file
will end up being qcow2; and it takes time proportional to the size
of the entire disk, rather than to the size of the changes since the
snapshot was taken. Again, here's hoping that qemu 1.5 gives us live
commit support, for taking a chain of 2 down to a single raw image.
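For reference, the live-but-still-qcow2 flattening is a one-liner (again
assuming the disk target is vda):

# virsh blockpull VM vda --wait --verbose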
> # virsh managedsave VM
> # qemu-img commit /VM-snap1.img
> # rm /VM-snap1.img /VM.save
> # virsh start VM
> error: Failed to start domain VM
> error: cannot open file 'VM-snap1.img': No such file or directory
You went behind libvirt's back and removed /VM-snap1.img, but failed to
update the managedsave image to record the location of the new filename.
> # virsh edit VM
> <virsh edit VM to come back to vda -> VM.raw>
Yes, but you still have the managedsave image in the way.
> # virsh start VM
> error: Failed to start domain VM
> error: cannot open file 'VM-snap1.img': No such file or directory
Try 'virsh managedsave-remove VM' to get the broken managedsave image
out of the way. Or, if you are brave, and insist on rebooting from the
memory state at which the managedsave image was taken but are sure you
have tweaked the disks correctly to match the same point in time, then
you can use 'virsh save-image-edit /path/to/managedsave' (assuming you
know where to look under /var/lib to find where the managedsave file was
stored internally by libvirt). Since modifying libvirt's internal files
is not typically recommended, I will assume that if you can find the
right file to edit,
you are already brave enough to take on the consequences of going behind
libvirt's back. At any rate, editing the managed save image to point
back to the correct raw file name, followed by 'virsh start', will let
you resume with memory restored to the point of your managed save (and
hopefully you pointed the disks to the same point in time).
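If you are fine with a cold boot (losing the saved memory state), the
straightforward recovery would be:

# virsh managedsave-remove VM
Removed managedsave image for domain VM
# virsh start VM
Domain VM started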
> #
> Looks like the complaint comes from the XML state header.
> # virsh save-image-edit /var/lib/libvirt/qemu/save/VM.save
> <virsh edit VM to come back to vda -> VM.raw>
> error: operation failed: new xml too large to fit in file
> #
Aha - so you ARE brave, and DID try to edit the managedsave file. I'm
surprised that you really hit a case where your edits pushed the XML
over a 4096-byte boundary. Can you come up with a way to (temporarily)
use shorter names, such as having /VM-snap1.img be a symlink to the real
file, just long enough for you to get the domain booted again? Also, I
hope that you did your experimentation on a throwaway VM, and not on a
production one, in case you did manage to fubar things to the point of
data corruption by mismatching disk state vs. memory state.
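Something along these lines, assuming the committed raw image now lives
at /virtuals/images/VM.raw (a path I am guessing at):

# ln -s /virtuals/images/VM.raw /VM-snap1.img   (stale name resolves again)
# virsh start VM                                (resumes from the managedsave)

Once the guest is up, fix the domain XML to name the real file, and then
drop the symlink.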
Yes, I know that reverting to snapshots is still very much a work in
progress in libvirt, and that you are not the first to ask these sorts
of questions (reading the list archives will show that this topic comes
up quite frequently).
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org