[libvirt-users] Managing Live Snapshots with Libvirt 1.0.1

Hello,

I recently compiled libvirt 1.0.1 and qemu 1.3.0 on Ubuntu 12.04. I have performed live snapshots on VMs using "virsh snapshot-create-as" and then later re-merged the images together using "virsh blockpull". I am wondering how I can do a couple of other operations on the images while the VM is running. For example, VM1 is running from the snap3 image, with the following snapshot history (backing files):

[orig] <-- [snap1] <-- [snap2] <-- [snap3]

1. Can I revert VM1 to use snap2 while it is live, or must it be shut down? After shutting it down, is the best way to revert to snap2 to just edit the xml file and change the block device to point to snap2? Afterwards, I believe snap3 would become unusable and should be deleted?

2. If I would like to start a new VM from snap1, is there a way to extract a copy of this snapshot from the chain, to an independent image file? I tried to use "virsh blockcopy" but it returned this error:

# virsh blockcopy VM1 vda snap1.qcow2 --wait --verbose
error: Requested operation is not valid: domain is not transient

Thanks,
Andrew

On 01/31/2013 04:09 PM, Andrew Martin wrote:
Hello,
[Can you convince your mailer to wrap long lines?]
I recently compiled libvirt 1.0.1 and qemu 1.3.0 on Ubuntu 12.04. I have performed live snapshots on VMs using "virsh snapshot-create-as" and then later re-merged the images together using "virsh blockpull". I am wondering how I can do a couple of other operations on the images while the VM is running. For example, VM1 is running from the snap3 image, with the following snapshot history (backing files):
[orig] <-- [snap1] <-- [snap2] <-- [snap3]
Are you using the --disk-only and/or --memspec options when you use snapshot-create-as? Or put another way, are your snapshots internal or external?
1. Can I revert VM1 to use snap2 while it is live, or must it be shut down? After shutting it down, is the best way to revert to snap2 to just edit the xml file and change the block device to point to snap2? Afterwards, I believe snap3 would become unusable and should be deleted?
For internal snapshots, it should just work; when reverting, it shouldn't matter whether the domain is live or shut down, because you are going to change state anyway. Internal snapshots are not invalidated (the way qcow2 works, each internal snapshot increments a reference counter on all clusters in use at the time of the snapshot, with no intrinsic relation between snapshots; executing from the point of a given snapshot does a copy-on-write allocation for any cluster with a reference count larger than 1).

Peter and I kind of got stalled on the code to do reverts of external snapshots. You may want to take a snapshot just prior to reverting if you want to be able to toggle between snapshot branches rather than abandoning the current state that existed prior to the revert action. With external snapshots, reverting to an earlier point has two possibilities - modify the backing file, which invalidates all later files that were based on the backing file, or create a new qcow2 wrapper around the backing file.

One of the features I'd like to add (but which did not make it into 1.0.2) is the ability to do an atomic revert-and-create operation, where a new external snapshot is created in parallel to the existing snapshot, so that you aren't invalidating other branches of the snapshot tree. The other alternative is that a revert must discard any branches of the snapshot tree that get invalidated by modifying a backing file earlier in the chain.

But without full libvirt support for revert of external snapshots, some of these actions will require manipulations of domain xml and/or calls to qemu-img to effect the changes you want (in this case, while the domain is offline).
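For illustration, here is a minimal offline sketch of the "new qcow2 wrapper" approach against the example chain above (the image directory and file names are assumptions):

# with VM1 shut off, create a fresh overlay backed by snap2,
# leaving snap2 and the rest of the chain untouched
qemu-img create -f qcow2 \
  -o backing_file=/var/lib/libvirt/images/snap2.qcow2,backing_fmt=qcow2 \
  /var/lib/libvirt/images/snap2-branch.qcow2

# point the domain's <disk>/<source file=.../> at the new overlay
virsh edit VM1

virsh start VM1

Because the guest now writes only to snap2-branch.qcow2, snap3 remains intact as a separate branch rather than being invalidated.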
2. If I would like to start a new VM from snap1, is there a way to extract a copy of this snapshot from the chain, to an independent image file?
If you did an internal snapshot, then the only current way to extract the information from that snapshot is to have the guest offline, then use a series of qemu-img commands to temporarily switch which snapshot is active, clone the image, then undo the temporary switch. I have requested that qemu developers add a way to extract snapshot contents via an NBD server, even for an in-use image, but we aren't there yet (certainly not for qemu 1.4, and probably not even for qemu 1.5).

For external snapshots, all you have to do is create a new qcow2 file with the same backing image that you want to branch from.
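Concretely, a sketch of both extraction paths, with the guest shut off (file names follow the example chain; paths are assumptions). For an external snapshot such as snap1, either branch from it or flatten it into an independent image:

# branch: new empty overlay whose backing file is snap1
qemu-img create -f qcow2 \
  -o backing_file=/var/lib/libvirt/images/snap1.qcow2,backing_fmt=qcow2 \
  /var/lib/libvirt/images/vm2.qcow2

# flatten: qemu-img convert reads through the whole backing chain
# and writes a standalone image
qemu-img convert -O qcow2 /var/lib/libvirt/images/snap1.qcow2 \
  /var/lib/libvirt/images/snap1-standalone.qcow2

For an internal snapshot, the offline "temporary switch" sequence described above looks roughly like:

qemu-img snapshot -c tmp-current image.qcow2   # remember the current state
qemu-img snapshot -a snap1 image.qcow2         # switch to snap1
qemu-img convert -O qcow2 image.qcow2 snap1-standalone.qcow2
qemu-img snapshot -a tmp-current image.qcow2   # undo the temporary switch
qemu-img snapshot -d tmp-current image.qcow2   # drop the marker snapshot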
I tried to use "virsh blockcopy" but it returned this error:

# virsh blockcopy VM1 vda snap1.qcow2 --wait --verbose
error: Requested operation is not valid: domain is not transient
Yeah, until qemu 1.5 introduces persistent bitmaps, this is an unfortunate limitation in libvirt blockcopy. Libvirt refuses to do a blockcopy on a persistent domain, because current qemu can't restart a blockcopy operation; it is possible to temporarily make your domain transient (virsh undefine), then do the blockcopy, then restore the domain persistence (virsh define). Or wait until persistent blockcopy is added to qemu, and libvirt gains the counterpart code to make use of persistent bitmaps in order to allow a restartable blockcopy even across actions like 'virsh save'.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
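In script form, the temporary-transience workaround might look like this (the dump path and copy destination are assumptions; note the copy job must still be ended once it reaches the mirroring phase):

virsh dumpxml --inactive VM1 > /tmp/VM1.xml   # keep the persistent definition
virsh undefine VM1                            # running domain becomes transient
virsh blockcopy VM1 vda /var/lib/libvirt/images/snap1-copy.qcow2 --wait --verbose
virsh blockjob VM1 vda --abort                # end the job, keeping the copy as a clone
virsh define /tmp/VM1.xml                     # restore persistence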

Hi Eric,

----- Original Message -----
> From: "Eric Blake" <eblake@redhat.com>
> To: "Andrew Martin" <amartin@xes-inc.com>
> Cc: libvirt-users@redhat.com
> Sent: Thursday, January 31, 2013 6:25:42 PM
> Subject: Re: [libvirt-users] Managing Live Snapshots with Libvirt 1.0.1
>
> [Can you convince your mailer to wrap long lines?]

Done. :)

> Are you using the --disk-only and/or --memspec options when you use
> snapshot-create-as? Or put another way, are your snapshots internal
> or external?

Here's the full command I've been using:

virsh snapshot-create-as --domain "$VM_NAME" --name "$SNAP_NAME" --disk-only --atomic

Can you elaborate on the difference between internal and external snapshots? The snapshots I created with the above command are contained in separate qcow2 backing files, which I would think to be external, but snapshot-info shows them as internal?

From what I can see there are 3 different types of snapshots:

1. internal (offline) qcow2 snapshots created with qemu-img snapshot
2. external (live) snapshots with --disk-only
3. external (live) snapshots with --memspec and/or --diskspec

Or, am I misunderstanding what --disk-only and --diskspec do?

> For internal snapshots, it should just work; when reverting, it
> shouldn't matter whether the domain is live or shut down, because
> you are going to change state anyway. Internal snapshots are not
> invalidated [...]

Since they are separate copy-on-write snapshots, this means I could revert to snap1, make changes, and then later revert to snap3?

> [remainder on external snapshot reverts and extraction trimmed; see above]

Thanks,
Andrew

On 02/01/2013 11:12 AM, Andrew Martin wrote:
>> Are you using the --disk-only and/or --memspec options when you use
>> snapshot-create-as? Or put another way, are your snapshots internal
>> or external?
>
> Here's the full command I've been using:
> virsh snapshot-create-as --domain "$VM_NAME" --name "$SNAP_NAME" --disk-only --atomic

That's external.

> Can you elaborate on the difference between internal and external
> snapshots? The snapshots I created with the above command are contained
> in separate qcow2 backing files, which I would think to be external, but
> snapshot-info shows them as internal?

Any snapshot created without --disk-only and without --memspec is internal - it resides completely within an existing qcow2 image (no new files are created). All snapshots with --disk-only or --memspec are external; the domain state is now spread across multiple files.

If snapshot-info shows a --disk-only snapshot as internal, then that's a bug, and we need to fix it. What version of libvirt did you test?

> From what I can see there are 3 different types of snapshots:
> 1. internal (offline) qcow2 snapshots created with qemu-img snapshot
> 2. external (live) snapshots with --disk-only
> 3. external (live) snapshots with --memspec and/or --diskspec
>
> Or, am I misunderstanding what --disk-only and --diskspec do?

There's actually:

1. internal offline: internal qcow2 snapshots (disk-only) created with 'snapshot-create-as' with no flags for an offline guest [wired to 'qemu-img snapshot' under the hood]
2. internal online: internal qcow2 snapshots (disk and memory) created with 'snapshot-create-as' with no flags for an online guest [wired to the qemu 'savevm' HMP command]
3. external offline: qcow2 wrapper around any base file type, created with 'snapshot-create-as --disk-only' for an offline guest [wired to 'qemu-img create']
4. external online disk-only: qcow2 wrapper around any base file type, created with 'snapshot-create-as --disk-only' for an online guest [wired to the qemu 'transaction' QMP command]
5. external online memory: qcow2 wrapper around any base file type plus migration to file, created with 'snapshot-create-as --memspec' for an online guest [wired to the qemu 'migrate' and 'transaction' QMP commands]
6. external memory only: migration to file, created with 'virsh save' for an online guest [wired to the qemu 'migrate' QMP command]

There's also talk on the upstream qemu list about adding even more snapshot options (for example, 'savevm' is too bulky for creating internal qcow2 snapshots, so the goal is to allow an internal qcow2 snapshot without also grabbing guest memory; also, live migration to file wastes disk space on a guest that takes a long time to converge, so a new file format for storing guest memory would make live snapshots take less disk space).

>> For internal snapshots, it should just work; [...]
>
> Since they are separate copy-on-write snapshots, this means I could
> revert to snap1, make changes, and then later revert to snap3?

For internal snapshots, yes. For external snapshots, the act of reverting is not yet wired up nicely in libvirt 1.0.2; basically, doing a revert of an external snapshot needs to decide whether to create another snapshot (by adding an atomic revert-and-create action), or to invalidate all existing snapshots that descend below the point being reverted to.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
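To make the six variants concrete, a minimal sketch of how each style is requested (the domain, snapshot, and file names are assumptions):

# internal (types 1-2): no --disk-only, no --memspec; the guest's
# run state at snapshot time picks offline vs. online
virsh snapshot-create-as VM1 snap-internal

# external disk-only (types 3-4)
virsh snapshot-create-as VM1 snap-disk --disk-only --atomic

# external disk + memory (type 5)
virsh snapshot-create-as VM1 snap-full \
  --memspec file=/var/lib/libvirt/images/snap-full.mem,snapshot=external

# memory only (type 6)
virsh save VM1 /var/lib/libvirt/images/VM1.save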

----- Original Message -----
> From: "Eric Blake" <eblake@redhat.com>
> To: "Andrew Martin" <amartin@xes-inc.com>
> Cc: libvirt-users@redhat.com
> Sent: Friday, February 1, 2013 3:23:03 PM
> Subject: Re: [libvirt-users] Managing Live Snapshots with Libvirt 1.0.1
>
> If snapshot-info shows a --disk-only snapshot as internal, then that's
> a bug, and we need to fix it. What version of libvirt did you test?

I was using 1.0.1. I'm unable to reproduce this now so I guess it isn't a concern.

> There's actually:
> [the six snapshot types, trimmed; see Eric's list above]

Thanks for this clarification. When taking live snapshots, what does the --quiesce option do exactly? Does it call a sync on the guest to make sure the disks are in a consistent state? If so, does it only work for specific drivers, e.g. VirtIO disks? Would it be best to use this option when taking live snapshots to ensure that the snapshot disk is consistent?

Thanks,
Andrew

On 02/04/2013 02:37 PM, Andrew Martin wrote:
Thanks for this clarification. When taking live snapshots, what does the --quiesce option do exactly? Does it call a sync on the guest to make sure the disks are in a consistent state? If so, does it only work for specific drivers, e.g. VirtIO disks? Would it be best to use this option when taking live snapshots to ensure that the snapshot disk is consistent?
--quiesce says to send a message over the qemu-guest-agent channel to tell the guest to get its file systems into a consistent state. For it to be useful, you have to:

1. trust your guest (anything that involves guest cooperation can backfire if you don't trust the guest)
2. have the qemu-guest-agent channel wired up in the guest XML (there's a request to make this easier to do; but for now, http://libvirt.org/formatdomain.html#elementCharChannel documents how to set up org.qemu.guest_agent.0; see the sketch after this message)
3. have the guest agent actually running in the guest (so far, it has been ported to at least Windows and Linux, but as recently as Fedora 18, I raised a bug that the default live image build was not installing the guest agent by default. If your guest is running some other OS, then you will probably have to contribute patches to the qemu list to port the guest agent to your choice of guest)

Additionally, if you use the guest agent from (not yet released) qemu 1.4 or later, there was a recent addition in upstream qemu that added hooks so you can add wrapper actions (such as a database quiesce command) around the agent commands that freeze and thaw all file systems, for even more stability in the snapshot image.

Using --quiesce is therefore optional; if you use it, you are more likely to be able to boot your guest from the snapshot point in time without inconsistent file systems; if you do not use it, you are more likely to have to run fsck or similar to recover files that were in the middle of being modified at the time of the snapshot (since a disk-only snapshot doesn't include guest RAM state).

Meanwhile, with libvirt 1.0.1 and later, an external snapshot including RAM does not support the --quiesce option, for two reasons:

1. --quiesce requires that the guest be running in order to freeze its own file systems, but taking a snapshot of memory at the same time as a snapshot of the disks requires pausing the guest (even if the downtime is limited to the sub-second range by use of the --live flag).
2. if you have the RAM state available, then you have not lost any in-flight I/O operations that were in the guest RAM, so it is no longer quite as essential as it was for disk-only snapshots to get the disks into a stable state.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
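As a reference point for item 2 above, here is a minimal sketch of the agent channel element described on the formatdomain page linked in that item, together with the resulting quiesced snapshot call (the domain name, snapshot name, and socket path are assumptions):

<channel type='unix'>
  <source mode='bind' path='/var/lib/libvirt/qemu/VM1.agent'/>
  <target type='virtio' name='org.qemu.guest_agent.0'/>
</channel>

# with the channel defined and qemu-guest-agent running inside the guest:
virsh snapshot-create-as VM1 snap-q --disk-only --atomic --quiesce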