The right way to revert to external disk snapshots

Hello everyone,

I'm seeking guidance on *best practices* for virtual machine recovery using external disk snapshots, particularly in a storage environment with ZFS.

My current snapshot and recovery *workflow* involves:
- Keeping VM disks & state on a ZFS volume;
- Creating external KVM/Libvirt disk-only snapshots, resulting in deltas kept on the volume, next to the disk images;
- Capturing the entire VM state through ZFS snapshots;
- VM recovery through ZFS snapshot clones.

I am particularly interested in obtaining an app-consistent recovery, in which I need to revert to the KVM snapshot of the VM, to ensure the possible clean state offered by a quiesced snapshot. Reading other posts from the archive and forums, it is clear for me that I cannot simply revert to the VM's snapshot, if it's a disk-only one, and that I have to manage them manually.

Thus, my question is: *what is the best practice in order to recover the VM to the external disk snapshot that we have*?

*What I have tried* and worked but I'm not sure is the best practice: on a VM with only one snapshot, I've changed the disk source files (which were pointing to deltas), to the ones pointed by their backingStore source files, effectively making them use the disk state of the snapshot time. This only works for shut-off VMs, as live VMs cannot have their disk sources changed, of course. Thus, for powered on VMs in the use case with only one snapshot, I've chosen to use `virDomainBlockPull` in order to have the app-consistent state pulled on the current disk (which was and still is pointing to the delta).

*My concerns* on the approach I took regard, mostly, scalability and the safety of the whole process:
- I am not sure how I could revert again to the current snapshot with the operations I did: for powered off VMs, disk images will change once we start using the VM, and for powered on VMs, the blockpull will alter the deltas which the disks were pointing to;
- I don't see how I could apply this method in a scalable way, if the VM had more than one snapshot. At least for powered-on VMs.

Thus, I thought I should seek some advice from you guys and see if there's another, smarter way that I can do this.

Thanks a lot for your time,
Alex Serban
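
For readers following along, the shut-off workaround described in this message (pointing each disk at the file named by its backingStore) might look roughly like the sketch below. It only edits an XML string with the Python standard library and touches no libvirt API. The function name `repoint_disks_to_backing` and the file paths used with it are hypothetical, and it assumes the domain XML you feed it actually contains `<backingStore>` elements, which is worth checking for an offline domain.

```python
import xml.etree.ElementTree as ET


def repoint_disks_to_backing(domain_xml: str) -> str:
    """Return domain XML in which every disk that lists a backing file
    uses that backing file as its source (hypothetical helper)."""
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        backing = disk.find("backingStore")
        source = disk.find("source")
        if backing is None or source is None:
            continue
        backing_source = backing.find("source")
        if backing_source is None:
            # An empty <backingStore/> marks the end of the chain.
            continue

        # Point the disk at the backing file instead of the delta.
        source.attrib.clear()
        source.attrib.update(backing_source.attrib)
        if backing.get("type"):
            disk.set("type", backing.get("type"))

        # The backing file may use a different image format than the delta.
        fmt = backing.find("format")
        driver = disk.find("driver")
        if fmt is not None and driver is not None and fmt.get("type"):
            driver.set("type", fmt.get("type"))

        # Promote the rest of the chain one level up.
        nested = backing.find("backingStore")
        position = list(disk).index(backing)
        disk.remove(backing)
        if nested is not None:
            disk.insert(position, nested)
    return ET.tostring(root, encoding="unicode")
```

The returned XML would still have to be redefined for the shut-off domain (for example with `virsh define`). As the message notes, the backing image starts changing as soon as the VM runs on it again, so this is not a repeatable revert.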

On Sat, Jan 25, 2025 at 09:27:04 +0100, Alex Serban wrote:
> Hello everyone, I'm seeking guidance on *best practices* for virtual

Hi, your message got a bit malformatted (as a big blob of text), so I hope I don't forget any bits to reply to.

> machine recovery using external disk snapshots, particularly in a storage environment with ZFS.
>
> My current snapshot and recovery *workflow* involves:
> - Keeping VM disks & state on a ZFS volume;
> - Creating external KVM/Libvirt disk-only snapshots, resulting in deltas kept on the volume, next to the disk images;
> - Capturing the entire VM state through ZFS snapshots;
> - VM recovery through ZFS snapshot clones.
>
> I am particularly interested in obtaining an app-consistent recovery, in which I need to revert to the KVM snapshot of the VM, to ensure the possible clean state offered by a quiesced snapshot. Reading other posts from the archive and forums, it is clear for me that I cannot simply revert to the VM's snapshot, if it's a disk-only one, and that I have to manage them manually.

Support for external snapshots was recently completed, so you should be able to just revert to them. It was like you describe for a long time, but that shouldn't be the case any longer.

Note that with a 'disk-only' snapshot (if it was taken while the VM was running) you're reverting to a disk state but lose the VM state. It will be akin to pulling the power cord from a physical box.

> Thus, my question is: *what is the best practice in order to recover the VM to the external disk snapshot that we have*? *What I have tried* and worked but I'm not

As said, the best practice is to use a new-enough libvirt that allows reversion.

> sure is the best practice: on a VM with only one snapshot, I've changed the disk source files (which were pointing to deltas), to the ones pointed by their backingStore source files, effectively making them use the disk state of the snapshot time. This only works for shut-off VMs, as live VMs cannot have their disk sources changed, of course. Thus, for powered on VMs in the use case with only one snapshot, I've chosen to use `virDomainBlockPull` in order to have the app-consistent state pulled on the current disk (which was and still is pointing to the delta).
>
> *My concerns* on the approach I took regard, mostly, scalability and the safety of the whole process:
> - I am not sure how I could revert again to the current snapshot with the operations I did: for powered off VMs, disk images will change once we

So reversion of external snapshots as done in libvirt simply creates new overlay qcow2 images on top of the original images. That way it doesn't destroy any data.

> start using the VM, and for powered on VMs, the blockpull will alter the deltas which the disks were pointing to;
> - I don't see how I could apply this method in a scalable way, if the VM had more than one snapshot. At least for powered-on VMs.
>
> Thus, I thought I should seek some advice from you guys and see if there's another, smarter way that I can do this.
>
> Thanks a lot for your time,
> Alex Serban

Feel free to re-iterate questions if I didn't reply to everything, but please avoid a giant blob of text.
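
As a rough illustration of the "use a new-enough libvirt and just revert" advice, here is a minimal sketch using the libvirt Python bindings. The helper names and the `MIN_EXTERNAL_REVERT` threshold are assumptions made for this example: the 9.9 figure is taken from the follow-up message in this thread, not from this reply, and the same follow-up reports that disk-only snapshots were still refused on 10.5.0. The `libvirt` module is imported lazily so the version helper can be used without the bindings installed.

```python
from typing import Tuple

# Assumption for this sketch: first release with external snapshot revert,
# as stated later in this thread (not verified here).
MIN_EXTERNAL_REVERT = (9, 9, 0)


def decode_lib_version(number: int) -> Tuple[int, int, int]:
    """Decode libvirt's numeric version (major * 1,000,000 + minor * 1,000 + release)."""
    major, rest = divmod(number, 1_000_000)
    minor, release = divmod(rest, 1_000)
    return major, minor, release


def revert_to_snapshot(uri: str, domain_name: str, snapshot_name: str) -> None:
    """Revert a domain to a named snapshot, refusing on a libvirt that is too old."""
    import libvirt  # third-party bindings, imported only when actually reverting

    conn = libvirt.open(uri)
    try:
        version = decode_lib_version(conn.getLibVersion())
        if version < MIN_EXTERNAL_REVERT:
            raise RuntimeError("libvirt %d.%d.%d is older than %d.%d.%d"
                               % (version + MIN_EXTERNAL_REVERT))
        dom = conn.lookupByName(domain_name)
        snap = dom.snapshotLookupByName(snapshot_name, 0)
        # Default flags; libvirt decides the resulting domain state.
        dom.revertToSnapshot(snap, 0)
    finally:
        conn.close()
```

A call might look like `revert_to_snapshot("qemu:///system", "demo-vm", "snap1")`, where the domain and snapshot names are placeholders.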

Hello,

Thanks for the response and sorry for the big blob of text, I've actually formatted the email with great detail, but I don't really know what happened (I see it formatted well on the generated "attachment.html" on the forum thread).

I see the support for external snapshots was added for some time, indeed, from version 9.9, by the end of 2023. But still, what I'm running on is Libvirt `v10.5.0 (2024-07-01)`, and reverting to disk-only snapshots seems to still not be supported, as I am getting the following error when trying to do that: `Invalid target domain state 'disk-snapshot'. Refusing snapshot reversion`.

Going to the source code, I've tracked the part of the code raising this error on `qemu_snapshot.c`, and I've also found the following patch: https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/XQIJQ..., which is not yet adopted even in the code of the project's `master` branch. Applying this patch over v10.5 locally, made snapshot reversion on disk-only snapshots work for me.

Is the `qemu_snapshot: allow reverting to external disk only snapshot` patch what you were referring to earlier? Is it safe to apply it like I did?

All the best,
Alex Serban

On Wed, Jan 29, 2025 at 11:48:56 -0000, Alex Serban wrote:
> Hello,
>
> Thanks for the response and sorry for the big blob of text, I've actually formatted the email with great detail, but I don't really know what happened (I see it formatted well on the generated "attachment.html" on the forum thread).
>
> I see the support for external snapshots was added for some time, indeed, from version 9.9, by the end of 2023. But still, what I'm running on is Libvirt `v10.5.0 (2024-07-01)`, and reverting to disk-only snapshots seems to still not be supported, as I am getting the following error when trying to do that: `Invalid target domain state 'disk-snapshot'. Refusing snapshot reversion`.
>
> Going to the source code, I've tracked the part of the code raising this error on `qemu_snapshot.c`, and I've also found the following patch: https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/XQIJQ..., which is not yet adopted even in the code of the project's `master` branch. Applying this patch over v10.5 locally, made snapshot reversion on disk-only snapshots work for me.
>
> Is the `qemu_snapshot: allow reverting to external disk only snapshot` patch what you were referring to earlier? Is it safe to apply it like I did?

Oops, that patch fell through the cracks. I've ack'd it. Thanks for actually testing it. Obviously libvirt doesn't do patches to historical releases but it is likely that it will be applied to upstream as is and since you didn't get any trouble applying it to your version it should also work for you.
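
To tell in advance whether a given snapshot is of the disk-only kind discussed here, one option is to inspect its XML (as printed by `virsh snapshot-dumpxml`). The sketch below is a standard-library-only illustration. The function name `describe_snapshot` and the returned dictionary layout are made up for this example, and it assumes the usual `<domainsnapshot>` layout with a `<state>` element and per-disk `snapshot` attributes.

```python
import xml.etree.ElementTree as ET


def describe_snapshot(snapshot_xml: str) -> dict:
    """Summarise a <domainsnapshot> document (hypothetical helper)."""
    root = ET.fromstring(snapshot_xml)
    state = root.findtext("state", default="")
    # Disks captured as external overlays carry snapshot='external'.
    external_disks = [
        disk.get("name")
        for disk in root.findall("./disks/disk")
        if disk.get("snapshot") == "external"
    ]
    return {
        "name": root.findtext("name", default=""),
        "state": state,
        # 'disk-snapshot' is the state named in the error quoted in this thread.
        "disk_only": state == "disk-snapshot",
        "external_disks": external_disks,
    }
```

A snapshot for which `disk_only` is true and `external_disks` is non-empty is the case that, per this thread, an unpatched 10.5.0 refuses to revert to.
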
participants (2): Alex Serban, Peter Krempa