[libvirt] [RFC] live migration of VMs with internal snapshots

Hi,

As far as I understand, there is currently no way to live migrate QEMU VMs that have internal snapshots, because live migration works via QEMU drive mirroring, which in turn mirrors only the shallow block layer, effectively losing existing embedded snapshots. The problem could be fixed if we created an external delta for all disks with internal snapshots, live migrated such VMs, and then automatically merged the deltas back into the original images.

That said, I would very much like to know your opinion on the matter, and whether this approach is acceptable. If so, I or one of my colleagues will send a patchset to fix this; otherwise, what solution to this problem would you recommend?

Thanks,
Maxim
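
The proposal applies only to disks that actually carry internal snapshots, so a management layer would first have to find them. A minimal sketch of that check follows. It assumes the JSON printed by `qemu-img info --output=json` contains a "snapshots" array when an image has internal snapshots; the function name, the image paths, and the sample data are hypothetical.

```python
import json


def disks_with_internal_snapshots(image_infos):
    """Return the image paths whose info JSON lists internal snapshots.

    image_infos maps an image path to the JSON text assumed to come from
    `qemu-img info --output=json <path>`.
    """
    needing_delta = []
    for path, info_json in image_infos.items():
        info = json.loads(info_json)
        # A missing or empty "snapshots" entry means no internal snapshots.
        if info.get("snapshots"):
            needing_delta.append(path)
    return sorted(needing_delta)


# Hypothetical sample data standing in for real qemu-img output.
SAMPLE = {
    "/var/lib/libvirt/images/root.qcow2": json.dumps(
        {"format": "qcow2", "snapshots": [{"id": "1", "name": "before-upgrade"}]}
    ),
    "/var/lib/libvirt/images/data.qcow2": json.dumps({"format": "qcow2"}),
}

if __name__ == "__main__":
    print(disks_with_internal_snapshots(SAMPLE))
```

Only the images returned here would need a temporary external delta before migration; the others could be migrated as they are today.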

On Tue, Apr 26, 2016 at 06:26:59AM +0300, Maxim Nestratov wrote:
> Hi,
>
> As far as I understand, there is currently no way to live migrate QEMU VMs that have internal snapshots, because live migration works via QEMU drive mirroring, which in turn mirrors only the shallow block layer, effectively losing existing embedded snapshots. The problem could be fixed if we created an external delta for all disks with internal snapshots, live migrated such VMs, and then automatically merged the deltas back into the original images.
>
> That said, I would very much like to know your opinion on the matter, and whether this approach is acceptable. If so, I or one of my colleagues will send a patchset to fix this; otherwise, what solution to this problem would you recommend?

With modern QEMU we do the storage migration using an NBD server that is built-in to QEMU. Now this only knows about disks that are currently attached to the VM. I wonder though if we could simply add all the snapshots to the running VM, but not connect them to any guest device.

ie, we'd do blockdev_add, but would *not* do the corresponding device_add. The built-in NBD server would then be able to export them in the same way as the other disks.

The other idea would be to just run a standalone qemu-nbd process to export them, but I think that's a bit more messy because it means we'd have to manage a bunch more external processes and allocate extra net connections.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Apr 26, 2016 at 09:19:44AM +0100, Daniel P. Berrange wrote:
> With modern QEMU we do the storage migration using an NBD server that is built-in to QEMU. Now this only knows about disks that are currently attached to the VM. I wonder though if we could simply add all the snapshots to the running VM, but not connect them to any guest device.
>
> ie, we'd do blockdev_add, but would *not* do the corresponding device_add. The built-in NBD server would then be able to export them in the same way as the other disks.

Sigh. Ignore this. You're talking internal, not external snapshots.

Regards,
Daniel

There seems to be an old discussion here[1] on this topic:

"You are not the first to request this - libvirt would also like the ability to have read-only access into the contents of an internal snapshot while the rest of qemu continues to write into the image." (Eric Blake)

When I asked if this was possible with today's QEMU, Kevin Wolf (QEMU block layer maintainer) on #qemu IRC said:

"No, this requires some non-trivial code design changes and was never considered important enough. This would also require writes to those snapshots on the destination, by the way (which wouldn't be impossible if the infrastructure for having multiple active snapshots at the same time were there)"

[1] http://lists.nongnu.org/archive/html/qemu-devel/2012-11/msg00402.html -- live migration which includes previos snapshot

--
/kashyap

26.04.2016 13:00, Kashyap Chamarthy wrote:
> There seems to be an old discussion here[1] on this topic:
>
> "You are not the first to request this - libvirt would also like the ability to have read-only access into the contents of an internal snapshot while the rest of qemu continues to write into the image." (Eric Blake)
>
> When I asked if this was possible with today's QEMU, Kevin Wolf (QEMU block layer maintainer) on #qemu IRC said:
>
> "No, this requires some non-trivial code design changes and was never considered important enough. This would also require writes to those snapshots on the destination, by the way (which wouldn't be impossible if the infrastructure for having multiple active snapshots at the same time were there)"
>
> [1] http://lists.nongnu.org/archive/html/qemu-devel/2012-11/msg00402.html -- live migration which includes previos snapshot

Hmm. Interesting. But what I'm talking about doesn't involve QEMU much. We create a temporary external delta for images with internal snapshots on the libvirt side, then merge it on the libvirt side; thus, QEMU shouldn't be affected at all, as far as I understand.

Copying my colleagues to the thread.

On 26/04/16 16:34, "Maxim Nestratov" <mnestratov@virtuozzo.com> wrote:
> Hmm. Interesting. But what I'm talking about doesn't involve QEMU much. We create a temporary external delta for images with internal snapshots on the libvirt side, then merge it on the libvirt side; thus, QEMU shouldn't be affected at all, as far as I understand.

Right now, we are trying the following workaround for this issue:
1) create an external snapshot (without metadata)
2) undefine the internal snapshots' metadata (because libvirt prohibits migration if it detects snapshots)
3) copy the metadata & backing file (the primary qcow2 data file with the internal snapshots) to the destination
4) migrate the VM using libvirt
5) define the metadata back
6) block-commit, to merge the delta into the original qcow2 image.

Things would be much simpler if they were done in libvirt rather than in a management app on top of it.

Thank you,
Dmitry.

> Copying my colleagues to the thread.
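
For readers who want to see the six workaround steps spelled out, here is a rough sketch that only builds the corresponding `virsh` command lines and executes nothing. The message does not say which commands or flags are actually used, so every choice here is an assumption: `--copy-storage-inc` for the migration, saving the output of `snapshot-dumpxml` to `<name>.xml` files, and all function, domain, snapshot, and URI names. The snapshot list is assumed to be ordered parent first, since libvirt needs a parent redefined before its children.

```python
def workaround_plan(domain, disk_target, dest_uri, internal_snapshots,
                    delta_name="migr-delta"):
    """Return (description, argv) pairs for the workaround; argv is None
    for steps that are not a virsh call. Nothing is executed here."""
    plan = []

    # 1) External, disk-only snapshot with no libvirt metadata.
    plan.append(("create external delta",
                 ["virsh", "snapshot-create-as", domain, delta_name,
                  "--disk-only", "--no-metadata"]))

    # 2) Save, then drop, the metadata of each internal snapshot.
    #    The XML printed by snapshot-dumpxml is assumed to be saved
    #    by the caller as <name>.xml.
    for snap in internal_snapshots:
        plan.append((f"save metadata of {snap}",
                     ["virsh", "snapshot-dumpxml", domain, snap,
                      "--security-info"]))
        plan.append((f"undefine metadata of {snap}",
                     ["virsh", "snapshot-delete", domain, snap,
                      "--metadata"]))

    # 3) File copy; the transfer tool is left to the management app.
    plan.append(("copy metadata and backing file to destination", None))

    # 4) Live migration; incremental storage copy is an assumption,
    #    chosen because the backing file is already on the destination.
    plan.append(("live migrate",
                 ["virsh", "migrate", "--live", "--copy-storage-inc",
                  domain, dest_uri]))

    # 5) Redefine the saved metadata on the destination, parents first.
    for snap in internal_snapshots:
        plan.append((f"redefine metadata of {snap}",
                     ["virsh", "-c", dest_uri, "snapshot-create", domain,
                      f"{snap}.xml", "--redefine"]))

    # 6) Merge the temporary delta back into the original image.
    plan.append(("merge delta into original image",
                 ["virsh", "-c", dest_uri, "blockcommit", domain,
                  disk_target, "--active", "--pivot", "--wait"]))
    return plan


if __name__ == "__main__":
    for text, argv in workaround_plan("vm1", "vda", "qemu+ssh://dst/system",
                                      ["snap1"]):
        print(text, "->", " ".join(argv) if argv else "(manual step)")
```

Keeping the plan as data makes it easy to review before anything is run, and it shows how many separate operations the management app has to coordinate today.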

On 04/26/2016 07:54 AM, Dmitry Mishin wrote:
>> Hmm. Interesting. But what I'm talking about doesn't involve QEMU much. We create a temporary external delta for images with internal snapshots on the libvirt side, then merge it on the libvirt side; thus, QEMU shouldn't be affected at all, as far as I understand.
>
> Right now, we are trying the following workaround for this issue:
> 1) create an external snapshot (without metadata)
> 2) undefine the internal snapshots' metadata (because libvirt prohibits migration if it detects snapshots)
> 3) copy the metadata & backing file (the primary qcow2 data file with the internal snapshots) to the destination
> 4) migrate the VM using libvirt
> 5) define the metadata back
> 6) block-commit, to merge the delta into the original qcow2 image.

Yes, that looks like a reasonable workaround.

> Things would be much simpler if they were done in libvirt rather than in a management app on top of it.

It would indeed be easier - but someone has to contribute the patches for it.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

26.04.2016 18:38, Eric Blake wrote:
> On 04/26/2016 07:54 AM, Dmitry Mishin wrote:
>>> Hmm. Interesting. But what I'm talking about doesn't involve QEMU much. We create a temporary external delta for images with internal snapshots on the libvirt side, then merge it on the libvirt side; thus, QEMU shouldn't be affected at all, as far as I understand.
>> Right now, we are trying the following workaround for this issue:
>> 1) create an external snapshot (without metadata)
>> 2) undefine the internal snapshots' metadata (because libvirt prohibits migration if it detects snapshots)
>> 3) copy the metadata & backing file (the primary qcow2 data file with the internal snapshots) to the destination
>> 4) migrate the VM using libvirt
>> 5) define the metadata back
>> 6) block-commit, to merge the delta into the original qcow2 image.
> Yes, that looks like a reasonable workaround.
>
>> Things would be much simpler if they were done in libvirt rather than in a management app on top of it.
> It would indeed be easier - but someone has to contribute the patches for it.

Great! We certainly will.

participants (5)
- Daniel P. Berrange
- Dmitry Mishin
- Eric Blake
- Kashyap Chamarthy
- Maxim Nestratov