Hi, all.
The outcome of the "[RFC] live migration of VMs with internal snapshots"
discussion is that it would be nice to be able to migrate images with internal
snapshots while keeping their internal snapshot structure intact. The current
qemu drive-mirror functionality drops this structure. It is possible to work
around the issue by taking an external snapshot on the source and then
blockcommitting it back on the destination after migration.
The technique of creating a temporary external snapshot could be useful in
other cases too, for example when migrating with disks on a
non-cache-coherent shared filesystem, AFAIU. Thus I want to consider (and, if
it is reasonable, implement) this functionality separately from the task of
migrating with internal snapshots, where we have the additional issue of
moving snapshot metadata to the destination.
The whole design question comes from the fact that blockcommit can fail. The
failure itself is not a problem since, AFAIK, committing the active image does
not invalidate it and the commit can be restarted later. So the question is
when and how to handle it. The failure can happen on the source, when we roll
back after an unsuccessful migration, or on the destination, where at the
finish phase we can no longer fail the migration if blockcommit fails. I see
two options.
1. Leave this situation to the mgmt. Then we need to expose the external
snapshot in the domain config and somehow signal that the mgmt needs to take
care of the situation. However, exposing this technical snapshot to the upper
level casts doubt on the whole idea of doing the workaround in libvirt itself.
2. Have some kind of background task in libvirt that will retry the
blockcommit according to some policy. Again, this introduces a policy, which
is normally considered to reside in the mgmt, and a background task, which
makes libvirt mgmt-like too.
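
To make option 2 more concrete, here is a minimal sketch of what such a retry
policy could look like. It is not libvirt code: try_commit is a hypothetical
callable standing in for "start blockcommit and wait for the result", and the
attempt limit and backoff values are arbitrary assumptions.

```python
import time
from typing import Callable


def commit_with_retry(try_commit: Callable[[], bool],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry a commit attempt with exponential backoff.

    try_commit is a hypothetical stand-in for one blockcommit attempt;
    it returns True on success and False on failure.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if try_commit():
            return True
        if attempt < max_attempts:
            # A failed commit of the active image leaves it valid (per the
            # text above), so it is safe to wait and simply try again.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    # Policy exhausted: someone still has to decide what happens next.
    return False
```

Even in this toy form the open question is visible: when the function returns
False, the temporary snapshot is still there and somebody has to own it.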
Either way we will need to take special care while we have an uncommitted
temporary snapshot. For example, if we take an internal snapshot at this
time, it will live in the temporary snapshot and melt away once the
temporary snapshot is committed. Other block operations are probably
affected too in one way or another. We can disable such operations
while the temporary snapshot is uncommitted. With mgmt management this is the
job of the mgmt; with libvirt management it has to be done in libvirt.
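
Disabling operations while the temporary snapshot is uncommitted could be
sketched roughly as follows. All names here are hypothetical and only
illustrate the idea. The set of refused operations is an assumption too: the
text above names only internal snapshots explicitly.

```python
class OperationBlocked(Exception):
    """Raised when an operation is refused due to a pending temporary snapshot."""


# Assumed list of operations to refuse; only "internal-snapshot" comes from
# the text above, the other entries are illustrative guesses.
BLOCKED_WHILE_TEMP_SNAPSHOT = {"internal-snapshot", "block-copy", "block-pull"}


class DiskState:
    """Hypothetical per-disk state tracking an uncommitted temporary snapshot."""

    def __init__(self) -> None:
        self.temp_snapshot_pending = False

    def check_allowed(self, operation: str) -> None:
        # Refuse operations whose result would be lost or broken by the
        # later commit of the temporary snapshot.
        if self.temp_snapshot_pending and operation in BLOCKED_WHILE_TEMP_SNAPSHOT:
            raise OperationBlocked(
                f"{operation} is disabled until the temporary snapshot is committed"
            )
```

The flag would be set when the temporary snapshot is created and cleared only
after a successful commit, so a failed commit keeps the operations disabled.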
All that said, I guess the first approach is not the way to go, as it fails to
achieve the goal of hiding the details from the mgmt layer. As to the second
one, I'd be glad to hear the opinions of the community.
Nikolay