On 08/11/2011 08:11 AM, Kevin Wolf wrote:
I agree with you. It feels a bit backwards for snapshots, but it's
really the only reasonable thing to do if you're using external
snapshots. That you can't rename block devices is actually a very good
point, too.
There's one more point to consider: If creating a snapshot of foo.img
just creates a new bar.img, but I keep working on foo.img, I might
expect that by deleting bar.img I remove the snapshot, but foo.img keeps
working.
More ideas on this front:
One of the ideas of 'live snapshot' is to grab state that I can copy to
an independent backup, taking as much time as needed, with minimal
interruption to qemu. Given an original 'file' of any format, then we
can consider the sequence:
  1. rename file to file.tmp (assuming we figure out how to teach qemu
     about renames)
  2. use snapshot_blkdev to recreate file with file.tmp as its backing
     file
  3. in parallel:
     - copy file.tmp to file.snap
     - block pull the contents of file.tmp back into file
  4. when both tasks have completed, remove file.tmp
Now, I have created a snapshot file.snap, which can safely be deleted
without breaking 'file', and with minimal downtime to the qemu process.
The catches are: there is a window of time where the snapshot is still
in progress (that is, until both the copy to file.snap and the block
pull have completed); this forces 'file' to now be qcow2, even if it
started out raw; and rename() is not usable on block devices. And a
non-zero window of time between
starting the sequence and reaching a stable completion implies
ramifications to whether other commands would be locked out in the
meantime, or whether it can be broken into multiple steps with progress
checks along the way, whether events need to be exposed to track when
pieces complete, and so on.
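To make the sequence above concrete, here is a toy model of a backing chain in Python. It does not talk to qemu at all: the Image class and the snapshot_overlay, copy_image and block_pull helpers are hypothetical stand-ins for the steps named in the list, and the steps run one after another rather than in parallel. The point is only to show why file.tmp can go away once the pull has finished.

```python
class Image:
    """Toy image: maps cluster index -> bytes, optionally backed by another."""

    def __init__(self, name, clusters=None, backing=None):
        self.name = name
        self.clusters = dict(clusters or {})
        self.backing = backing

    def read(self, idx):
        # Walk down the chain until some layer has the cluster allocated.
        img = self
        while img is not None:
            if idx in img.clusters:
                return img.clusters[idx]
            img = img.backing
        return b"\0"  # unallocated in every layer

    def write(self, idx, data):
        self.clusters[idx] = data


def snapshot_overlay(base, new_name):
    """Stand-in for snapshot_blkdev: a new empty image backed by `base`."""
    return Image(new_name, backing=base)


def copy_image(src, new_name):
    """Stand-in for copying file.tmp to file.snap.

    Assumes `src` has no backing file of its own.
    """
    return Image(new_name, clusters=src.clusters)


def block_pull(img):
    """Copy clusters still served by the backing chain into `img`,
    then drop the backing link so `img` is self-contained."""
    layer = img.backing
    while layer is not None:
        for idx, data in layer.clusters.items():
            # Keep data already in `img` or pulled from a nearer layer.
            img.clusters.setdefault(idx, data)
        layer = layer.backing
    img.backing = None


tmp = Image("file.tmp", {0: b"A", 1: b"B"})  # 'file' after the rename
active = snapshot_overlay(tmp, "file")       # recreate 'file' on top of it
active.write(1, b"B2")                       # guest keeps writing
snap = copy_image(tmp, "file.snap")          # independent backup
block_pull(active)                           # 'file' no longer needs file.tmp
del tmp                                      # safe to remove file.tmp
```

After the pull, 'file' holds every cluster itself, and file.snap still holds the state from the moment of the snapshot, so either can be deleted without affecting the other.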
Another idea is that if qemu would ever gain a way to export the
contents of an internal snapshot or backing file (aka external
snapshot), independently of how that state differs from the current
state, then another operation would be:
  1. with qcow2 file, create an internal snapshot
  2. use new API to copy out the snapshot state into file.snap, while
     qemu is still actively modifying current state
  3. remove the internal snapshot
with a net result that appears the same as creating file.snap as an
external snapshot of a given state in time, but where the original qcow2
file is not impacted if file.snap is deleted.
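The same kind of toy model can illustrate this second sequence. Again, nothing here is a real qemu interface: Qcow2Toy and its methods are invented for the sketch, and export_snapshot merely stands in for the "new API" that does not exist yet. It shows the frozen state being copied out while the active state keeps changing.

```python
class Qcow2Toy:
    """Toy image with named internal snapshots (hypothetical model)."""

    def __init__(self, clusters):
        self.active = dict(clusters)
        self.snapshots = {}

    def create_internal(self, name):
        # Freeze a copy of the current cluster map under `name`.
        self.snapshots[name] = dict(self.active)

    def export_snapshot(self, name):
        # Yield frozen clusters one at a time, so the caller can
        # interleave guest writes with the export.
        frozen = self.snapshots[name]
        for idx in sorted(frozen):
            yield idx, frozen[idx]

    def delete_internal(self, name):
        del self.snapshots[name]


img = Qcow2Toy({0: b"A", 1: b"B"})
img.create_internal("s1")

file_snap = {}
for idx, data in img.export_snapshot("s1"):
    file_snap[idx] = data
    img.active[1] = b"B2"  # guest write while the export is running

img.delete_internal("s1")
```

The exported copy reflects the state at snapshot time, the active image reflects the later write, and the original file carries no leftover snapshot once the export is done.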
So working with renames might turn out to be tricky in many ways, and
not only technical ones.
Hopefully we're leaving enough flexibility to support these additional
snapshot modes, even if we don't implement everything in the first round.
> 2. It is possible to add a new libvirt API, virDomainSnapshotCreateFrom,
> which takes an existing snapshot as a child of the given snapshot passed
> in as its parent. This would combine the action of reverting to a
> disk-snapshot along with the xml argument necessary for naming a new
> live file, so that you could indeed support branching off the
> disk-snapshot with a user-specified or libvirt-generated new active file
> name without having to delete the existing children that were branched
> off the old active file name, and making the original base file the
> backing file to both branches. Unfortunately, adding a new API is out
> of the question for backporting purposes.
This API would be completely pointless with internal snapshots, right?
On the contrary, it might be useful as a way to convert an internal
snapshot into an external one. But yes, we can already do branching
children off internal snapshots without needing this new feature, so the
new feature's main point is for use in creating a branching child off an
external disk snapshot.
The ideal result would be an API where the user doesn't really have to
deal with internal vs. external snapshots other than setting the right
flag/XML option/whatever and libvirt would do the mapping to the
low-level functions.
Of course, if we want to avoid renames (for which there are good
reasons), then maybe we can't really get a unified API for internal and
external snapshots. In this case, maybe using completely different
functions to signal that we have different semantics might be appropriate.
This looks like it still needs a lot of thought.
Different functions at the qemu level, at the libvirt level, or both? I
agree that the ideal libvirt semantics is a single interface with enough
expressivity to properly map to all the underlying qemu options, where
libvirt correctly decides between migrate to disk and qemu-img, savevm,
snapshot_blkdev, block pull, or any other underlying operations, while
still properly rejecting any combinations that are possible in the XML
matrix but unsupported by current qemu capabilities.
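As a rough sketch of what "map the request and reject unsupported combinations" could look like, consider a lookup table keyed on the request shape. The table contents, the pick_operation name and the shape of qemu_caps are all assumptions made up for illustration; this is not how libvirt decides, only one way such a decision could be structured.

```python
# (disk kind, domain running, include memory state) -> underlying operation.
# Illustrative table only; it is not libvirt's real policy.
OPERATIONS = {
    ("internal", True, True): "savevm",
    ("internal", False, False): "qemu-img",
    ("external", True, False): "snapshot_blkdev",
}


def pick_operation(disk_kind, running, with_memory, qemu_caps):
    """Return the operation for a request, or raise ValueError.

    `qemu_caps` is a set of operation names assumed to be available.
    """
    key = (disk_kind, running, with_memory)
    op = OPERATIONS.get(key)
    if op is None:
        # Expressible in the XML matrix, but nothing implements it.
        raise ValueError(f"combination {key} has no mapping")
    if op not in qemu_caps:
        raise ValueError(f"{op} is not supported by this qemu")
    return op
```

A single entry point like this keeps the user-facing interface uniform while making the unsupported corners of the matrix fail early and explicitly.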
> 2a. But thinking about it a bit more, maybe we don't need a new API, but
> just an XML enhancement to the existing virDomainSnapshotCreateXML!
> That is, if I specify:
> <domainsnapshot>
> <name>branched</name>
> <parent>
> <name>disk-snapshot</name>
> </parent>
> <disks>...</disks>
> </domainsnapshot>
>
> then we can accomplish your goal, without any qemu changes, and without
> any new libvirt API. That is, right now, <parent> is an output-only
> aspect of snapshot xml, but by allowing it to be an input element
> (probably requiring the use of a new flag,
> VIR_DOMAIN_SNAPSHOT_CREATE_BRANCH), then it is possible to both revert
> to the state of the old snapshot and specify the new file name to use to
> collect the branched delta data from that point in time. It also means
> that creation of a branched snapshot would have to learn some of the
> same flags as reverting to a snapshot (can you create the branch as well
> as run a new qemu process?) I'll play with the ideas, once I get the
> groundwork of this RFC done first.
>
> Thanks for forcing me to think about it!
Yes, this sounds like a nice solution for this case, and it looks
consistent with your existing proposal.
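For reference, the XML from 2a could be assembled on the client side roughly as follows. This only builds the document with Python's standard library; the helper name branched_snapshot_xml is made up, the <disks> content is left empty because the message elides it, and accepting <parent> as input (with a VIR_DOMAIN_SNAPSHOT_CREATE_BRANCH flag) is still just the proposal under discussion.

```python
import xml.etree.ElementTree as ET


def branched_snapshot_xml(name, parent_name):
    """Build a <domainsnapshot> that names an existing snapshot as parent."""
    root = ET.Element("domainsnapshot")
    ET.SubElement(root, "name").text = name
    parent = ET.SubElement(root, "parent")
    ET.SubElement(parent, "name").text = parent_name
    # Per-disk details are elided in the original message; left empty here.
    ET.SubElement(root, "disks")
    return ET.tostring(root, encoding="unicode")


xml_desc = branched_snapshot_xml("branched", "disk-snapshot")
```

The resulting string is what would be handed to virDomainSnapshotCreateXML if the proposed input semantics for <parent> were adopted.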
It still doesn't change anything for the fundamental problem that you
pointed me at, that internal snapshots give you different semantics than
external snapshots. So I think this is where we need some more discussion.
I guess at this point, my biggest concern is whether my RFC locks out
any useful extensions, or if it still looks like we have enough
flexibility by adding new XML constructs to cover new cases later on,
while we wait for resolution of additional discussion on these sorts of
internal vs. external issues.
--
Eric Blake eblake(a)redhat.com +1-801-349-2682
Libvirt virtualization library
http://libvirt.org