Re: [libvirt] [Qemu-devel] [RFC] live snapshot, live merge, live block migration

On 05/20/2011 03:19 PM, Stefan Hajnoczi wrote:
> I'm interested in what the API for snapshots would look like.
> Specifically, how does user software do the following:
> 1. Create a snapshot
> 2. Delete a snapshot
> 3. List snapshots
> 4. Access data from a snapshot

There are plenty of options there:
- Run an (unrelated) VM and hotplug the snapshot as an additional disk (see the sketch below)
- Use v2v (libguestfs)
- Boot the VM with the disk read-only
- Plenty more
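
For illustration, here is a minimal libvirt-python sketch of that first option, hot-plugging a snapshot image into a running backup VM as a read-only disk. The domain name, image path, and target device are placeholders, not anything from this thread:

    import libvirt

    DISK_XML = """
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/guest-snap1.qcow2'/>
      <target dev='vdb' bus='virtio'/>
      <readonly/>
    </disk>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("backup-appliance")   # hypothetical appliance VM
    # Attach the snapshot to the running appliance as a read-only virtio disk.
    dom.attachDeviceFlags(DISK_XML, libvirt.VIR_DOMAIN_AFFECT_LIVE)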

> 5. Restore a VM from a snapshot
> 6. Get the dirty blocks list (for incremental backup)

It might also be needed for additional purposes such as efficient delta sync across sites or other storage operations (dedup, etc.).

> We've discussed image format-level approaches, but I think the scope of the API
> should cover the several levels at which snapshots are implemented:
> 1. Image format - image file snapshot (Jes, Jagane)
> 2. Host file system - ext4 and btrfs snapshots
> 3. Storage system - LVM or SAN volume snapshots
>
> It will be hard to take advantage of more efficient host file system or storage
> system snapshots if they are not designed in now.

I agree, but it can also be a chicken-and-egg problem. Actually, 1/2/3/5 already work today regardless of live snapshots.
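
As a rough illustration, 1/2/3/5 map onto the existing virDomainSnapshot calls in libvirt-python along these lines ("guest1" and "snap1" are placeholder names, error handling omitted):

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("guest1")              # placeholder domain name

    # 1. Create a snapshot; for a running qcow2 guest this is a full system checkpoint.
    dom.snapshotCreateXML("<domainsnapshot><name>snap1</name></domainsnapshot>", 0)

    # 3. List snapshots.
    print(dom.snapshotListNames(0))

    # 5. Restore the VM from the snapshot.
    snap = dom.snapshotLookupByName("snap1", 0)
    dom.revertToSnapshot(snap, 0)

    # 2. Delete the snapshot (and its stored state).
    snap.delete(0)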

> Is anyone familiar enough with the libvirt storage APIs to draft an extension
> that adds snapshot support? I will take a stab at it if no one else wants to
> try it.

I have added libvirt-list and Ayal Baron from vdsm to the thread. What you're asking for goes even beyond snapshots; it is the whole management of VM images. Doing the operations above is simple, but an enterprise virtualization solution also needs to lock the NFS/SAN images, handle failures of the VM, SAN, and management layer, keep the snapshot information in the management DB, and so on. Today this is managed by a combination of rhev-m/vdsm and libvirt. I agree it would have been nice to have such a common single-entry-point interface.

On Sun, May 22, 2011 at 10:52 AM, Dor Laor <dlaor@redhat.com> wrote:
> On 05/20/2011 03:19 PM, Stefan Hajnoczi wrote:
>> I'm interested in what the API for snapshots would look like.
>> Specifically, how does user software do the following:
>> 1. Create a snapshot
>> 2. Delete a snapshot
>> 3. List snapshots
>> 4. Access data from a snapshot
>
> There are plenty of options there:
> - Run an (unrelated) VM and hotplug the snapshot as an additional disk

This is the backup appliance VM model, and it makes it possible to move the backup application to where the data is (or not, if you have a SAN and decide to spin up the appliance VM on another host). This should be perfectly doable if snapshots are "volumes" at the libvirt level.

A special case of the backup appliance VM is using libguestfs to access the snapshot from the host. This gives both block-level and file system-level access, along with the OS detection APIs that libguestfs provides.

If snapshots are "volumes" at the libvirt level, then it is also possible to use virStorageVolDownload() to stream the entire snapshot through libvirt:
http://libvirt.org/html/libvirt-libvirt.html#virStorageVolDownload

Summarizing, here are three access methods that integrate with libvirt and cover many use cases:

1. Backup appliance VM. Add a read-only snapshot volume to a backup appliance VM. If shared storage (e.g. a SAN) is available, the appliance can run on any host; otherwise it must run on the host where the snapshot resides.

2. Libguestfs client on the host. Launch libguestfs with the read-only snapshot volume. The backup application runs directly on the host and has both block and file system access to the snapshot.

3. Download the snapshot to a remote host for backup processing. Use the virStorageVolDownload() API to download the snapshot onto a libvirt client machine. Dirty block tracking is still useful here, since virStorageVolDownload() supports <offset, length> arguments. (A download sketch follows below.)
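
As a rough sketch of method 3, assuming the snapshot really is exposed as a storage volume (the pool, volume, and output file names below are placeholders), the existing virStorageVolDownload() binding can stream it like this:

    import libvirt

    conn = libvirt.open("qemu:///system")
    pool = conn.storagePoolLookupByName("default")            # placeholder pool
    vol = pool.storageVolLookupByName("guest1-snap1.qcow2")   # placeholder snapshot volume

    stream = conn.newStream(0)
    # offset=0, length=0 streams the whole volume; with a dirty-block list the
    # backup tool could instead issue one download per <offset, length> extent.
    vol.download(stream, 0, 0, 0)

    with open("guest1-snap1.backup", "wb") as out:
        while True:
            data = stream.recv(256 * 1024)
            if not data:          # end of stream
                break
            out.write(data)
    stream.finish()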

>> 5. Restore a VM from a snapshot

Simplest option: virStorageVolUpload().
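
A matching restore sketch with virStorageVolUpload(), again with placeholder names; it assumes the guest is shut off while its volume is rewritten and that the backup was taken with virStorageVolDownload() as above:

    import libvirt

    conn = libvirt.open("qemu:///system")
    vol = conn.storagePoolLookupByName("default").storageVolLookupByName("guest1.qcow2")

    stream = conn.newStream(0)
    vol.upload(stream, 0, 0, 0)   # offset=0, length=0: rewrite the whole volume

    # Feed the previously downloaded image back into the volume.
    with open("guest1-snap1.backup", "rb") as src:
        while True:
            chunk = src.read(256 * 1024)
            if not chunk:
                break
            while chunk:          # send() may write less than requested
                sent = stream.send(chunk)
                chunk = chunk[sent:]
    stream.finish()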

>> Is anyone familiar enough with the libvirt storage APIs to draft an extension
>> that adds snapshot support? I will take a stab at it if no one else wants to
>> try it.
>
> I have added libvirt-list and Ayal Baron from vdsm to the thread. What you're
> asking for goes even beyond snapshots; it is the whole management of VM images.
> Doing the operations above is simple, but an enterprise virtualization solution
> also needs to lock the NFS/SAN images, handle failures of the VM, SAN, and
> management layer, keep the snapshot information in the management DB, and so on.
> Today this is managed by a combination of rhev-m/vdsm and libvirt. I agree it
> would have been nice to have such a common single-entry-point interface.

Okay, the user API seems to be one layer above libvirt.

Stefan

On Mon, May 23, 2011 at 2:02 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> [...]
> Summarizing, here are three access methods that integrate with libvirt and
> cover many use cases:
> 1. Backup appliance VM.
> 2. Libguestfs client on the host.
> 3. Download the snapshot to a remote host for backup processing
>    (virStorageVolDownload()).

Jagane, what do you think about these access methods? What does your custom protocol integrate with today - do you have a custom non-libvirt KVM management stack?

Stefan

On 5/27/2011 9:46 AM, Stefan Hajnoczi wrote:
> [...]
> Jagane, what do you think about these access methods? What does your custom
> protocol integrate with today - do you have a custom non-libvirt KVM
> management stack?

Hello Stefan,

The current livebackup_client simply creates a backup of the VM on the backup server. It can save the backup either as a complete image, for a quick start of the VM on the backup server, or as a full image plus n incremental backup redo files. The "full + n incremental redo" form is useful if you want to store the backup on tape.

I don't have a full backup management stack yet. If livebackup_client were available as part of KVM, it would become the command line utility that a backup management stack drives. My own interest is in using livebackup_client to integrate all management functions into OpenStack; everything built into OpenStack will be built with self service in mind. However, other enterprise backup management stacks, such as Symantec's, could also be enhanced to use livebackup_client to extract the backup from the VM host.

How does this map onto the access mechanisms above? Let me see.

1. Backup appliance VM: a backup appliance VM can be started up and the livebackup images connected to it. The limitation is that the appliance VM must be started on the backup server, where the livebackup image resides on a local disk.

2. Libguestfs client on host: this too is possible. The restriction is that libguestfs must run on the backup server, not on the VM host.

3. Download the snapshot to a remote host for backup processing: this is the native method for livebackup.

Thanks,
Jagane
Participants (3):
- Dor Laor
- Jagane Sundar
- Stefan Hajnoczi