[libvirt] [RFC] Image Fleecing for Libvirt (BZ 955734, 905125)

Richard W.M. Jones

16 Jul 2013

6:04 a.m.

On Mon, Jul 15, 2013 at 05:57:12PM +0800, Fam Zheng wrote:

...

Hi all,

QEMU-KVM BZ 955734, and libvirt BZ 905125 are about feature "Read-only point-in-time throwaway snapshot". The development is ongoing on upstream, which implements the core functionality by QMP command drive-backup. I want to demonstrate the HMP/QMP commands here for image fleecing tasks (again) and make sure this interface looks ready and satisfying from Libvirt point of view.

We get cheap point-in-time snapshot, and export it through built in NBD server, by commands described below:

1. qemu-img create -f qcow2 -o backing_file=RUNNING-VM.img BACKUP.qcow2

(although the backing_file option is not honoured in the next step because we *override* backing file with an existing BlockDriverState, giving it here does no harm and also makes sure the created image is of right size.)

2. (HMP) drive_add backing=ide0-hd0,file=BACKUP.qcow2,id=target0,if=none

(where ide0-hd0 is the running BlockDriverState name for RUNNING-VM.img)

3. (QMP) drive-backup device=ide0-hd0 mode=drive sync=none target=target0

(NewImageMode 'drive' means target is looked up as a device id, sync mode 'none' means don't copy any data except copy-on-write the point in time snapshot data)

4. (QMP) nbd-server-add device=target0

When image fleecing done:

1. (QMP) block-job-complete device=ide0-hd0

2. (HMP) drive_del target0

3. rm BACKUP.qcow2

Note: HMP drive_add/drive_del has no counterpart in QMP now but a new command blockdev-add to do similar things is WIP, which can be an alternative in QMP flavor.
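As a rough illustration, the sequence above could be written out as QMP messages like this. It is only a sketch: the helper names (qmp, hmp, fleecing_setup, fleecing_teardown) are made up for the example, the HMP commands are wrapped in QMP's human-monitor-command passthrough, and mode "drive" is the NewImageMode value proposed in this message, which may not match what ends up upstream. The NBD server itself is assumed to have been started beforehand (for example with nbd-server-start), which the step list does not show.

```python
import json


def qmp(command, **arguments):
    """Return one QMP request as a JSON string."""
    message = {"execute": command}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message)


def hmp(command_line):
    """Wrap an HMP command line in QMP's human-monitor-command."""
    return qmp("human-monitor-command", **{"command-line": command_line})


def fleecing_setup(source="ide0-hd0", target="target0", image="BACKUP.qcow2"):
    """Steps 2-4: attach the target, start the backup job, export over NBD."""
    return [
        # HMP command line copied from step 2 of the proposal.
        hmp(f"drive_add backing={source},file={image},id={target},if=none"),
        # mode="drive" is the value proposed in this thread (an assumption).
        qmp("drive-backup", device=source, mode="drive", sync="none",
            target=target),
        qmp("nbd-server-add", device=target),
    ]


def fleecing_teardown(source="ide0-hd0", target="target0"):
    """Teardown steps 1-2; removing BACKUP.qcow2 is left to the caller."""
    return [
        qmp("block-job-complete", device=source),
        hmp(f"drive_del {target}"),
    ]


if __name__ == "__main__":
    for line in fleecing_setup() + fleecing_teardown():
        print(line)
```

Each string is one request to write to the QMP socket after the usual capabilities handshake; the sketch does not read replies or handle errors.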

Any comments are welcome!

-- Best regards, Fam Zheng

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


Richard W.M. Jones

16 Jul

6:35 a.m.

[Sorry for the odd quoting. I forwarded the original message from Fam Zheng so this feature could be discussed in public. This is my reply.]

On Mon, Jul 15, 2013 at 05:57:12PM +0800, Fam Zheng wrote:

...

Hi all,

QEMU-KVM BZ 955734, and libvirt BZ 905125 are about feature "Read-only point-in-time throwaway snapshot". The development is ongoing on upstream, which implements the core functionality by QMP command drive-backup. I want to demonstrate the HMP/QMP commands here for image fleecing tasks (again) and make sure this interface looks ready and satisfying from Libvirt point of view.

We get cheap point-in-time snapshot, and export it through built in NBD server, by commands described below:

1. qemu-img create -f qcow2 -o backing_file=RUNNING-VM.img BACKUP.qcow2

(although the backing_file option is not honoured in the next step because we *override* backing file with an existing BlockDriverState, giving it here does no harm and also makes sure the created image is of right size.)

2. (HMP) drive_add backing=ide0-hd0,file=BACKUP.qcow2,id=target0,if=none

(where ide0-hd0 is the running BlockDriverState name for RUNNING-VM.img)

3. (QMP) drive-backup device=ide0-hd0 mode=drive sync=none target=target0

(NewImageMode 'drive' means target is looked up as a device id, sync mode 'none' means don't copy any data except copy-on-write the point in time snapshot data)

4. (QMP) nbd-server-add device=target0

When image fleecing done:

If you want to test image inspection, an easy way is:

  export LIBGUESTFS_BACKEND=direct
  virt-inspector -a nbd://localhost:port [-v]

or:

  virt-inspector -a 'nbd://?socket=/sockpath' [-v]

Use -v for extra debug. Note this requires libguestfs >= 1.22 (that usually means Fedora >= 19, RHEL >= 7, Debian >= unstable).
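For scripting the inspection test, the two invocations could be assembled along these lines. This is a small sketch rather than a libguestfs interface: the helper names (nbd_uri, inspector_command, inspector_environment) are invented here, and the port in the usage note is only an example value.

```python
import os


def nbd_uri(host=None, port=None, socket_path=None):
    """Build an NBD URI for a TCP export or for a Unix socket."""
    if socket_path is not None:
        return f"nbd://?socket={socket_path}"
    return f"nbd://{host}:{port}"


def inspector_command(uri, verbose=False):
    """Return the virt-inspector argument list for the given URI."""
    argv = ["virt-inspector", "-a", uri]
    if verbose:
        argv.append("-v")  # extra debug output
    return argv


def inspector_environment(base=None):
    """Copy the environment and select the direct libguestfs backend."""
    env = dict(os.environ if base is None else base)
    env["LIBGUESTFS_BACKEND"] = "direct"
    return env
```

With these, subprocess.run(inspector_command(nbd_uri(host="localhost", port=10809)), env=inspector_environment(), check=True) would run the TCP variant, assuming virt-inspector is installed and the export is reachable on that port.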

...

1. (QMP) block-job-complete device=ide0-hd0

2. (HMP) drive_del target0

3. rm BACKUP.qcow2

Note: HMP drive_add/drive_del has no counterpart in QMP now but a new command blockdev-add to do similar things is WIP, which can be an alternative in QMP flavor.

Any comments are welcome!

Do you have a qemu git repo with a working version of all of this (i.e. including the non-upstream-yet bits)?

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages.
http://libguestfs.org

Eric Blake

24 Jul

12:40 p.m.

[replying with useful information from another off-list email]

On 07/15/2013 03:04 PM, Richard W.M. Jones wrote:

...

On Mon, Jul 15, 2013 at 05:57:12PM +0800, Fam Zheng wrote:

...
Hi all,

QEMU-KVM BZ 955734, and libvirt BZ 905125 are about feature "Read-only point-in-time throwaway snapshot". The development is ongoing on upstream, which implements the core functionality by QMP command drive-backup. I want to demonstrate the HMP/QMP commands here for image fleecing tasks (again) and make sure this interface looks ready and satisfying from Libvirt point of view.

On 07/15/2013 06:24 AM, Paolo Bonzini wrote:

On 15/07/2013 11:57, Fam Zheng wrote:

...

And since we are at it, here is a possible libvirt API to expose this functionality (cut-and-paste from an old email). If needed, VDSM can provide a similar API and proxy the libvirt API.

Would something like this work?

int virDomainBlockPeekStart (virDomainPtr dom, const char ** disks, unsigned int flags);

Make it possible to use virDomainBlockPeek on the given disks with the new VIR_DOMAIN_BLOCK_PEEK_IMAGE flag.

It is okay to create multiple "snapshot groups", i.e. to invoke the function multiple times with VIR_DOMAIN_BLOCK_PEEK_SNAPSHOT. It is however not okay to specify the same disk multiple times unless all of them are _without_ VIR_DOMAIN_BLOCK_PEEK_SNAPSHOT.

flags: VIR_DOMAIN_BLOCK_PEEK_SNAPSHOT Make an atomic point-in-time snapshot of all the disks included in the list of strings "disks", and expose the snapshot via virDomainBlockPeek

Note: if the virtual machine is running, this will use nbd-server-start/add/end. If the virtual machine is paused, this will use qemu-nbd. Libvirt should be able to switch transparently from one method to the other.

int virDomainBlockPeekStop (virDomainPtr dom);

Stop communication with qemu-nbd or the hypervisor.

VIR_DOMAIN_BLOCK_PEEK_IMAGE

A new flag for virDomainBlockPeek. If specified, virDomainBlockPeek will access the disk image, not the "raw" file (i.e. it will read data as seen by the guest). This is only valid if virDomainBlockPeekStart has been called before for this disk.

Because libvirt would use a local (Unix) socket to communicate with QEMU and pass the file descriptor, there is no need to authenticate the NBD connection. There is no need for ticketing, though if necessary we can make QEMU only accept connections from libvirtd's pid. libvirt and VDSM already do authentication and/or encryption.
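The "snapshot group" rule in the proposal can be modelled in a few lines. This is a toy model and not libvirt code: the class name PeekSessions, its methods, and the numeric flag value are hypothetical, and treating a disk repeated within a single call the same way as a disk repeated across calls is an interpretation the proposal does not spell out.

```python
VIR_DOMAIN_BLOCK_PEEK_SNAPSHOT = 1 << 0  # hypothetical value for the proposed flag


class PeekSessions:
    """Toy model of the proposed virDomainBlockPeekStart/Stop bookkeeping."""

    def __init__(self):
        # disk name -> True if the disk was started with the SNAPSHOT flag
        self._disks = {}

    def start(self, disks, flags=0):
        """Model virDomainBlockPeekStart: register disks, enforcing the rule."""
        snapshot = bool(flags & VIR_DOMAIN_BLOCK_PEEK_SNAPSHOT)
        pending = dict(self._disks)
        for disk in disks:
            # A disk may be named more than once only if no use of it
            # involves the SNAPSHOT flag.
            if disk in pending and (snapshot or pending[disk]):
                raise ValueError(f"disk {disk!r} cannot be started again")
            pending[disk] = snapshot
        self._disks = pending  # commit only if every disk passed the check

    def stop(self):
        """Model virDomainBlockPeekStop: forget every started disk."""
        self._disks.clear()

    def can_peek_image(self, disk):
        """PEEK_IMAGE is only valid after a start call for this disk."""
        return disk in self._disks
```

Calling start twice with different disk lists and the SNAPSHOT flag yields two independent groups, while repeating a snapshotted disk is rejected before any state changes.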

Paolo

-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Wenchao Xia

2:38 p.m.

On 2013-7-24 11:40, Eric Blake wrote:

...


How do I get the IP/port information needed to access that snapshot? By calling virSnapshotGetInfo (or a similar API) later?


-- Best Regards Wenchao Xia

Paolo Bonzini

6:44 p.m.

On 24/07/2013 07:38, Wenchao Xia wrote:

...

...
...
Because libvirt would use a local (Unix) socket to communicate with QEMU and pass the file descriptor, there is no need to authenticate the NBD connection. There is no need for ticketing, though if necessary we can make QEMU only accept connections from libvirtd's pid. libvirt and VDSM already do authentication and/or encryption.

How do I get the info about IP/port needed to access that snapshot? call virSnapshotGetInfo(or similar API) later?

See above. You don't, you use the libvirt channel.

Paolo

Wenchao Xia

10:11 p.m.

On 2013-7-24 17:44, Paolo Bonzini wrote:

...


So I will get a libvirt API like virSnapshotRead(Domain *domain, SnapshotPtr *sn, uint64 sector_num, uint64 sector_len, char *buf)? Then libvirt automatically accesses the snapshot and fills the buffer for the user? My concern is having an API that lets me access the data in some way.

-- Best Regards Wenchao Xia

Wenchao Xia

29 Jul

5:44 p.m.

On 2013-7-24 21:11, Wenchao Xia wrote:

...


I found the API doc online:

int virDomainBlockPeek (virDomainPtr dom, const char * disk, unsigned long long offset, size_t size, void * buffer, unsigned int flags)

I guess the problem is extending it to work with snapshots.

-- Best Regards Wenchao Xia
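The sector-based call imagined earlier and the byte-based virDomainBlockPeek signature differ only in units, so the mapping can be sketched as below. The 512-byte sector size is an assumption, both helper names are hypothetical, and max_chunk is a caller-chosen bound for one request, not a libvirt constant.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def sectors_to_peek_args(sector_num, sector_len, sector_size=SECTOR_SIZE):
    """Convert a (sector_num, sector_len) request to (offset, size) in bytes."""
    if sector_num < 0 or sector_len < 0:
        raise ValueError("sector values must be non-negative")
    return sector_num * sector_size, sector_len * sector_size


def split_reads(offset, size, max_chunk):
    """Yield (offset, size) pieces no larger than max_chunk bytes each."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    end = offset + size
    while offset < end:
        piece = min(max_chunk, end - offset)
        yield offset, piece
        offset += piece
```

Each yielded pair corresponds to the offset and size arguments of one virDomainBlockPeek call on the disk in question.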

Daniel P. Berrange

24 Jul

6:51 p.m.

On Tue, Jul 23, 2013 at 09:40:56PM -0600, Eric Blake wrote:

...


I don't much like this retro-fitting of start/stop actions into the virDomainBlockPeek API as a design, particularly the binding of the PEEK_IMAGE flag to the start/stop actions. Conceptually it would be perfectly possible for a hypervisor to implement support for PEEK_IMAGE without these start/stop actions, which are somewhat specific to the need for QEMU to start an NBD driver.

The virDomainBlockPeek API is also not particularly efficient as an API, because each read already incurs a round-trip over libvirt's RPC service. We'd then be adding a round-trip over NBD too.

I'm wondering if we could instead try to utilize the virStreamPtr APIs for this task. From libvirt's RPC POV this is much more efficient because once you open the region with a stream API, you don't have any round trips at all - the data is pushed out to/from the client async.

Now those APIs are currently designed for sequential streaming of entire data regions only, but I wonder if we could extend them somehow to enable seek'ing within the stream. Alternatively perhaps we could just say if you want to read from disjoint regions, that you can just re-open a stream for each region to be processed.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
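If a stream were re-opened per region, a client would probably want to merge its read requests first so that it opens as few streams as possible. A minimal sketch of that client-side step follows; coalesce_regions is a hypothetical helper and no libvirt call is involved.

```python
def coalesce_regions(requests):
    """Merge (offset, length) requests into sorted, non-overlapping regions.

    Adjacent or overlapping requests collapse into one region, so one
    stream per returned region covers every requested byte.
    """
    merged = []  # list of [start, end) pairs, kept sorted by start
    for offset, length in sorted(r for r in requests if r[1] > 0):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Touches or overlaps the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]
```

For example, requests at byte offsets 0, 512 and 4096 with lengths 512, 512 and 1024 collapse into two regions, so two streams would be opened instead of three.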

Richard W.M. Jones

7:12 p.m.

On Wed, Jul 24, 2013 at 10:51:35AM +0100, Daniel P. Berrange wrote:

...

I'm wondering if we could instead try to utilize the virStreamPtr APIs for this task. From a libvirt's RPC POV this much more efficient because once you open the region with a stream API, you don't have any round trips at all - the data is pushed out to/from the client async.

Now those APIs are currently designed for sequential streaming of entire data regions only, but I wonder if we could extend them somehow to enable seek'ing within the stream. Alternatively perhaps we could just say if you want to read from dis-joint regions, that you can just re-open a stream for each region to be processed.

It'd be so much easier from a client point of view if you just exposed the NBD Unix socket directly. libvirt already exposes qemu sockets directly (e.g. console, virtio-serial sockets). It should forward those sockets from the remote side transparently too.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Daniel P. Berrange

7:23 p.m.

On Wed, Jul 24, 2013 at 11:12:01AM +0100, Richard W.M. Jones wrote:

...


That's a possibility, though it pretty much rules out implementing this functionality for other hypervisors, unless they add a dep on qemu-nbd or another NBD server. E.g. we could potentially design an API that works fine with VMware ESX, but if we expose NBD as the "api", then our VMware driver is doomed, since I don't expect VMware to ever implement NBD.

A question around direct exposure of NBD would be that of authentication and data security of the NBD server. QEMU's NBD server has no auth and if you're accessing it over TCP from a remote host you have no data encryption either. A libvirt stream API could address both of those issues.

That all said I'm not necessarily against just exposing NBD directly. For local use, we could also have an FD passing based API where the libvirt API call returns a pre-opened FD connected to the NBD server, and simply warn people that exposing NBD over TCP requires a trusted network.

Another option would be to actually tunnel the NBD protocol over the pre-authenticated & secured libvirt RPC protocol. This isn't much different to using the virStreamPtr APIs, but perhaps would be easier than trying to add random-access to the stream APIs & be easier for an existing NBD client to integrate with.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
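The FD-passing idea for local use relies on the ordinary Unix-socket mechanism for handing an open descriptor to another process. The sketch below shows only that mechanism, using Python 3.9+ socket.send_fds/recv_fds. The socket pairs are stand-ins for the libvirt control channel and for an NBD connection, and send_fd, receive_fd and demo are names invented for the example, not libvirt or QEMU interfaces.

```python
import socket


def send_fd(channel, fd):
    """Send one open file descriptor over a Unix-domain socket."""
    # One byte of ordinary data has to accompany the ancillary data.
    socket.send_fds(channel, [b"F"], [fd])


def receive_fd(channel):
    """Receive one file descriptor sent with send_fd."""
    _data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def demo():
    """Pass a pre-opened connection to a client, then use it there."""
    control_client, control_server = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for a connection to the NBD server (an assumption).
    data_client, data_server = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM)
    with control_client, control_server, data_server:
        send_fd(control_server, data_client.fileno())
        data_client.close()  # the sender's own copy is no longer needed
        received = socket.socket(fileno=receive_fd(control_client))
        with received:
            data_server.sendall(b"block data")
            return received.recv(64)


if __name__ == "__main__":
    print(demo())
```

The client never learns an address or port; it only receives a descriptor that is already connected, which is why no separate NBD authentication step appears in the sketch.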

Eric Blake

1:22 p.m.

On 07/15/2013 03:04 PM, Richard W.M. Jones wrote:

...

On Mon, Jul 15, 2013 at 05:57:12PM +0800, Fam Zheng wrote:

...
Hi all,

QEMU-KVM BZ 955734, and libvirt BZ 905125 are about feature "Read-only point-in-time throwaway snapshot". The development is ongoing on upstream, which implements the core functionality by QMP command drive-backup. I want to demonstrate the HMP/QMP commands here for image fleecing tasks (again) and make sure this interface looks ready and satisfying from Libvirt point of view.

I'm wondering if we can still get something committed in time for the freeze for 1.1.1. At this point, we're close enough to the freeze, and with no patches submitted in libvirt and the qemu design still under discussion, that I'm worried about whether we are rushing things too much to take a new interface this late in a libvirt release cycle, or whether we should wait until after 1.1.1 before attempting to add things.

On the other hand, if we can agree on a sane design now (or at least before rc2, if we miss rc1), then we can commit to that design for this libvirt release, and downstream distros can use libvirt 1.1.1 as a starting point for rebases without worrying about so-name compatibility, by signing up to the efforts of backporting actual implementation from future upstream qemu and libvirt releases.

We've done the approach of an early commit to a new API in the past, even if I'm not necessarily the biggest fan of the approach. For example, we chose to add virDomainBlockRebase to libvirt 0.9.10 (commit 9f902a2, when qemu 1.0 was current) as a way to expose more functionality than what virDomainBlockPull supported, even though we didn't actually implement new functionality until libvirt 1.0.0 and qemu 1.3 (commit c1eb380). The libvirt API design was sound enough that I was able to drive the eventual qemu implementation without any problems, and where the implementation could be backported without so-name bump all the way to 0.9.10.

I do want to emphasize that both image fleecing and point-in-time snapshots are features that people want. At the same time, today's qemu.git does not yet have all the patches in place, and we are past soft freeze for qemu 1.6, so there may be a bit of a debate on the qemu list on what aspects of the proposed patches to take, or even a decision that it is too controversial and will wait until qemu 1.7 before being in upstream qemu.

Historically, we are reluctant to add implementations to upstream libvirt until the corresponding qemu feature is fully-baked upstream; and leave it to distro backporters to decide if the feature is important enough to backport onto whatever earlier version they base their distro on. At the same time, distro backporters have more flexibility with pulling changes that do not require a so-name bump, and I'm fairly confident that we need a new libvirt API to drive the features, so if we want to support a distro using libvirt 1.1.1, then we need to settle on the libvirt API now even if it remains unimplemented for another libvirt release.

Also, in the past, I have posted proposed API for virDomainBlockCopy() [1], but left it unimplemented in upstream libvirt in case future qemu came up with more options that would need tweaking. At this point in time, now that qemu is talking both about adding point-in-time snapshots (block-backup) and image fleecing, I think the time is right to commit to an API for virDomainBlockCopy().

[1] https://www.redhat.com/archives/libvir-list/2012-April/msg00632.html

...

...
We get cheap point-in-time snapshot, and export it through built in NBD server, by commands described below:

1. qemu-img create -f qcow2 -o backing_file=RUNNING-VM.img BACKUP.qcow2

(although the backing_file option is not honoured in the next step because we *override* backing file with an existing BlockDriverState, giving it here does no harm and also makes sure the created image is of right size.)

Use of qemu-img while the file is also owned by a running qemu is dangerous. We'd need the equivalent of this command to be supported from within qemu, or else create the destination without naming a backing file and follow up with something like qemu-img rebase -u to plug in the metadata of what the eventual backing file name will be, all without ever opening the backing file externally. But that's a low-level implementation detail, and shouldn't affect the design of a libvirt API.
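The second alternative (create without a backing file, then rebase in unsafe mode) might look like the following pair of command lines. It is a sketch under assumptions: create_overlay_commands is a hypothetical helper, and the virtual size must be supplied by the caller from some other source (for example the domain's disk information), since the point is not to open the running image.

```python
def create_overlay_commands(overlay, backing, virtual_size, backing_format=None):
    """Return two qemu-img argument lists that never open the backing file.

    virtual_size is the disk size in bytes, known to the caller in advance.
    """
    # Create a standalone qcow2 image of the right size, no backing file.
    create = ["qemu-img", "create", "-f", "qcow2", overlay, str(virtual_size)]
    # -u (unsafe mode) only rewrites the backing file name in the header.
    rebase = ["qemu-img", "rebase", "-u", "-b", backing]
    if backing_format:
        rebase += ["-F", backing_format]
    rebase.append(overlay)
    return [create, rebase]


if __name__ == "__main__":
    for argv in create_overlay_commands("BACKUP.qcow2", "RUNNING-VM.img",
                                        10 * 1024 ** 3):
        print(" ".join(argv))
```

Both commands touch only BACKUP.qcow2, so they avoid the concurrent-access problem described above, at the cost of having to know the size up front.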

...

...
2. (HMP) drive_add backing=ide0-hd0,file=BACKUP.qcow2,id=target0,if=none

(where ide0-hd0 is the running BlockDriverState name for RUNNING-VM.img)

Whether this is done with HMP, or a QMP command gets added in time, is also a low-level detail.

...

...
3. (QMP) drive-backup device=ide0-hd0 mode=drive sync=none target=target0

(NewImageMode 'drive' means target is looked up as a device id, sync mode 'none' means don't copy any data except copy-on-write the point in time snapshot data)

4. (QMP) nbd-server-add device=target0

When image fleecing done:

1. (QMP) block-job-complete device=ide0-hd0

2. (HMP) drive_del target0

3. rm BACKUP.qcow2

Note: HMP drive_add/drive_del has no counterpart in QMP now but a new command blockdev-add to do similar things is WIP, which can be an alternative in QMP flavor.

The earlier design I mentioned for virDomainBlockCopy in 2012 would work on only one disk at a time; a user could start multiple block jobs, but would have to coordinate them by hand. Paolo's reply to this thread suggested an interface that took a list of block devices, rather than one, and guarantees that the point in time semantic applies to all the devices at once.

Unfortunately, the current libvirt block job semantics are tied to a single disk (virDomainBlockStats, virDomainBlockJobAbort), so if we want to manage multiple disks at a common point in time, it sounds more like we'd want to treat this as a generic domain job id rather than a libvirt block job (virDomainGetJobStats, virDomainAbortJob). On the other hand, virDomainAbortJob is hard-wired to a single background job at a time; but with image fleecing, we definitely want to support multiple clients fleecing from different points in time simultaneously, which would imply having a job id. Therefore, I'm worried that properly supporting this will involve the addition of multiple APIs; adding just a super-power virDomainBlockCopy() does not give us as much control as what I think we want.

It's late for me, and I know DV wants to cut rc1, but I hope this sparks some conversations, and that we can decide on whether we need to pursue the idea of supporting API for image fleecing as part of libvirt 1.1.1, or whether we punt and state that there is just too much design work still in a state of flux.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Daniel P. Berrange

6:40 p.m.

On Tue, Jul 23, 2013 at 10:22:23PM -0600, Eric Blake wrote:

...

On 07/15/2013 03:04 PM, Richard W.M. Jones wrote:

...
On Mon, Jul 15, 2013 at 05:57:12PM +0800, Fam Zheng wrote:

...
Hi all,

QEMU-KVM BZ 955734, and libvirt BZ 905125 are about feature "Read-only point-in-time throwaway snapshot". The development is ongoing on upstream, which implements the core functionality by QMP command drive-backup. I want to demonstrate the HMP/QMP commands here for image fleecing tasks (again) and make sure this interface looks ready and satisfying from Libvirt point of view.

I'm wondering if we can still get something committed in time for the freeze for 1.1.1. At this point, we're close enough to the freeze, and with no patches submitted in libvirt and the qemu design still under discussion, that I'm worried about whether we are rushing things too much to take a new interface this late in a libvirt release cycle, or whether we should wait until after 1.1.1 before attempting to add things. On the other hand, if we can agree on a sane design now (or at least before rc2, if we miss rc1), then we can commit to that design for this libvirt release, and downstream distros can use libvirt 1.1.1 as a starting point for rebases without worrying about so-name compatibility, by signing up to the efforts of backporting actual implementation from future upstream qemu and libvirt releases.

IMHO we've missed the boat for adding new APIs in this release, particularly given that there isn't even a finalized design & implementation ready to review.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
