On Wed, Jul 13, 2011 at 03:46:30PM -0500, Adam Litke wrote:
Unfortunately, after committing the blockPull API to libvirt, the qemu
community decided to change the API somewhat and we needed to revert the
libvirt implementation. Now that the qemu API is settling out again, I would
like to propose an updated libvirt public API which has required only a minor
set of changes to sync back up to the qemu API.
Summary of changes:
 - Qemu dropped incremental streaming, so remove the incremental libvirt
   BlockPull() API
 - Rename virDomainBlockPullAll() to virDomainBlockPull()
 - Changes required to qemu monitor handlers for changed command names
Okay.
Currently, qemu block device streaming completely flattens a disk image's
backing file chain. Consider the chain A->B->C, where C is a leaf image
backed by B, and B is in turn backed by A. The current disk streaming
command will produce an independent image C with no backing file. Future
versions of qemu block streaming may support an option to specify a new base
image from the current chain. For example: stream --backing_file B C would
pull all blocks that are only in A to produce the chain B->C (thus
eliminating the dependency on A while maintaining B as a backing image).
Do we want to create room in the BlockPull API to support this advanced usage
in the future? If so, then a new parameter must be added to BlockPull: const
char *backing_path. Even if we extend the API in this manner, the initial
implementation will not support it because qemu will not support it
immediately, and libvirt is missing core functionality to support it (no
public enumeration of the disk backing file chain). Thoughts?
My own preference would be to defer this, and provide a separate API
when the QEmu implementation arrives. The problem is that a single
backing store file might not be sufficient to fully support whatever
QEmu eventually adds, and, as you pointed out, we aren't really ready
for this on the libvirt side either.
On the other hand, I suspect we are missing a mechanism to control the
rate of the transfer in the new API, something which could have been
implemented on top of the old incremental mechanism, but not anymore.
So I wonder if virDomainBlockPull() shouldn't take an
"unsigned long bandwidth" argument (to stay coherent with the migration
APIs) before the flags. I don't know if the support is ready in QEmu,
but I suspect this will be an important point, so if it isn't present
it will be added :-)
--
To help speed the provisioning process for large domains, new QED disks are
created with backing to a template image. These disks are configured with
copy on read such that blocks that are read from the backing file are copied
to the new disk. This reduces I/O over a potentially costly path to the
backing image.
In such a configuration, there is a desire to remove the dependency on the
backing image as the domain runs. To accomplish this, qemu will provide an
interface to perform sequential copy on read operations during normal VM
operation. Once all data has been copied, the disk image's link to the
backing file is removed.
The virDomainBlockPull API family brings this functionality to libvirt.
virDomainBlockPull() instructs the hypervisor to stream the entire device in
the background. Progress of this operation can be checked with
virDomainGetBlockPullInfo(). An ongoing stream can be cancelled with
virDomainBlockPullAbort().
An event (VIR_DOMAIN_EVENT_ID_BLOCK_PULL) will be emitted when a disk has been
fully populated or if a BlockPull() operation was terminated due to an error.
This event is useful to avoid polling on virDomainBlockPullInfo() for
completion and could also be used by the security driver to revoke access to
the backing file when it is no longer needed.
/*
 * BlockPull API
 */

/* An iterator for initiating and monitoring block pull operations */
typedef unsigned long long virDomainBlockPullCursor;

typedef struct _virDomainBlockPullInfo virDomainBlockPullInfo;
struct _virDomainBlockPullInfo {
    /*
     * The following fields provide an indication of block pull progress. @cur
     * indicates the current position and will be between 0 and @end. @end is
     * the final cursor position for this operation and represents completion.
     * To approximate progress, divide @cur by @end.
     */
    virDomainBlockPullCursor cur;
    virDomainBlockPullCursor end;
};
typedef virDomainBlockPullInfo *virDomainBlockPullInfoPtr;
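As an aside, the "divide @cur by @end" guidance is easy to get subtly wrong
(zero @end, integer truncation), so here is a minimal sketch of how a client
might compute a progress percentage. The structure is reproduced locally so
the snippet stands alone; block_pull_percent() is a hypothetical helper name,
not part of the proposal.

```c
/* Local copy of the proposed structure, so this sketch is self-contained. */
typedef unsigned long long virDomainBlockPullCursor;

typedef struct _virDomainBlockPullInfo {
    virDomainBlockPullCursor cur;   /* current position, between 0 and @end */
    virDomainBlockPullCursor end;   /* cursor position representing completion */
} virDomainBlockPullInfo;

/* Approximate progress as a percentage by dividing @cur by @end,
 * guarding against a zero @end (nothing left to pull). */
static int
block_pull_percent(const virDomainBlockPullInfo *info)
{
    if (info->end == 0)
        return 100;
    return (int)(info->cur * 100 / info->end);
}
```

Doing the multiplication before the division keeps the computation in integer
arithmetic without losing the percentage to truncation.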
/**
* virDomainBlockPull:
* @dom: pointer to domain object
* @path: Fully-qualified filename of disk
* @flags: currently unused, for future extension
*
* Populate a disk image with data from its backing image. Once all data from
* its backing image has been pulled, the disk no longer depends on a backing
* image. This function pulls data for the entire device in the background.
* Progress of the operation can be checked with virDomainGetBlockPullInfo() and
* the operation can be aborted with virDomainBlockPullAbort(). When finished,
* an asynchronous event is raised to indicate the final status.
*
* Returns 0 if the operation has started, -1 on failure.
*/
int virDomainBlockPull(virDomainPtr dom,
                       const char *path,

So I would just add an "unsigned long bandwidth" parameter here,

                       unsigned int flags);
and document it as:

 * @bandwidth: (optional) specify copy bandwidth limit in Mbps
 *
 * The maximum bandwidth (in Mbps) that will be used to do the copy
 * can be specified with the bandwidth parameter. If set to 0,
 * libvirt will choose a suitable default. Some hypervisors do
 * not support this feature and will return an error if bandwidth
 * is not 0.
/**
 * virDomainBlockPullAbort:
 * @dom: pointer to domain object
 * @path: fully-qualified filename of disk
 * @flags: currently unused, for future extension
 *
 * Cancel a pull operation previously started by virDomainBlockPull().
 *
 * Returns -1 in case of failure, 0 when successful.
 */
int virDomainBlockPullAbort(virDomainPtr dom,
                            const char *path,
                            unsigned int flags);
/**
* virDomainGetBlockPullInfo:
* @dom: pointer to domain object
* @path: fully-qualified filename of disk
* @info: pointer to a virDomainBlockPullInfo structure
* @flags: currently unused, for future extension
*
* Request progress information on a block pull operation that has been started
* with virDomainBlockPull(). If an operation is active for the given
* parameters, @info will be updated with the current progress.
*
* Returns -1 in case of failure, 0 when successful.
*/
int virDomainGetBlockPullInfo(virDomainPtr dom,
                              const char *path,
                              virDomainBlockPullInfoPtr info,
                              unsigned int flags);
I also wonder about the PullInfo: is there any way we could extend it
to suggest the remaining time? An approximation might be computable by
the management code, but since libvirt will know more about bandwidth
than the upper layer, maybe it's in a better position to do that
evaluation (assuming QEmu doesn't give information about this).
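To illustrate what such an estimate could look like, here is a hedged sketch.
It assumes the cursor counts bytes (the proposal does not define its unit)
and that the copy actually runs at the suggested bandwidth cap in Mbps; the
helper name is hypothetical.

```c
typedef unsigned long long virDomainBlockPullCursor;

/* Hypothetical helper: estimate seconds remaining for a pull, given the
 * progress cursors (assumed to count bytes) and a bandwidth cap in Mbps. */
static double
block_pull_eta_seconds(virDomainBlockPullCursor cur,
                       virDomainBlockPullCursor end,
                       unsigned long bandwidth_mbps)
{
    if (bandwidth_mbps == 0 || cur >= end)
        return 0.0;                          /* unknown rate, or already done */
    double bytes_left = (double)(end - cur);
    /* 1 Mbps = 1,000,000 bits/s = 125,000 bytes/s */
    double bytes_per_second = (double)bandwidth_mbps * 1000000.0 / 8.0;
    return bytes_left / bytes_per_second;
}
```

For example, 500 MB remaining at a 100 Mbps cap works out to 40 seconds. A
cap is an upper bound on the rate, of course, so this is a lower bound on the
remaining time.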
/**
* virConnectDomainEventBlockPullStatus:
*
* The final status of a virDomainBlockPull() operation
*/
typedef enum {
    VIR_DOMAIN_BLOCK_PULL_COMPLETED = 0,
    VIR_DOMAIN_BLOCK_PULL_FAILED = 1,
} virConnectDomainEventBlockPullStatus;
/**
* virConnectDomainEventBlockPullCallback:
* @conn: connection object
* @dom: domain on which the event occurred
* @path: fully-qualified filename of the affected disk
* @status: final status of the operation (virConnectDomainEventBlockPullStatus)
*
* The callback signature to use when registering for an event of type
* VIR_DOMAIN_EVENT_ID_BLOCK_PULL with virConnectDomainEventRegisterAny()
*/
typedef void (*virConnectDomainEventBlockPullCallback)(virConnectPtr conn,
                                                       virDomainPtr dom,
                                                       const char *path,
                                                       int status,
                                                       void *opaque);
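For what it's worth, here is a sketch of a handler matching that signature.
The handle types are declared as opaque stand-ins so the snippet compiles in
isolation; in real use the handler would instead be registered against the
libvirt connection via virConnectDomainEventRegisterAny(), and might revoke
security-driver access to the backing file on completion as described above.

```c
/* Opaque stand-ins for the libvirt handle types, so the proposed
 * callback signature can be exercised without libvirt headers. */
typedef struct _virConnect virConnect, *virConnectPtr;
typedef struct _virDomain virDomain, *virDomainPtr;

typedef enum {
    VIR_DOMAIN_BLOCK_PULL_COMPLETED = 0,
    VIR_DOMAIN_BLOCK_PULL_FAILED = 1,
} virConnectDomainEventBlockPullStatus;

typedef void (*virConnectDomainEventBlockPullCallback)(virConnectPtr conn,
                                                       virDomainPtr dom,
                                                       const char *path,
                                                       int status,
                                                       void *opaque);

/* Example handler: records the final status through @opaque, where a real
 * client might also drop access to the now-unneeded backing file. */
static void
pull_finished(virConnectPtr conn, virDomainPtr dom,
              const char *path, int status, void *opaque)
{
    (void)conn; (void)dom; (void)path;
    if (opaque)
        *(int *)opaque = status;
}
```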
Except for the bandwidth addition and the potential extension of the
BlockPullInfo, that looks fine to me.
Daniel
--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel(a)veillard.com | Rpmfind RPM search engine      http://rpmfind.net/
http://veillard.com/ | virtualization library         http://libvirt.org/