[libvirt] [RFC PATCH 0/5] add virDomainBlockCopy API

Patch series is not complete yet, but I'm posting the front end now to get feedback on whether it has a chance of being included in 1.2.8. Still to go: patch 6 will actually implement the new function in qemu driver, by parsing a <disk> XML snippet into a virStorageSource; patch 7 will add monitor support for the qemu 1.4 parameters of granularity and buffer size. Individual commit messages should give more hints on where I'm taking this series. Ultimately, I'm hoping that Peter's work on gluster backends for block commit can be expanded into also covering gluster backends for the copy destination. Eric Blake (5): blockcopy: virDomainBlockCopy with XML destination, typed params blockcopy: split out virsh implementation blockcopy: expose new API in virsh blockcopy: remote implementation for new API blockcopy: tweak how rebase calls into copy include/libvirt/libvirt.h.in | 57 ++++++++++++++-- src/driver.h | 11 +++- src/libvirt.c | 118 ++++++++++++++++++++++++++++++++- src/libvirt_public.syms | 5 ++ src/qemu/qemu_driver.c | 104 ++++++++++++++++++----------- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 ++++- tools/virsh-domain.c | 153 ++++++++++++++++++++++++++++++++++++------- tools/virsh.pod | 35 ++++++---- 9 files changed, 422 insertions(+), 80 deletions(-) -- 1.9.3

This commit (finally) adds the virDomainBlockCopy API, with the intent that it will provide more power to the existing 'virsh blockcopy' command. 'virsh blockcopy' was first added in Apr 2012 (v0.9.12), which corresponds to the upstream qemu 1.2 timeframe. It was done as a hack on top of the existing virDomainBlockRebase() API call, for two reasons: 1) it was targetting a feature that landed first in downstream RHEL qemu, but had not stabilized in upstream qemu at the time (and indeed, 'drive-mirror' only landed upstream in qemu 1.3 with slight differences to the first RHEL attempt, and later gained further parameters like granularity and buf-size that are also worth exposing), and 2) extending an existing API allowed it to be backported without worrying about bumping .so versions. A virDomainBlockCopy() API was proposed at that time [1], but we decided not to accept it into libvirt until after upstream qemu stabilized, and it ended up getting scrapped. Whether or not RHEL should have attempted adding a new feature without getting it upstream first is a debate that can be held another day; but enough time has now elapsed that we are ready to do the interface cleanly. [1] https://www.redhat.com/archives/libvir-list/2012-April/msg00768.html Delaying the creation of a clean API until now has also had a benefit: we've only recently learned of a shortcoming in the original design, namely, that it is unable to target a network destination (such as a gluster volume) because it hard-coded the assumption that the destination is a local file name. Because of all the refactoring we've done to add virStorageSourcePtr, we are in a better position to declare an API that parses XML describing a host storage source as the copy destination, which was not possible had we implemented virDomainBlockCopy as it had been originally envisioned. At least I had the foresight to create 'virsh blockcopy' as a separate command at the UI level (commit 1f06c00) rather than leaking the underlying API overload of virDomainBlockRebase onto shell users. A note on the bandwidth option: virTypedParameters intentionally lacks unsigned long (since variable-width interaction between mixed 32- vs. 64-bit client/server setups is nasty), but we have to deal with the fact that we are interacting with existing older code that mistakenly chose unsigned long bandwidth at a point before we decided to prohibit it in all new API. The typed parameter is therefore unsigned long long, and the implementation will have to do overflow detection on 32-bit platforms. * include/libvirt/libvirt.h.in (virDomainBlockCopy): New API. (virDomainBlockJobType, virConnectDomainEventBlockJobStatus): Update related documentation. * src/libvirt.c (virDomainBlockCopy): Implement it. * src/libvirt_public.syms (LIBVIRT_1.2.8): Export it. * src/driver.h (_virDriver): New driver callback. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 57 +++++++++++++++++++-- src/driver.h | 11 +++- src/libvirt.c | 118 ++++++++++++++++++++++++++++++++++++++++++- src/libvirt_public.syms | 5 ++ 4 files changed, 184 insertions(+), 7 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 47ea695..89c8e63 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2518,16 +2518,16 @@ typedef enum { * flags), job ends on completion */ VIR_DOMAIN_BLOCK_JOB_TYPE_COPY = 2, - /* Block Copy (virDomainBlockRebase with flags), job exists as - * long as mirroring is active */ + /* Block Copy (virDomainBlockCopy, or virDomainBlockRebase with + * flags), job exists as long as mirroring is active */ VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT = 3, /* Block Commit (virDomainBlockCommit without flags), job ends on * completion */ VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT = 4, - /* Active Block Commit (virDomainBlockCommit with flags), job - * exists as long as sync is active */ + /* Active Block Commit (virDomainBlockCommit with flags), job exists + * as long as sync is active */ #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_BLOCK_JOB_TYPE_LAST @@ -2597,6 +2597,53 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, unsigned int flags); /** + * virDomainBlockCopyFlags: + * + * Flags available for virDomainBlockCopy(). + */ +typedef enum { + VIR_DOMAIN_BLOCK_COPY_SHALLOW = 1 << 0, /* Limit copy to top of source + backing chain */ + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT = 1 << 1, /* Reuse existing external + file for a copy */ +} virDomainBlockCopyFlags; + +/** + * VIR_DOMAIN_BLOCK_COPY_BANDWIDTH: + * Macro for the virDomainBlockCopy bandwidth tunable: it represents + * the maximum bandwidth (in MiB/s) used while getting the copy + * operation into the mirrored phase, with a type of ullong (although + * the hypervisor may restrict the set of valid values to a smaller + * range). Some hypervisors may lack support for this parameter, while + * still allowing a subsequent change of bandwidth via + * virDomainBlockJobSetSpeed(). The actual speed can be determined + * with virDomainGetBlockJobInfo(). + */ +#define VIR_DOMAIN_BLOCK_COPY_BANDWIDTH "bandwidth" + +/** + * VIR_DOMAIN_BLOCK_COPY_GRANULARITY: + * Macro for the virDomainBlockCopy granularity tunable: it represents + * the granularity at which the copy operation recognizes dirty pages, + * as an unsigned int, and must be a power of 2. + */ +#define VIR_DOMAIN_BLOCK_COPY_GRANULARITY "granularity" + +/** + * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: + * Macro for the virDomainBlockCopy buffer size tunable: it represents + * how much data can be in flight between source and destination, as + * an unsigned int. + */ +#define VIR_DOMAIN_BLOCK_COPY_BUF_SIZE "buf-size" + +int virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags); + +/** * virDomainBlockCommitFlags: * * Flags available for virDomainBlockCommit(). @@ -4830,7 +4877,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn, * virConnectDomainEventBlockJobStatus: * * Tracks status of a virDomainBlockPull(), virDomainBlockRebase(), - * or virDomainBlockCommit() operation + * virDomainBlockCopy(), or virDomainBlockCommit() operation */ typedef enum { VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0, diff --git a/src/driver.h b/src/driver.h index ba7c1fc..14933a7 100644 --- a/src/driver.h +++ b/src/driver.h @@ -2,7 +2,7 @@ * driver.h: description of the set of interfaces provided by a * entry point to the virtualization engine * - * Copyright (C) 2006-2013 Red Hat, Inc. + * Copyright (C) 2006-2014 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1009,6 +1009,14 @@ typedef int unsigned int flags); typedef int +(*virDrvDomainBlockCopy)(virDomainPtr dom, + const char *path, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags); + +typedef int (*virDrvDomainBlockCommit)(virDomainPtr dom, const char *disk, const char *base, @@ -1382,6 +1390,7 @@ struct _virDriver { virDrvDomainBlockJobSetSpeed domainBlockJobSetSpeed; virDrvDomainBlockPull domainBlockPull; virDrvDomainBlockRebase domainBlockRebase; + virDrvDomainBlockCopy domainBlockCopy; virDrvDomainBlockCommit domainBlockCommit; virDrvConnectSetKeepAlive connectSetKeepAlive; virDrvConnectIsAlive connectIsAlive; diff --git a/src/libvirt.c b/src/libvirt.c index 8349261..99f1dc1 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -19924,7 +19924,11 @@ virDomainBlockPull(virDomainPtr dom, const char *disk, * The actual speed can be determined with virDomainGetBlockJobInfo(). * * When @base is NULL and @flags is 0, this is identical to - * virDomainBlockPull(). + * virDomainBlockPull(). When @flags contains VIR_DOMAIN_BLOCK_REBASE_COPY, + * this command is shorthand for virDomainBlockCopy() where the destination + * has file type, @bandwidth is passed as a typed parameter, and the flags + * control whether the destination format is raw, identical to the source, + * or probed from the reused file. * * Returns 0 if the operation has started, -1 on failure. */ @@ -19975,6 +19979,118 @@ virDomainBlockRebase(virDomainPtr dom, const char *disk, /** + * virDomainBlockCopy: + * @dom: pointer to domain object + * @disk: path to the block device, or device shorthand + * @destxml: XML description of the copy destination + * @params: Pointer to block copy parameter objects, or NULL + * @nparams: Number of block copy parameters (this value can be the same or + * less than the number of parameters supported) + * @flags: bitwise-OR of virDomainBlockCopyFlags + * + * Copy the guest-visible contents of a disk image to a new file described + * by @destxml. The destination XML has a top-level element of <disk>, and + * resembles what is used when hot-plugging a disk via virDomainAttachDevice(), + * except that only sub-elements related to describing the new host resource + * are necessary (sub-elements related to the guest view, such as <target>, + * are ignored). It is strongly recommended to include a <driver type='...'/> + * format designation for the destination, to avoid the potential of any + * security problem that might be caused by probing a file for its format. + * + * This command starts a long-running copy. By default, the copy will pull + * the entire source chain into the destination file, but if @flags also + * contains VIR_DOMAIN_BLOCK_COPY_SHALLOW, then only the top of the source + * chain will be copied (the source and destination have a common backing + * file). The format of the destination file is controlled by the <driver> + * sub-element of the XML. The destination will be created unless the + * VIR_DOMAIN_BLOCK_COPY_REUSE_EXT flag is present stating that the file + * was pre-created with the correct format and metadata and sufficient + * size to hold the copy. In case the VIR_DOMAIN_BLOCK_COPY_SHALLOW flag + * is used the pre-created file has to exhibit the same guest visible contents + * as the backing file of the original image. This allows a management app to + * pre-create files with relative backing file names, rather than the default + * of absolute backing file names. + * + * A copy job has two parts; in the first phase, the @bandwidth parameter + * affects how fast the source is pulled into the destination, and the job + * can only be canceled by reverting to the source file; progress in this + * phase can be tracked via the virDomainBlockJobInfo() command, with a + * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. The job transitions to the + * second phase when the job info states cur == end, and remains alive to + * mirror all further changes to both source and destination. The user + * must call virDomainBlockJobAbort() to end the mirroring while choosing + * whether to revert to source or pivot to the destination. An event is + * issued when the job ends, and depending on the hypervisor, an event may + * also be issued when the job transitions from pulling to mirroring. If + * the job is aborted, a new job will have to start over from the beginning + * of the first phase. + * + * Some hypervisors will restrict certain actions, such as virDomainSave() + * or virDomainDetachDevice(), while a copy job is active; they may + * also restrict a copy job to transient domains. + * + * The @disk parameter is either an unambiguous source name of the + * block device (the <source file='...'/> sub-element, such as + * "/path/to/image"), or the device target shorthand (the + * <target dev='...'/> sub-element, such as "vda"). Valid names + * can be found by calling virDomainGetXMLDesc() and inspecting + * elements within //domain/devices/disk. + * + * The @params and @nparams arguments can be used to set hypervisor-specific + * tuning parameters, such as maximum bandwidth or granularity. + * + * This command is a superset of the older virDomainBlockRebase() when used + * with the VIR_DOMAIN_BLOCK_REBASE_COPY flag, and offers better control + * over the destination format, the ability to copy to a destination that + * is not a local file, and the possibility of additional tuning parameters. + * + * Returns 0 if the operation has started, -1 on failure. + */ +int +virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(dom, + "disk=%s, destxml=%s, params=%p, nparams=%d, flags=%x", + disk, destxml, params, nparams, flags); + VIR_TYPED_PARAMS_DEBUG(params, nparams); + + virResetLastError(); + + virCheckDomainReturn(dom, -1); + conn = dom->conn; + + virCheckReadOnlyGoto(conn->flags, error); + virCheckNonNullArgGoto(disk, error); + virCheckNonNullArgGoto(destxml, error); + if (params) + virCheckPositiveArgGoto(nparams, error); + else + virCheckZeroArgGoto(nparams, error); + + if (conn->driver->domainBlockCopy) { + int ret; + ret = conn->driver->domainBlockCopy(dom, disk, destxml, + params, nparams, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dom->conn); + return -1; +} + + +/** * virDomainBlockCommit: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 9f4016a..c3f3bd0 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -670,4 +670,9 @@ LIBVIRT_1.2.7 { virConnectGetDomainCapabilities; } LIBVIRT_1.2.6; +LIBVIRT_1.2.8 { + global: + virDomainBlockCopy; +} LIBVIRT_1.2.7; + # .... define new API here using predicted next version number .... -- 1.9.3

On 08/24/14 05:32, Eric Blake wrote:
This commit (finally) adds the virDomainBlockCopy API, with the intent that it will provide more power to the existing 'virsh blockcopy' command.
'virsh blockcopy' was first added in Apr 2012 (v0.9.12), which corresponds to the upstream qemu 1.2 timeframe. It was done as a hack on top of the existing virDomainBlockRebase() API call, for two reasons: 1) it was targetting a feature that landed first in downstream RHEL qemu, but had not stabilized in upstream qemu at the time (and indeed, 'drive-mirror' only landed upstream in qemu 1.3 with slight differences to the first RHEL attempt, and later gained further parameters like granularity and buf-size that are also worth exposing), and 2) extending an existing API allowed it to be backported without worrying about bumping .so versions. A virDomainBlockCopy() API was proposed at that time [1], but we decided not to accept it into libvirt until after upstream qemu stabilized, and it ended up getting scrapped. Whether or not RHEL should have attempted adding a new feature without getting it upstream first is a debate that can be held another day; but enough time has now elapsed that we are ready to do the interface cleanly.
[1] https://www.redhat.com/archives/libvir-list/2012-April/msg00768.html
Delaying the creation of a clean API until now has also had a benefit: we've only recently learned of a shortcoming in the original design, namely, that it is unable to target a network destination (such as a gluster volume) because it hard-coded the assumption that the destination is a local file name. Because of all the refactoring we've done to add virStorageSourcePtr, we are in a better position to declare an API that parses XML describing a host storage source as the copy destination, which was not possible had we implemented virDomainBlockCopy as it had been originally envisioned.
At least I had the foresight to create 'virsh blockcopy' as a separate command at the UI level (commit 1f06c00) rather than leaking the underlying API overload of virDomainBlockRebase onto shell users.
A note on the bandwidth option: virTypedParameters intentionally lacks unsigned long (since variable-width interaction between mixed 32- vs. 64-bit client/server setups is nasty), but we have to deal with the fact that we are interacting with existing older code that mistakenly chose unsigned long bandwidth at a point before we decided to prohibit it in all new API. The typed parameter is therefore unsigned long long, and the implementation will have to do overflow detection on 32-bit platforms.
* include/libvirt/libvirt.h.in (virDomainBlockCopy): New API. (virDomainBlockJobType, virConnectDomainEventBlockJobStatus): Update related documentation. * src/libvirt.c (virDomainBlockCopy): Implement it. * src/libvirt_public.syms (LIBVIRT_1.2.8): Export it. * src/driver.h (_virDriver): New driver callback.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 57 +++++++++++++++++++-- src/driver.h | 11 +++- src/libvirt.c | 118 ++++++++++++++++++++++++++++++++++++++++++- src/libvirt_public.syms | 5 ++ 4 files changed, 184 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 47ea695..89c8e63 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2518,16 +2518,16 @@ typedef enum { * flags), job ends on completion */
VIR_DOMAIN_BLOCK_JOB_TYPE_COPY = 2, - /* Block Copy (virDomainBlockRebase with flags), job exists as - * long as mirroring is active */ + /* Block Copy (virDomainBlockCopy, or virDomainBlockRebase with + * flags), job exists as long as mirroring is active */
VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT = 3, /* Block Commit (virDomainBlockCommit without flags), job ends on * completion */
VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT = 4, - /* Active Block Commit (virDomainBlockCommit with flags), job - * exists as long as sync is active */ + /* Active Block Commit (virDomainBlockCommit with flags), job exists + * as long as sync is active */
#ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_BLOCK_JOB_TYPE_LAST @@ -2597,6 +2597,53 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, unsigned int flags);
/** + * virDomainBlockCopyFlags: + * + * Flags available for virDomainBlockCopy(). + */ +typedef enum { + VIR_DOMAIN_BLOCK_COPY_SHALLOW = 1 << 0, /* Limit copy to top of source + backing chain */ + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT = 1 << 1, /* Reuse existing external + file for a copy */ +} virDomainBlockCopyFlags; + +/** + * VIR_DOMAIN_BLOCK_COPY_BANDWIDTH: + * Macro for the virDomainBlockCopy bandwidth tunable: it represents + * the maximum bandwidth (in MiB/s) used while getting the copy + * operation into the mirrored phase, with a type of ullong (although
MiB/s and ullong? thats an awful lot of speed. Shouldn't we go with KiB/s? This might benefit slower networks. (although it may never converge there)
+ * the hypervisor may restrict the set of valid values to a smaller + * range). Some hypervisors may lack support for this parameter, while + * still allowing a subsequent change of bandwidth via + * virDomainBlockJobSetSpeed(). The actual speed can be determined + * with virDomainGetBlockJobInfo(). + */ +#define VIR_DOMAIN_BLOCK_COPY_BANDWIDTH "bandwidth" + +/** + * VIR_DOMAIN_BLOCK_COPY_GRANULARITY: + * Macro for the virDomainBlockCopy granularity tunable: it represents + * the granularity at which the copy operation recognizes dirty pages,
I'd rather say "dirty blocks". Pages might indicate RAM memory pages.
+ * as an unsigned int, and must be a power of 2. + */ +#define VIR_DOMAIN_BLOCK_COPY_GRANULARITY "granularity" + +/** + * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: + * Macro for the virDomainBlockCopy buffer size tunable: it represents + * how much data can be in flight between source and destination, as + * an unsigned int.
In bytes?
+ */ +#define VIR_DOMAIN_BLOCK_COPY_BUF_SIZE "buf-size" + +int virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags); + +/** * virDomainBlockCommitFlags: * * Flags available for virDomainBlockCommit(). @@ -4830,7 +4877,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn, * virConnectDomainEventBlockJobStatus: * * Tracks status of a virDomainBlockPull(), virDomainBlockRebase(), - * or virDomainBlockCommit() operation + * virDomainBlockCopy(), or virDomainBlockCommit() operation */ typedef enum { VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0, diff --git a/src/driver.h b/src/driver.h index ba7c1fc..14933a7 100644 --- a/src/driver.h +++ b/src/driver.h @@ -2,7 +2,7 @@ * driver.h: description of the set of interfaces provided by a * entry point to the virtualization engine * - * Copyright (C) 2006-2013 Red Hat, Inc. + * Copyright (C) 2006-2014 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1009,6 +1009,14 @@ typedef int unsigned int flags);
typedef int +(*virDrvDomainBlockCopy)(virDomainPtr dom, + const char *path, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags); + +typedef int (*virDrvDomainBlockCommit)(virDomainPtr dom, const char *disk, const char *base, @@ -1382,6 +1390,7 @@ struct _virDriver { virDrvDomainBlockJobSetSpeed domainBlockJobSetSpeed; virDrvDomainBlockPull domainBlockPull; virDrvDomainBlockRebase domainBlockRebase; + virDrvDomainBlockCopy domainBlockCopy; virDrvDomainBlockCommit domainBlockCommit; virDrvConnectSetKeepAlive connectSetKeepAlive; virDrvConnectIsAlive connectIsAlive; diff --git a/src/libvirt.c b/src/libvirt.c index 8349261..99f1dc1 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -19924,7 +19924,11 @@ virDomainBlockPull(virDomainPtr dom, const char *disk, * The actual speed can be determined with virDomainGetBlockJobInfo(). * * When @base is NULL and @flags is 0, this is identical to - * virDomainBlockPull(). + * virDomainBlockPull(). When @flags contains VIR_DOMAIN_BLOCK_REBASE_COPY, + * this command is shorthand for virDomainBlockCopy() where the destination + * has file type, @bandwidth is passed as a typed parameter, and the flags + * control whether the destination format is raw, identical to the source, + * or probed from the reused file. * * Returns 0 if the operation has started, -1 on failure. */ @@ -19975,6 +19979,118 @@ virDomainBlockRebase(virDomainPtr dom, const char *disk,
/** + * virDomainBlockCopy: + * @dom: pointer to domain object + * @disk: path to the block device, or device shorthand + * @destxml: XML description of the copy destination + * @params: Pointer to block copy parameter objects, or NULL + * @nparams: Number of block copy parameters (this value can be the same or + * less than the number of parameters supported) + * @flags: bitwise-OR of virDomainBlockCopyFlags + * + * Copy the guest-visible contents of a disk image to a new file described + * by @destxml. The destination XML has a top-level element of <disk>, and + * resembles what is used when hot-plugging a disk via virDomainAttachDevice(), + * except that only sub-elements related to describing the new host resource + * are necessary (sub-elements related to the guest view, such as <target>, + * are ignored). It is strongly recommended to include a <driver type='...'/> + * format designation for the destination, to avoid the potential of any + * security problem that might be caused by probing a file for its format. + * + * This command starts a long-running copy. By default, the copy will pull + * the entire source chain into the destination file, but if @flags also + * contains VIR_DOMAIN_BLOCK_COPY_SHALLOW, then only the top of the source + * chain will be copied (the source and destination have a common backing + * file). The format of the destination file is controlled by the <driver> + * sub-element of the XML. The destination will be created unless the + * VIR_DOMAIN_BLOCK_COPY_REUSE_EXT flag is present stating that the file + * was pre-created with the correct format and metadata and sufficient + * size to hold the copy. In case the VIR_DOMAIN_BLOCK_COPY_SHALLOW flag + * is used the pre-created file has to exhibit the same guest visible contents + * as the backing file of the original image. This allows a management app to + * pre-create files with relative backing file names, rather than the default + * of absolute backing file names. + * + * A copy job has two parts; in the first phase, the @bandwidth parameter
@bandwidth is now provided as a typed param.
+ * affects how fast the source is pulled into the destination, and the job + * can only be canceled by reverting to the source file; progress in this + * phase can be tracked via the virDomainBlockJobInfo() command, with a + * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. The job transitions to the + * second phase when the job info states cur == end, and remains alive to + * mirror all further changes to both source and destination. The user + * must call virDomainBlockJobAbort() to end the mirroring while choosing + * whether to revert to source or pivot to the destination. An event is + * issued when the job ends, and depending on the hypervisor, an event may + * also be issued when the job transitions from pulling to mirroring. If + * the job is aborted, a new job will have to start over from the beginning + * of the first phase. + * + * Some hypervisors will restrict certain actions, such as virDomainSave() + * or virDomainDetachDevice(), while a copy job is active; they may + * also restrict a copy job to transient domains. + * + * The @disk parameter is either an unambiguous source name of the + * block device (the <source file='...'/> sub-element, such as + * "/path/to/image"), or the device target shorthand (the + * <target dev='...'/> sub-element, such as "vda"). Valid names + * can be found by calling virDomainGetXMLDesc() and inspecting + * elements within //domain/devices/disk. + * + * The @params and @nparams arguments can be used to set hypervisor-specific + * tuning parameters, such as maximum bandwidth or granularity. + * + * This command is a superset of the older virDomainBlockRebase() when used + * with the VIR_DOMAIN_BLOCK_REBASE_COPY flag, and offers better control + * over the destination format, the ability to copy to a destination that + * is not a local file, and the possibility of additional tuning parameters. + * + * Returns 0 if the operation has started, -1 on failure. + */ +int +virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags)
Wow, XML, typed params and flags. Now that's future proof! :)
+{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(dom, + "disk=%s, destxml=%s, params=%p, nparams=%d, flags=%x", + disk, destxml, params, nparams, flags); + VIR_TYPED_PARAMS_DEBUG(params, nparams); + + virResetLastError(); + + virCheckDomainReturn(dom, -1); + conn = dom->conn; + + virCheckReadOnlyGoto(conn->flags, error); + virCheckNonNullArgGoto(disk, error); + virCheckNonNullArgGoto(destxml, error); + if (params) + virCheckPositiveArgGoto(nparams, error); + else + virCheckZeroArgGoto(nparams, error); + + if (conn->driver->domainBlockCopy) { + int ret; + ret = conn->driver->domainBlockCopy(dom, disk, destxml, + params, nparams, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dom->conn); + return -1; +} + + +/** * virDomainBlockCommit: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 9f4016a..c3f3bd0 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -670,4 +670,9 @@ LIBVIRT_1.2.7 { virConnectGetDomainCapabilities; } LIBVIRT_1.2.6;
+LIBVIRT_1.2.8 { + global: + virDomainBlockCopy;
One of us will have to rebase. I've recently posted a series that adds API too :)
+} LIBVIRT_1.2.7; + # .... define new API here using predicted next version number ....
Apart from a few DOC problems the API looks fine to me and should be fairly future proof. ACK to the design (once docs are fixed). Peter P.S.: I've run out of time to review the rest of this, but this should be good enough to merge the rest a bit later.

On 08/25/2014 11:20 AM, Peter Krempa wrote:
On 08/24/14 05:32, Eric Blake wrote:
This commit (finally) adds the virDomainBlockCopy API, with the intent that it will provide more power to the existing 'virsh blockcopy' command.
+/** + * VIR_DOMAIN_BLOCK_COPY_BANDWIDTH: + * Macro for the virDomainBlockCopy bandwidth tunable: it represents + * the maximum bandwidth (in MiB/s) used while getting the copy + * operation into the mirrored phase, with a type of ullong (although
MiB/s and ullong? thats an awful lot of speed. Shouldn't we go with KiB/s? This might benefit slower networks. (although it may never converge there)
No. We're forced to use back-compat to the existing design of: virDomainBlockRebase(virDomainPtr dom, const char *disk, const char *base, unsigned long bandwidth, unsigned int flags) virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, unsigned long bandwidth, unsigned int flags) which already document bandwidth as an unsigned long (32-bit maximum is the only portable value). It's just that virTypedParameters intentionally lack support for 'unsigned long' (precisely because it is variable-sized between 32-bit and 64-bit), so my (still-in-progress) patch 6 has an overflow check that errors out if the user passed in a bandwidth using ullong that doesn't fit in the ulong handed to qemu. The other alternative is to declare the parameter as 'uint', and that the new API is unable to represent uses of the old API that exceed INT_MAX, but I really want the new API to be a superset of the old API. I tried to cover that point in the commit message, and also in the comments to the macro stating that the hypervisor can reject out-of-range values. Do I need to make the wording any stronger?
+/** + * VIR_DOMAIN_BLOCK_COPY_GRANULARITY: + * Macro for the virDomainBlockCopy granularity tunable: it represents + * the granularity at which the copy operation recognizes dirty pages,
I'd rather say "dirty blocks". Pages might indicate RAM memory pages.
Sure. I'm just trying to expose the qemu parameters, which are rather sparsely documented as: # @granularity: #optional granularity of the dirty bitmap, default is 64K # if the image format doesn't have clusters, 4K if the clusters # are smaller than that, else the cluster size. Must be a # power of 2 between 512 and 64M (since 1.4). # # @buf-size: #optional maximum amount of data in flight from source to # target (since 1.4).
+ * as an unsigned int, and must be a power of 2. + */ +#define VIR_DOMAIN_BLOCK_COPY_GRANULARITY "granularity" + +/** + * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: + * Macro for the virDomainBlockCopy buffer size tunable: it represents + * how much data can be in flight between source and destination, as + * an unsigned int.
In bytes?
I assume so; again, see the qemu docs that I'm trying to expose.
+ * + * A copy job has two parts; in the first phase, the @bandwidth parameter
@bandwidth is now provided as a typed param.
Too much copy-and-paste - yeah, I'll have to adjust that.
+int +virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *destxml, + virTypedParameterPtr params, + int nparams, + unsigned int flags)
Wow, XML, typed params and flags. Now that's future proof! :)
I sure hope so! Although I'm _already_ slightly worried about image fleecing, which says to expose the state of the disk not as it is currently evolving, but as it existed at a fixed point in time in the past, often in order to copy out that state to backup storage. In that case, fleecing may want to start from a known point whereas this API kind of implies starting from the active image. Hmm, I guess we have the 'vda[1]' notation for picking a known point that is the backing file of vda. Then again, while fleecing is a form of copying data, it might be distinct enough to warrant a separate API anyway.
+LIBVIRT_1.2.8 { + global: + virDomainBlockCopy;
One of us will have to rebase. I've recently posted a series that adds API too :)
Fortunately, the rebase is trivial.
+} LIBVIRT_1.2.7; + # .... define new API here using predicted next version number ....
Apart from a few DOC problems the API looks fine to me and should be fairly future proof.
ACK to the design (once docs are fixed).
Peter
P.S.: I've run out of time to review the rest of this, but this should be good enough to merge the rest a bit later.
Thanks; still, I'll post a v2 with tweaks you mentioned above and with anything else I learn today while implementing the rest, if it looks like DV will give me enough time to still get it into rc0. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

I'm about to extend the capabilities of blockcopy. Hiding a few common lines of implementation gets in the way of the new required logic, and putting the new logic in the common implementation won't benefit any of the other blockjob operations. Therefore, it is simpler to just do the work inline. * tools/virsh-domain.c (blockJobImpl): Drop unused variable. Move block copy guts... (cmdBlockCopy): ...into their lone caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 45 +++++++++++++++++++++++++++------------------ 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 2562326..fb9c009 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1443,8 +1443,7 @@ typedef enum { VSH_CMD_BLOCK_JOB_INFO = 1, VSH_CMD_BLOCK_JOB_SPEED = 2, VSH_CMD_BLOCK_JOB_PULL = 3, - VSH_CMD_BLOCK_JOB_COPY = 4, - VSH_CMD_BLOCK_JOB_COMMIT = 5, + VSH_CMD_BLOCK_JOB_COMMIT = 4, } vshCmdBlockJobMode; static int @@ -1453,14 +1452,14 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, virDomainPtr *pdom) { virDomainPtr dom = NULL; - const char *name, *path; + const char *path; unsigned long bandwidth = 0; int ret = -1; const char *base = NULL; const char *top = NULL; unsigned int flags = 0; - if (!(dom = vshCommandOptDomain(ctl, cmd, &name))) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) goto cleanup; if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) @@ -1513,17 +1512,6 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, flags |= VIR_DOMAIN_BLOCK_COMMIT_RELATIVE; ret = virDomainBlockCommit(dom, path, base, top, bandwidth, flags); break; - case VSH_CMD_BLOCK_JOB_COPY: - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; - if (vshCommandOptBool(cmd, "shallow")) - flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; - if (vshCommandOptBool(cmd, "reuse-external")) - flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; - if (vshCommandOptStringReq(ctl, cmd, "dest", &base) < 0) - goto cleanup; - ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); } cleanup: @@ -1859,6 +1847,9 @@ static bool cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom = NULL; + const char *dest; + unsigned long bandwidth = 0; + unsigned int flags = VIR_DOMAIN_BLOCK_REBASE_COPY; bool ret = false; bool blocking = vshCommandOptBool(cmd, "wait"); bool verbose = vshCommandOptBool(cmd, "verbose"); @@ -1874,6 +1865,9 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) bool quit = false; int abort_flags = 0; + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + blocking |= vshCommandOptBool(cmd, "timeout") || pivot || finish; if (blocking) { if (pivot && finish) { @@ -1882,8 +1876,6 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) } if (vshCommandOptTimeoutToMs(ctl, cmd, &timeout) < 0) return false; - if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) - return false; if (vshCommandOptBool(cmd, "async")) abort_flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; @@ -1902,7 +1894,24 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; } - if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_COPY, &dom) < 0) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { + vshError(ctl, "%s", _("bandwidth must be a number")); + goto cleanup; + } + + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) + goto cleanup; + + if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) goto cleanup; if (!blocking) { -- 1.9.3

Expose the new power of virDomainBlockCopy through virsh. Continue to use the older API where possible, for maximum compatibility. The command now requires either --dest (with optional --format), to directly describe the file destination, or --xml, to name a file that contains an XML description such as: <disk type='network'> <driver type='raw'/> <source protocol='gluster' name='vol1/img'> <host name='red'/> </source> </disk> Non-zero option parameters are converted into virTypedParameters, and if anything requires the new API, the command can synthesize appropriate XML even if the --dest option was used instead of --xml. The existing --raw flag remains for back-compat, but the preferred spelling is now --format=raw, since the new API now allows us to specify all formats rather than just a boolean raw to suppress probing. I hope I did justice in describing the effects of granularity and buf-size on how they get passed through to qemu. * tools/virsh-domain.c (cmdBlockCopy): Add new options --xml, --granularity, --buf-size, --format. Make --raw an alias for --format=raw. Call new API if new parameters are in use. * tools/virsh.pod (blockcopy): Document new options. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++----- tools/virsh.pod | 35 +++++++++------ 2 files changed, 134 insertions(+), 25 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index fb9c009..8a3c7eb 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1793,11 +1793,10 @@ static const vshCmdOptDef opts_block_copy[] = { {.name = "path", .type = VSH_OT_DATA, .flags = VSH_OFLAG_REQ, - .help = N_("fully-qualified path of disk") + .help = N_("fully-qualified path of source disk") }, {.name = "dest", .type = VSH_OT_DATA, - .flags = VSH_OFLAG_REQ, .help = N_("path of the copy to create") }, {.name = "bandwidth", @@ -1813,8 +1812,8 @@ static const vshCmdOptDef opts_block_copy[] = { .help = N_("reuse existing destination") }, {.name = "raw", - .type = VSH_OT_BOOL, - .help = N_("use raw destination file") + .type = VSH_OT_ALIAS, + .help = "format=raw" }, {.name = "wait", .type = VSH_OT_BOOL, @@ -1840,6 +1839,22 @@ static const vshCmdOptDef opts_block_copy[] = { .type = VSH_OT_BOOL, .help = N_("with --wait, don't wait for cancel to finish") }, + {.name = "xml", + .type = VSH_OT_DATA, + .help = N_("filename containing XML description of the copy destination") + }, + {.name = "format", + .type = VSH_OT_DATA, + .help = N_("format of the destination file") + }, + {.name = "granularity", + .type = VSH_OT_INT, + .help = N_("power-of-two granularity to use during the copy") + }, + {.name = "buf-size", + .type = VSH_OT_INT, + .help = N_("maximum amount of in-flight data during the copy") + }, {.name = NULL} }; @@ -1847,9 +1862,12 @@ static bool cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom = NULL; - const char *dest; + const char *dest = NULL; + const char *format = NULL; unsigned long bandwidth = 0; - unsigned int flags = VIR_DOMAIN_BLOCK_REBASE_COPY; + unsigned int granularity = 0; + unsigned int buf_size = 0; + unsigned int flags = 0; bool ret = false; bool blocking = vshCommandOptBool(cmd, "wait"); bool verbose = vshCommandOptBool(cmd, "verbose"); @@ -1864,6 +1882,22 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) const char *path = NULL; bool quit = false; int abort_flags = 0; + const char *xml = NULL; + char *xmlstr = NULL; + virTypedParameterPtr params = NULL; + int nparams = 0; + + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + if (vshCommandOptString(cmd, "dest", &dest) < 0) + goto cleanup; + if (vshCommandOptString(cmd, "xml", &xml) < 0) + return false; + if (vshCommandOptString(cmd, "format", &format) < 0) + return false; + + VSH_EXCLUSIVE_OPTIONS_VAR(dest, xml); + VSH_EXCLUSIVE_OPTIONS_VAR(format, xml); if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) return false; @@ -1901,18 +1935,82 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) vshError(ctl, "%s", _("bandwidth must be a number")); goto cleanup; } + if (vshCommandOptUIntWrap(cmd, "granularity", &granularity) < 0) { + vshError(ctl, "%s", _("granularity must be a number")); + goto cleanup; + } + if (vshCommandOptUIntWrap(cmd, "buf_size", &buf_size) < 0) { + vshError(ctl, "%s", _("buf-size must be a number")); + goto cleanup; + } + if (xml) { + if (virFileReadAll(xml, 8192, &xmlstr) < 0) { + vshReportError(ctl); + goto cleanup; + } + } else if (!dest) { + vshError(ctl, "%s", _("need either --dest or --xml")); + goto cleanup; + } + + /* Exploit that VIR_DOMAIN_BLOCK_REBASE_* and + * VIR_DOMAIN_BLOCK_COPY_* flags have the same values. */ if (vshCommandOptBool(cmd, "shallow")) flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; if (vshCommandOptBool(cmd, "reuse-external")) flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; - if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) - goto cleanup; - if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) - goto cleanup; + if (granularity || buf_size || (format && STRNEQ(format, "raw")) || xml) { + /* New API */ + if (bandwidth || granularity || buf_size) { + params = vshCalloc(ctl, 3, sizeof(*params)); + /* bandwidth is ulong, but the typed parameter is ullong; + * use addition to ensure correct vararg typing */ + if (bandwidth && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BANDWIDTH, + VIR_TYPED_PARAM_ULLONG, + bandwidth + 0ULL) < 0) + goto cleanup; + if (granularity && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_GRANULARITY, + VIR_TYPED_PARAM_UINT, + granularity) < 0) + goto cleanup; + if (buf_size && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BUF_SIZE, + VIR_TYPED_PARAM_UINT, + buf_size) < 0) + goto cleanup; + } + + if (!xmlstr) { + virBuffer buf = VIR_BUFFER_INITIALIZER; + virBufferAddLit(&buf, "<disk type='file'>\n"); + virBufferAdjustIndent(&buf, 2); + virBufferEscapeString(&buf, "<source file='%s'/>\n", dest); + virBufferEscapeString(&buf, "<driver type='%s'/>\n", format); + virBufferAdjustIndent(&buf, -2); + virBufferAddLit(&buf, "</disk>\n"); + if (virBufferCheckError(&buf) < 0) + goto cleanup; + xmlstr = virBufferContentAndReset(&buf); + } + + if (virDomainBlockCopy(dom, path, xmlstr, params, nparams, flags) < 0) + goto cleanup; + } else { + /* Old API */ + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + if (STREQ_NULLABLE(format, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + + if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) + goto cleanup; + } if (!blocking) { vshPrint(ctl, "%s", _("Block Copy started")); @@ -1983,6 +2081,8 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) ret = true; cleanup: + VIR_FREE(xmlstr); + virTypedParamsFree(params, nparams); if (dom) virDomainFree(dom); if (blocking) diff --git a/tools/virsh.pod b/tools/virsh.pod index e0dfd13..32c5ff9 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -873,27 +873,29 @@ value is interpreted as an unsigned long long value or essentially unlimited. The hypervisor can choose whether to reject the value or convert it to the maximum value allowed. -=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] [I<--raw>] [I<--wait> [I<--async>] [I<--verbose>]] +=item B<blockcopy> I<domain> I<path> { I<dest> [I<format>] | I<xml> } +[I<--shallow>] [I<--reuse-external>] [I<--wait> [I<--async>] [I<--verbose>]] [{I<--pivot> | I<--finish>}] [I<--timeout> B<seconds>] +[I<bandwidth>] [I<granularity>] [I<buf-size>] -Copy a disk backing image chain to I<dest>. By default, this command -flattens the entire chain; but if I<--shallow> is specified, the copy -shares the backing chain. +Copy a disk backing image chain to a destination. Either I<dest> as +the destination file name, or I<xml> as the name of an XML file containing +a top-level <disk> element describing the destination, must be present. +Additionally, if I<dest> is given, I<format> should be specified to declare +the format of the destination (if I<format> is omitted, then libvirt +will reuse the format of the source, or with I<--reuse-external> will +be forced to probe the destination format, which could be a potential +security hole). The command supports I<--raw> as a boolean flag synonym for +I<--format=raw>. By default, this command flattens the entire chain; but +if I<--shallow> is specified, the copy shares the backing chain. -If I<--reuse-external> is specified, then I<dest> must exist and have +If I<--reuse-external> is specified, then the destination must exist and have sufficient space to hold the copy. If I<--shallow> is used in conjunction with I<--reuse-external> then the pre-created image must have guest visible contents identical to guest visible contents of the backing file of the original image. This may be used to modify the backing file names on the destination. -The format of the destination is determined by the first match in the -following list: if I<--raw> is specified, it will be raw; if -I<--reuse-external> is specified, the existing destination is probed -for a format; and in all other cases, the destination format will -match the source format. - By default, the copy job runs in the background, and consists of two phases. Initially, the job must copy all data from the source, and during this phase, the job can only be canceled to revert back to the @@ -917,7 +919,14 @@ I<path> specifies fully-qualified path of the disk. I<bandwidth> specifies copying bandwidth limit in MiB/s. Specifying a negative value is interpreted as an unsigned long long value or essentially unlimited. The hypervisor can choose whether to reject the value or -convert it to the maximum value allowed. +convert it to the maximum value allowed. Specifying I<granularity> +allows fine-tuning of the granularity that will be copied when a dirty +page is detected; larger values trigger less I/O overhead but may end +up copying more data overall (the default value is usually correct); +this value must be a power of two. Specifying I<buf-size> will control +how much data can be simultaneously in-flight during the copy; larger +values use more memory but may allow faster completion (the default +value is usually correct). =item B<blockpull> I<domain> I<path> [I<bandwidth>] [I<base>] [I<--wait> [I<--verbose>] [I<--timeout> B<seconds>] [I<--async>]] -- 1.9.3

Fairly straightforward. * src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. (REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC. * src/remote/remote_driver.c (remote_driver): Wire it up. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 +++++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 9a1d78f..98661be 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -7977,6 +7977,7 @@ static virDriver remote_driver = { .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = remoteDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = remoteDomainBlockCopy, /* 1.2.8 */ .domainBlockCommit = remoteDomainBlockCommit, /* 0.10.2 */ .connectSetKeepAlive = remoteConnectSetKeepAlive, /* 0.9.8 */ .connectIsAlive = remoteConnectIsAlive, /* 0.9.8 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 5c316fb..81a8603 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -124,6 +124,9 @@ const REMOTE_DOMAIN_BLOCK_IO_TUNE_PARAMETERS_MAX = 16; /* Upper limit on list of numa parameters. */ const REMOTE_DOMAIN_NUMA_PARAMETERS_MAX = 16; +/* Upper limit on block copy tunable parameters. */ +const REMOTE_DOMAIN_BLOCK_COPY_PARAMETERS_MAX = 16; + /* Upper limit on list of node cpu stats. */ const REMOTE_NODE_CPU_STATS_MAX = 16; @@ -1281,6 +1284,13 @@ struct remote_domain_block_rebase_args { unsigned hyper bandwidth; unsigned int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_nonnull_string destxml; + remote_typed_param params<REMOTE_DOMAIN_BLOCK_COPY_PARAMETERS_MAX>; + unsigned int flags; +}; struct remote_domain_block_commit_args { remote_nonnull_domain dom; remote_nonnull_string disk; @@ -5420,5 +5430,11 @@ enum remote_procedure { * @generate: both * @acl: connect:write */ - REMOTE_PROC_CONNECT_GET_DOMAIN_CAPABILITIES = 342 + REMOTE_PROC_CONNECT_GET_DOMAIN_CAPABILITIES = 342, + + /** + * @generate: both + * @acl: domain:block_write + */ + REMOTE_PROC_DOMAIN_BLOCK_COPY = 343 }; -- 1.9.3

In order to implement the new virDomainBlockCopy, the existing block copy internal implementation needs to be adjusted. The new function will parse XML into a storage source, and parse typed parameters into integers, then call into the same common backend. For now, it's easier to keep the same implementation limits that only local file destinations are suported, but now the check needs to be explicit. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename... (qemuDomainBlockCopyCommon): ...and adjust parameters. (qemuDomainBlockRebase): Adjust caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 104 +++++++++++++++++++++++++++++++------------------ 1 file changed, 67 insertions(+), 37 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ad75bd9..4e43b19 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15276,11 +15276,14 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, } static int -qemuDomainBlockCopy(virDomainObjPtr vm, - virConnectPtr conn, - const char *path, - const char *dest, const char *format, - unsigned long bandwidth, unsigned int flags) +qemuDomainBlockCopyCommon(virDomainObjPtr vm, + virConnectPtr conn, + const char *path, + virStorageSourcePtr dest, + unsigned long bandwidth, + int granularity, + int buf_size, + unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; qemuDomainObjPrivatePtr priv; @@ -15292,10 +15295,21 @@ qemuDomainBlockCopy(virDomainObjPtr vm, bool need_unlink = false; virStorageSourcePtr mirror = NULL; virQEMUDriverConfigPtr cfg = NULL; + const char *format = NULL; - /* Preliminaries: find the disk we are editing, sanity checks */ - virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, -1); + /* Preliminaries: find the disk we are editing, sanity checks. */ + virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, -1); + if (granularity) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("granularity tuning not supported yet")); + return -1; + } + if (buf_size) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("buffer size tuning not supported yet")); + return -1; + } priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); @@ -15341,7 +15355,7 @@ qemuDomainBlockCopy(virDomainObjPtr vm, goto endjob; if ((flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) && - STREQ_NULLABLE(format, "raw") && + dest->format == VIR_STORAGE_FILE_RAW && disk->src->backingStore->path) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk '%s' has backing file, so raw shallow copy " @@ -15351,15 +15365,20 @@ qemuDomainBlockCopy(virDomainObjPtr vm, } /* Prepare the destination file. */ - if (stat(dest, &st) < 0) { + /* XXX Allow non-file mirror destinations */ + if (!virStorageSourceIsLocalStorage(dest)) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("non-file destination not supported yet")); + } + if (stat(dest->path, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("unable to stat for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } else if (flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) { virReportSystemError(errno, _("missing destination file for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } } else if (!S_ISBLK(st.st_mode) && st.st_size && @@ -15367,22 +15386,14 @@ qemuDomainBlockCopy(virDomainObjPtr vm, virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("external destination file for disk %s already " "exists and is not a block device: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } - if (VIR_ALLOC(mirror) < 0) + if (!(mirror = virStorageSourceCopy(dest, false))) goto endjob; - /* XXX Allow non-file mirror destinations */ - mirror->type = VIR_STORAGE_TYPE_FILE; - if (format) { - if ((mirror->format = virStorageFileFormatTypeFromString(format)) <= 0) { - virReportError(VIR_ERR_INVALID_ARG, _("unrecognized format '%s'"), - format); - goto endjob; - } - } else { + if (!dest->format) { if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { mirror->format = disk->src->format; } else { @@ -15390,24 +15401,23 @@ qemuDomainBlockCopy(virDomainObjPtr vm, * also passed the RAW flag (and format is non-NULL), or it is * safe for us to probe the format from the file that we will * be using. */ - mirror->format = virStorageFileProbeFormat(dest, cfg->user, + mirror->format = virStorageFileProbeFormat(dest->path, cfg->user, cfg->group); } } /* pre-create the image file */ if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { - int fd = qemuOpenFile(driver, vm, dest, O_WRONLY | O_TRUNC | O_CREAT, + int fd = qemuOpenFile(driver, vm, dest->path, + O_WRONLY | O_TRUNC | O_CREAT, &need_unlink, NULL); if (fd < 0) goto endjob; VIR_FORCE_CLOSE(fd); } - if (!format && mirror->format > 0) + if (mirror->format > 0) format = virStorageFileFormatTypeToString(mirror->format); - if (VIR_STRDUP(mirror->path, dest) < 0) - goto endjob; if (virStorageSourceInitChainElement(mirror, disk->src, false) < 0) goto endjob; @@ -15421,8 +15431,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, /* Actually start the mirroring */ qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorDriveMirror(priv->mon, device, dest, format, bandwidth, - flags); + ret = qemuMonitorDriveMirror(priv->mon, device, dest->path, format, + bandwidth, flags); virDomainAuditDisk(vm, NULL, mirror, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { @@ -15442,8 +15452,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, vm->def->name); endjob: - if (need_unlink && unlink(dest)) - VIR_WARN("unable to unlink just-created %s", dest); + if (need_unlink && unlink(dest->path)) + VIR_WARN("unable to unlink just-created %s", dest->path); virStorageSourceFree(mirror); if (!qemuDomainObjEndJob(driver, vm)) vm = NULL; @@ -15461,6 +15471,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; + int ret = -1; + virStorageSourcePtr dest = NULL; virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15477,16 +15489,34 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, } if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) { - const char *format = NULL; + if (VIR_ALLOC(dest) < 0) + return -1; + dest->type = VIR_STORAGE_TYPE_FILE; + if (VIR_STRDUP(dest->path, base) < 0) + goto cleanup; if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) - format = "raw"; + dest->format = VIR_STORAGE_FILE_RAW; flags &= ~(VIR_DOMAIN_BLOCK_REBASE_COPY | VIR_DOMAIN_BLOCK_REBASE_COPY_RAW); - return qemuDomainBlockCopy(vm, dom->conn, path, base, format, bandwidth, flags); + if (flags & VIR_DOMAIN_BLOCK_REBASE_RELATIVE) { + /* FIXME: should this check be in libvirt.c? */ + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Relative rebase incompatible with block copy")); + goto cleanup; + } + /* We rely on the fact that VIR_DOMAIN_BLOCK_REBASE_SHALLOW + * and VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT map to the same value + * as for block copy. */ + ret = qemuDomainBlockCopyCommon(vm, dom->conn, path, dest, + bandwidth, 0, 0, flags); + } else { + ret = qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, + NULL, BLOCK_JOB_PULL, flags); } - return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, NULL, - BLOCK_JOB_PULL, flags); + cleanup: + virStorageSourceFree(dest); + return ret; } static int -- 1.9.3
participants (2)
-
Eric Blake
-
Peter Krempa