[libvirt] [PATCH v3 00/18] Implement virDomainBlockCopy

Took me longer than I wanted to get v3 posted. There's lots of new patches in this version, based on feedback on v2. Among other things, the bandwidth of virDomainBlockCopy is in bytes/s, and all the remaining interfaces are updated to use a flags value to decide whether to use bytes/s instead of the back-compat MiB/s. I really want patch 1 in 1.2.8; pasth 2-17 would be nice but it may be better to delay to later, and I still don't have virsh updated to cover the changed proposed in patch 18 so that one should probably wait. Eric Blake (18): blockcopy: allow larger buf-size blockjob: shuffle block rebase code blockjob: split out block info driver handling blockjob: split out block info monitor handling blockjob: hoist bandwidth scaling out of monitor code blockjob: allow finer bandwidth tuning for query blockjob: split up virsh blockjob info blockjob: add new --raw flag to virsh blockjob blockjob: add new --bytes flag to virsh blockjob blockcopy: allow block device destination blockcopy: split out virsh implementation blockcopy: expose new API in virsh blockcopy: remote implementation for new API blockcopy: tweak how rebase calls into copy blockcopy: add a way to parse disk source blockcopy: add qemu implementation of new API blockcopy: add qemu implementation of new tunables blockjob: allow finer bandwidth tuning for set speed include/libvirt/libvirt.h.in | 77 +++-- src/conf/domain_conf.c | 107 ++++-- src/conf/domain_conf.h | 4 + src/libvirt.c | 99 ++++-- src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 382 ++++++++++++++++----- src/qemu/qemu_migration.c | 10 +- src/qemu/qemu_monitor.c | 108 +++--- src/qemu/qemu_monitor.h | 16 +- src/qemu/qemu_monitor_json.c | 80 +++-- src/qemu/qemu_monitor_json.h | 9 +- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 +- src/remote_protocol-structs | 11 + tests/qemumonitorjsontest.c | 2 +- .../qemuxml2argvdata/qemuxml2argv-disk-mirror.xml | 4 +- tools/virsh-domain.c | 336 ++++++++++++++---- tools/virsh.c | 2 + tools/virsh.h | 2 + tools/virsh.pod | 65 ++-- 20 files changed, 975 insertions(+), 359 deletions(-) -- 1.9.3

While qemu definitely caps granularity to 64 MiB, it places no limits on buf-size. On a machine beefy enough for lots of memory, a buf-size larger than 2 GiB is feasible, so we should pass a 64-bit parameter. * include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_COPY_BUF_SIZE): Allow 64 bits. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 9358314..a64f597 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2678,8 +2678,8 @@ typedef enum { * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: * Macro for the virDomainBlockCopy buffer size tunable: it represents * how much data in bytes can be in flight between source and destination, - * as an unsigned int. Specifying 0 is the same as omitting this parameter, - * to request the hypervisor default. + * as an unsigned long long. Specifying 0 is the same as omitting this + * parameter, to request the hypervisor default. */ #define VIR_DOMAIN_BLOCK_COPY_BUF_SIZE "buf-size" -- 1.9.3

On Sat, Aug 30, 2014 at 22:02:19 -0600, Eric Blake wrote:
While qemu definitely caps granularity to 64 MiB, it places no limits on buf-size. On a machine beefy enough for lots of memory, a buf-size larger than 2 GiB is feasible, so we should pass a 64-bit parameter.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_COPY_BUF_SIZE): Allow 64 bits.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 9358314..a64f597 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2678,8 +2678,8 @@ typedef enum { * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: * Macro for the virDomainBlockCopy buffer size tunable: it represents * how much data in bytes can be in flight between source and destination, - * as an unsigned int. Specifying 0 is the same as omitting this parameter, - * to request the hypervisor default. + * as an unsigned long long. Specifying 0 is the same as omitting this + * parameter, to request the hypervisor default. */ #define VIR_DOMAIN_BLOCK_COPY_BUF_SIZE "buf-size"
ACK, we need this in 1.2.8. Jirka

On 09/01/14 16:57, Jiri Denemark wrote:
On Sat, Aug 30, 2014 at 22:02:19 -0600, Eric Blake wrote:
While qemu definitely caps granularity to 64 MiB, it places no limits on buf-size. On a machine beefy enough for lots of memory, a buf-size larger than 2 GiB is feasible, so we should pass a 64-bit parameter.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_COPY_BUF_SIZE): Allow 64 bits.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 9358314..a64f597 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2678,8 +2678,8 @@ typedef enum { * VIR_DOMAIN_BLOCK_COPY_BUF_SIZE: * Macro for the virDomainBlockCopy buffer size tunable: it represents * how much data in bytes can be in flight between source and destination, - * as an unsigned int. Specifying 0 is the same as omitting this parameter, - * to request the hypervisor default. + * as an unsigned long long. Specifying 0 is the same as omitting this + * parameter, to request the hypervisor default. */ #define VIR_DOMAIN_BLOCK_COPY_BUF_SIZE "buf-size"
ACK, we need this in 1.2.8.
Jirka
This one was actually pushed before the release.

The existing virDomainBlockRebase code rejected the combination of _RELATIVE and _COPY flags, but only by accident. It makes sense, at least for the case of _SHALLOW and not _REUSE_EXT, but to implement it, libvirt would have to pre-create the file with a relative backing name. Meanwhile, the code to forward on to the block copy code is getting longer, and reorganising the function to have the block pull done early makes it easier to add even more block copy prep code. This patch should have no semantic difference other than the quality of the error message on the unsupported flag combination. * src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code, and improve error message of relative copy. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 239a300..177d3e4 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15467,6 +15467,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; + const char *format = NULL; + int ret = -1; virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15477,22 +15479,37 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, if (!(vm = qemuDomObjFromDomain(dom))) return -1; - if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) { + if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + /* For normal rebase (enhanced blockpull), the common code handles + * everything, including vm cleanup. */ + if (!(flags & VIR_DOMAIN_BLOCK_REBASE_COPY)) + return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, + NULL, BLOCK_JOB_PULL, flags); + + /* If we got here, we are doing a block copy rebase. */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) + format = "raw"; + + /* XXX: If we are doing a shallow copy but not reusing an external + * file, we should attempt to pre-create the destination with a + * relative backing chain instead of qemu's default of absolute */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_RELATIVE) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Relative backing during copy not supported yet")); + goto cleanup; + } + + flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); + ret = qemuDomainBlockCopy(vm, dom->conn, path, base, format, + bandwidth, flags); + vm = NULL; + cleanup: + if (vm) virObjectUnlock(vm); - return -1; - } - - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) { - const char *format = NULL; - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) - format = "raw"; - flags &= ~(VIR_DOMAIN_BLOCK_REBASE_COPY | - VIR_DOMAIN_BLOCK_REBASE_COPY_RAW); - return qemuDomainBlockCopy(vm, dom->conn, path, base, format, bandwidth, flags); - } - - return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, NULL, - BLOCK_JOB_PULL, flags); + return ret; } static int -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
The existing virDomainBlockRebase code rejected the combination of _RELATIVE and _COPY flags, but only by accident. It makes sense, at least for the case of _SHALLOW and not _REUSE_EXT, but to implement it, libvirt would have to pre-create the file with a relative backing name.
Meanwhile, the code to forward on to the block copy code is getting longer, and reorganising the function to have the block pull done early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality of the error message on the unsupported flag combination.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code, and improve error message of relative copy.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 239a300..177d3e4 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15467,6 +15467,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; + const char *format = NULL; + int ret = -1;
virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15477,22 +15479,37 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, if (!(vm = qemuDomObjFromDomain(dom))) return -1;
- if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) { + if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + /* For normal rebase (enhanced blockpull), the common code handles + * everything, including vm cleanup. */ + if (!(flags & VIR_DOMAIN_BLOCK_REBASE_COPY)) + return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, + NULL, BLOCK_JOB_PULL, flags); + + /* If we got here, we are doing a block copy rebase. */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) + format = "raw"; + + /* XXX: If we are doing a shallow copy but not reusing an external + * file, we should attempt to pre-create the destination with a + * relative backing chain instead of qemu's default of absolute */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_RELATIVE) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Relative backing during copy not supported yet")); + goto cleanup;
If you'd shuffle this more up and add (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) you'd be able to get rid of the cleanup section entirely.
+ } + + flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); + ret = qemuDomainBlockCopy(vm, dom->conn, path, base, format, + bandwidth, flags);
And return the result right away.
+ vm = NULL; + cleanup: + if (vm) virObjectUnlock(vm); - return -1; - } - - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) { - const char *format = NULL; - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) - format = "raw"; - flags &= ~(VIR_DOMAIN_BLOCK_REBASE_COPY | - VIR_DOMAIN_BLOCK_REBASE_COPY_RAW); - return qemuDomainBlockCopy(vm, dom->conn, path, base, format, bandwidth, flags); - } - - return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, NULL, - BLOCK_JOB_PULL, flags); + return ret; }
static int
ACK as is. Peter

On 09/04/14 16:40, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
The existing virDomainBlockRebase code rejected the combination of _RELATIVE and _COPY flags, but only by accident. It makes sense, at least for the case of _SHALLOW and not _REUSE_EXT, but to implement it, libvirt would have to pre-create the file with a relative backing name.
Meanwhile, the code to forward on to the block copy code is getting longer, and reorganising the function to have the block pull done early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality of the error message on the unsupported flag combination.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code, and improve error message of relative copy.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 239a300..177d3e4 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15467,6 +15467,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; + const char *format = NULL; + int ret = -1;
virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15477,22 +15479,37 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, if (!(vm = qemuDomObjFromDomain(dom))) return -1;
- if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) { + if (virDomainBlockRebaseEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + /* For normal rebase (enhanced blockpull), the common code handles + * everything, including vm cleanup. */ + if (!(flags & VIR_DOMAIN_BLOCK_REBASE_COPY)) + return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, + NULL, BLOCK_JOB_PULL, flags); + + /* If we got here, we are doing a block copy rebase. */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) + format = "raw"; + + /* XXX: If we are doing a shallow copy but not reusing an external + * file, we should attempt to pre-create the destination with a + * relative backing chain instead of qemu's default of absolute */ + if (flags & VIR_DOMAIN_BLOCK_REBASE_RELATIVE) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Relative backing during copy not supported yet")); + goto cleanup;
If you'd shuffle this more up and add (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) you'd be able to get rid of the cleanup section entirely.
Okay. Now I see that you are adding other stuff with "goto cleanup" at this place so this makes sense.
+ } +
This is then okay as-is without any change.

On 09/04/2014 09:53 AM, Peter Krempa wrote:
On 09/04/14 16:40, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
The existing virDomainBlockRebase code rejected the combination of _RELATIVE and _COPY flags, but only by accident. It makes sense, at least for the case of _SHALLOW and not _REUSE_EXT, but to implement it, libvirt would have to pre-create the file with a relative backing name.
Meanwhile, the code to forward on to the block copy code is getting longer, and reorganising the function to have the block pull done early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality of the error message on the unsupported flag combination.
I improved the commit message to show the actual before-and-after errors,
This is then okay as-is without any change.
and pushed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The qemu implementation for virDomainGetBlockJobInfo() has a minor bug: it grabs the qemu job with intent to QEMU_JOB_MODIFY, which means it cannot be run in parallel with any other domain-modifying command. Among others, virDomainBlockJobAbort() is such a modifying command, and it defaults to being synchronous, and can wait as long as several seconds to ensure that the job has actually finished. Due to the job rules, this means a user cannot obtain status about the job during that timeframe, even though we know management code is using a polling loop on status to see when a job finishes. This bug has been present ever since blockpull support was first introduced (commit b976165, v0.9.4 in Jul 2011), all because we stupidly tried to cram too much multiplexing through a single helper routine. It's time to disentangle some of that mess, and in the process relax block job query to use QEMU_JOB_QUERY, since it can safely be used in parallel with any long running modify command. Technically, there is one case where getting block job info can modify domain XML - we do snooping to see if a 2-phase job has transitioned into the second phase, for an optimization in the case of old qemu that lacked an event for the transition. But I think that optimization is safe; and if it proves to be a problem, we could use the difference between QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even need snooping, and request a modifying job in that case. * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Move info handling... (qemuDomainGetBlockJobInfo): ...here, and relax job type. (qemuDomainBlockJobAbort, qemuDomainBlockJobSetSpeed) (qemuDomainBlockRebase, qemuDomainBlockPull): Adjust callers. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 86 ++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 66 insertions(+), 20 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 177d3e4..bedccc6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14994,7 +14994,7 @@ static int qemuDomainBlockJobImpl(virDomainObjPtr vm, virConnectPtr conn, const char *path, const char *base, - unsigned long bandwidth, virDomainBlockJobInfoPtr info, + unsigned long bandwidth, int mode, unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; @@ -15125,25 +15125,14 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, - bandwidth, info, mode, async); + bandwidth, NULL, mode, async); qemuDomainObjExitMonitor(driver, vm); - if (info && info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && - disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) - info->type = disk->mirrorJob; if (ret < 0) { if (mode == BLOCK_JOB_ABORT && disk->mirror) disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; goto endjob; } - /* Snoop block copy operations, so future cancel operations can - * avoid checking if pivot is safe. */ - if (mode == BLOCK_JOB_INFO && ret == 1 && disk->mirror && - info->cur == info->end && !disk->mirrorState) { - disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY; - save = true; - } - waitjob: /* If we have made changes to XML due to a copy job, make a best * effort to save it now. But we can ignore failure, since there @@ -15239,15 +15228,22 @@ qemuDomainBlockJobAbort(virDomainPtr dom, const char *path, unsigned int flags) return -1; } - return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, 0, NULL, BLOCK_JOB_ABORT, - flags); + return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, 0, + BLOCK_JOB_ABORT, flags); } static int qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, virDomainBlockJobInfoPtr info, unsigned int flags) { + virQEMUDriverPtr driver = dom->conn->privateData; + qemuDomainObjPrivatePtr priv; virDomainObjPtr vm; + char *device = NULL; + int idx; + virDomainDiskDefPtr disk; + int ret = -1; + virCheckFlags(0, -1); if (!(vm = qemuDomObjFromDomain(dom))) @@ -15258,8 +15254,58 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, return -1; } - return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, 0, info, BLOCK_JOB_INFO, - flags); + priv = vm->privateData; + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC) && + !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_SYNC)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("block jobs not supported with this QEMU binary")); + goto cleanup; + } + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto endjob; + } + + device = qemuDiskPathToAlias(vm, path, &idx); + if (!device) + goto endjob; + disk = vm->def->disks[idx]; + + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorBlockJob(priv->mon, device, NULL, NULL, 0, + info, BLOCK_JOB_INFO, true); + qemuDomainObjExitMonitor(driver, vm); + if (ret < 0) + goto endjob; + + if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && + disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) + info->type = disk->mirrorJob; + + /* Snoop block copy operations, so future cancel operations can + * avoid checking if pivot is safe. Save the change to XML, but + * we can ignore failure because it is only an optimization. */ + if (ret == 1 && disk->mirror && + info->cur == info->end && !disk->mirrorState) { + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY; + ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm)); + virObjectUnref(cfg); + } + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; + + cleanup: + if (vm) + virObjectUnlock(vm); + return ret; } static int @@ -15277,7 +15323,7 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, return -1; } - return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, bandwidth, NULL, + return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, bandwidth, BLOCK_JOB_SPEED, flags); } @@ -15486,7 +15532,7 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, * everything, including vm cleanup. */ if (!(flags & VIR_DOMAIN_BLOCK_REBASE_COPY)) return qemuDomainBlockJobImpl(vm, dom->conn, path, base, bandwidth, - NULL, BLOCK_JOB_PULL, flags); + BLOCK_JOB_PULL, flags); /* If we got here, we are doing a block copy rebase. */ if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) @@ -15527,7 +15573,7 @@ qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, return -1; } - return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, bandwidth, NULL, + return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, bandwidth, BLOCK_JOB_PULL, flags); } -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
The qemu implementation for virDomainGetBlockJobInfo() has a minor bug: it grabs the qemu job with intent to QEMU_JOB_MODIFY, which means it cannot be run in parallel with any other domain-modifying command. Among others, virDomainBlockJobAbort() is such a modifying command, and it defaults to being synchronous, and can wait as long as several seconds to ensure that the job has actually finished. Due to the job rules, this means a user cannot obtain status about the job during that timeframe, even though we know management code is using a polling loop on status to see when a job finishes.
This bug has been present ever since blockpull support was first introduced (commit b976165, v0.9.4 in Jul 2011), all because we stupidly tried to cram too much multiplexing through a single helper routine. It's time to disentangle some of that mess, and in the process relax block job query to use QEMU_JOB_QUERY, since it can safely be used in parallel with any long running modify command. Technically, there is one case where getting block job info can modify domain XML - we do snooping to see if a 2-phase job has transitioned into the second phase, for an optimization in the case of old qemu that lacked an event for the transition. But I think that optimization is safe; and if it proves to be a problem, we could use the difference between QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even need snooping, and request a modifying job in that case.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Move info handling... (qemuDomainGetBlockJobInfo): ...here, and relax job type. (qemuDomainBlockJobAbort, qemuDomainBlockJobSetSpeed) (qemuDomainBlockRebase, qemuDomainBlockPull): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 86 ++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 66 insertions(+), 20 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 177d3e4..bedccc6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c
@@ -15258,8 +15254,58 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, return -1; }
- return qemuDomainBlockJobImpl(vm, dom->conn, path, NULL, 0, info, BLOCK_JOB_INFO, - flags); + priv = vm->privateData; + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC) && + !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_SYNC)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("block jobs not supported with this QEMU binary")); + goto cleanup; + } + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto endjob; + } + + device = qemuDiskPathToAlias(vm, path, &idx); + if (!device) + goto endjob; + disk = vm->def->disks[idx]; + + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorBlockJob(priv->mon, device, NULL, NULL, 0, + info, BLOCK_JOB_INFO, true); + qemuDomainObjExitMonitor(driver, vm); + if (ret < 0) + goto endjob; + + if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && + disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) + info->type = disk->mirrorJob; + + /* Snoop block copy operations, so future cancel operations can + * avoid checking if pivot is safe. Save the change to XML, but + * we can ignore failure because it is only an optimization. */ + if (ret == 1 && disk->mirror && + info->cur == info->end && !disk->mirrorState) { + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
Atomicity is guaranteed by holding the domain lock here. We should document that the mirrorState field can change even when the _MODIFY job is held though. Otherwise I don't see a problem currently with this as long as it is stated somewhere. Possibly even in a comment here.
+ ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm)); + virObjectUnref(cfg); + } + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; + + cleanup: + if (vm) + virObjectUnlock(vm); + return ret; }
static int
ACK Peter

On 09/04/2014 09:11 AM, Peter Krempa wrote:
modify command. Technically, there is one case where getting block job info can modify domain XML - we do snooping to see if a 2-phase job has transitioned into the second phase, for an optimization in the case of old qemu that lacked an event for the transition. But I think that optimization is safe; and if it proves to be a problem, we could use the difference between QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even need snooping, and request a modifying job in that case.
+ + if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && + disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) + info->type = disk->mirrorJob; + + /* Snoop block copy operations, so future cancel operations can + * avoid checking if pivot is safe. Save the change to XML, but + * we can ignore failure because it is only an optimization. */ + if (ret == 1 && disk->mirror && + info->cur == info->end && !disk->mirrorState) { + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
Atomicity is guaranteed by holding the domain lock here. We should document that the mirrorState field can change even when the _MODIFY job is held though.
Otherwise I don't see a problem currently with this as long as it is stated somewhere. Possibly even in a comment here.
ACK
Here's what I squashed in. diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index e7bd504..fb13169 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -15289,7 +15289,9 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, /* Snoop block copy operations, so future cancel operations can * avoid checking if pivot is safe. Save the change to XML, but - * we can ignore failure because it is only an optimization. */ + * we can ignore failure because it is only an optimization. We + * hold the vm lock, so modifying the in-memory representation + * even though we are a query rather than a modify job, is safe. */ if (ret == 1 && disk->mirror && info->cur == info->end && !disk->mirrorState) { virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Another layer of overly-multiplexed code that deserves to be split into obviously separate paths for query vs. modify. This continues the cleanup started in the previous patch. In the process, make some tweaks to simplify the logic when parsing the JSON reply. There should be no user-visible semantic changes. * src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter. (qemuMonitorBlockJobInfo): New prototype. (BLOCK_JOB_INFO): Drop enum. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob) (qemuMonitorJSONBlockJobInfo): Likewise. * src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split... (qemuMonitorBlockJobInfo): ...into second function. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move block info portions... (qemuMonitorJSONGetBlockJobInfo): ...here, and rename... (qemuMonitorJSONBlockJobInfo): ...and export. (qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics. * src/qemu/qemu_driver.c (qemuDomainBlockPivot) (qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust callers. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror) (qemuMigrationCancelDriveMirror): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 11 +++----- src/qemu/qemu_migration.c | 7 +++-- src/qemu/qemu_monitor.c | 41 ++++++++++++++++++++-------- src/qemu/qemu_monitor.h | 7 +++-- src/qemu/qemu_monitor_json.c | 63 ++++++++++++++++++++++++-------------------- src/qemu/qemu_monitor_json.h | 6 ++++- 6 files changed, 81 insertions(+), 54 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index bedccc6..f208ba0 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14857,8 +14857,7 @@ qemuDomainBlockPivot(virConnectPtr conn, /* Probe the status, if needed. */ if (!disk->mirrorState) { qemuDomainObjEnterMonitor(driver, vm); - rc = qemuMonitorBlockJob(priv->mon, device, NULL, NULL, 0, &info, - BLOCK_JOB_INFO, true); + rc = qemuMonitorBlockJobInfo(priv->mon, device, &info); qemuDomainObjExitMonitor(driver, vm); if (rc < 0) goto cleanup; @@ -15125,7 +15124,7 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, - bandwidth, NULL, mode, async); + bandwidth, mode, async); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { if (mode == BLOCK_JOB_ABORT && disk->mirror) @@ -15169,8 +15168,7 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, virDomainBlockJobInfo dummy; qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorBlockJob(priv->mon, device, NULL, NULL, 0, - &dummy, BLOCK_JOB_INFO, async); + ret = qemuMonitorBlockJobInfo(priv->mon, device, &dummy); qemuDomainObjExitMonitor(driver, vm); if (ret <= 0) @@ -15277,8 +15275,7 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, disk = vm->def->disks[idx]; qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorBlockJob(priv->mon, device, NULL, NULL, 0, - info, BLOCK_JOB_INFO, true); + ret = qemuMonitorBlockJobInfo(priv->mon, device, info); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) goto endjob; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9cfb77e..9885a16 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1307,8 +1307,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, _("canceled by client")); goto error; } - mon_ret = qemuMonitorBlockJob(priv->mon, diskAlias, NULL, NULL, 0, - &info, BLOCK_JOB_INFO, true); + mon_ret = qemuMonitorBlockJobInfo(priv->mon, diskAlias, &info); qemuDomainObjExitMonitor(driver, vm); if (mon_ret < 0) @@ -1361,7 +1360,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, if (qemuDomainObjEnterMonitorAsync(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) { if (qemuMonitorBlockJob(priv->mon, diskAlias, NULL, NULL, 0, - NULL, BLOCK_JOB_ABORT, true) < 0) { + BLOCK_JOB_ABORT, true) < 0) { VIR_WARN("Unable to cancel block-job on '%s'", diskAlias); } qemuDomainObjExitMonitor(driver, vm); @@ -1428,7 +1427,7 @@ qemuMigrationCancelDriveMirror(qemuMigrationCookiePtr mig, goto cleanup; if (qemuMonitorBlockJob(priv->mon, diskAlias, NULL, NULL, 0, - NULL, BLOCK_JOB_ABORT, true) < 0) + BLOCK_JOB_ABORT, true) < 0) VIR_WARN("Unable to stop block job on %s", diskAlias); qemuDomainObjExitMonitor(driver, vm); } diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 5b2952a..4493051 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3360,22 +3360,22 @@ int qemuMonitorScreendump(qemuMonitorPtr mon, } /* bandwidth is in MiB/sec */ -int qemuMonitorBlockJob(qemuMonitorPtr mon, - const char *device, - const char *base, - const char *backingName, - unsigned long bandwidth, - virDomainBlockJobInfoPtr info, - qemuMonitorBlockJobCmd mode, - bool modern) +int +qemuMonitorBlockJob(qemuMonitorPtr mon, + const char *device, + const char *base, + const char *backingName, + unsigned long bandwidth, + qemuMonitorBlockJobCmd mode, + bool modern) { int ret = -1; unsigned long long speed; VIR_DEBUG("mon=%p, device=%s, base=%s, backingName=%s, bandwidth=%luM, " - "info=%p, mode=%o, modern=%d", + "mode=%o, modern=%d", mon, device, NULLSTR(base), NULLSTR(backingName), - bandwidth, info, mode, modern); + bandwidth, mode, modern); /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is * limited to LLONG_MAX also for unsigned values */ @@ -3390,13 +3390,32 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, if (mon->json) ret = qemuMonitorJSONBlockJob(mon, device, base, backingName, - speed, info, mode, modern); + speed, mode, modern); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block jobs require JSON monitor")); return ret; } + +int +qemuMonitorBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) +{ + int ret = -1; + + VIR_DEBUG("mon=%p, device=%s, info=%p", mon, device, info); + + if (mon->json) + ret = qemuMonitorJSONBlockJobInfo(mon, device, info); + else + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("block jobs require JSON monitor")); + return ret; +} + + int qemuMonitorSetBlockIoThrottle(qemuMonitorPtr mon, const char *device, virDomainBlockIoTuneInfoPtr info) diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 4fd6f01..947ca34 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -685,7 +685,6 @@ int qemuMonitorSendKey(qemuMonitorPtr mon, typedef enum { BLOCK_JOB_ABORT, - BLOCK_JOB_INFO, BLOCK_JOB_SPEED, BLOCK_JOB_PULL, } qemuMonitorBlockJobCmd; @@ -695,11 +694,15 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, const char *base, const char *backingName, unsigned long bandwidth, - virDomainBlockJobInfoPtr info, qemuMonitorBlockJobCmd mode, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); + int qemuMonitorOpenGraphics(qemuMonitorPtr mon, const char *protocol, int fd, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 62e7d5d..68ba084 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3684,9 +3684,11 @@ int qemuMonitorJSONScreendump(qemuMonitorPtr mon, return ret; } -static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, - const char *device, - virDomainBlockJobInfoPtr info) +/* Returns -1 on error, 0 if not the right device, 1 if info was populated. */ +static int +qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, + const char *device, + virDomainBlockJobInfoPtr info) { const char *this_dev; const char *type; @@ -3698,7 +3700,7 @@ static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, return -1; } if (!STREQ(this_dev, device)) - return -1; + return 0; type = virJSONValueObjectGetString(entry, "type"); if (!type) { @@ -3733,54 +3735,66 @@ static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, _("entry was missing 'len'")); return -1; } - return 0; + return 1; } -/** qemuMonitorJSONGetBlockJobInfo: - * Parse Block Job information. - * The reply is a JSON array of objects, one per active job. +/** + * qemuMonitorJSONBlockJobInfo: + * Parse Block Job information, and populate info for the named device. + * Return 1 if info available, 0 if device has no block job, and -1 on error. */ -static int qemuMonitorJSONGetBlockJobInfo(virJSONValuePtr reply, - const char *device, - virDomainBlockJobInfoPtr info) +int +qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) { + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; virJSONValuePtr data; int nr_results; size_t i; + int ret; - if (!info) + cmd = qemuMonitorJSONMakeCommand("query-block-jobs", NULL); + if (!cmd) return -1; + ret = qemuMonitorJSONCommand(mon, cmd, &reply); + if (ret < 0) + goto cleanup; if ((data = virJSONValueObjectGet(reply, "return")) == NULL) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("reply was missing return data")); - return -1; + goto cleanup; } if (data->type != VIR_JSON_TYPE_ARRAY) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("unrecognized format of block job information")); - return -1; + goto cleanup; } if ((nr_results = virJSONValueArraySize(data)) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("unable to determine array size")); - return -1; + goto cleanup; } - for (i = 0; i < nr_results; i++) { + for (i = 0, ret = 0; i < nr_results && ret == 0; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("missing array element")); - return -1; + ret = -1; + goto cleanup; } - if (qemuMonitorJSONGetBlockJobInfoOne(entry, device, info) == 0) - return 1; + ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info); } - return 0; + cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; } @@ -3791,7 +3805,6 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, const char *base, const char *backingName, unsigned long long speed, - virDomainBlockJobInfoPtr info, qemuMonitorBlockJobCmd mode, bool modern) { @@ -3833,11 +3846,6 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, NULL); break; - case BLOCK_JOB_INFO: - cmd_name = "query-block-jobs"; - cmd = qemuMonitorJSONMakeCommand(cmd_name, NULL); - break; - case BLOCK_JOB_SPEED: cmd_name = modern ? "block-job-set-speed" : "block_job_set_speed"; cmd = qemuMonitorJSONMakeCommand(cmd_name, @@ -3888,9 +3896,6 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, } } - if (ret == 0 && mode == BLOCK_JOB_INFO) - ret = qemuMonitorJSONGetBlockJobInfo(reply, device, info); - virJSONValueFree(cmd); virJSONValueFree(reply); return ret; diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index d8c9308..d3f6f37 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -285,11 +285,15 @@ int qemuMonitorJSONBlockJob(qemuMonitorPtr mon, const char *base, const char *backingName, unsigned long long speed, - virDomainBlockJobInfoPtr info, qemuMonitorBlockJobCmd mode, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); + int qemuMonitorJSONSetLink(qemuMonitorPtr mon, const char *name, virDomainNetInterfaceLinkState state); -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
Another layer of overly-multiplexed code that deserves to be split into obviously separate paths for query vs. modify. This continues the cleanup started in the previous patch.
In the process, make some tweaks to simplify the logic when parsing the JSON reply. There should be no user-visible semantic changes.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter. (qemuMonitorBlockJobInfo): New prototype. (BLOCK_JOB_INFO): Drop enum. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob) (qemuMonitorJSONBlockJobInfo): Likewise. * src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split... (qemuMonitorBlockJobInfo): ...into second function. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move block info portions... (qemuMonitorJSONGetBlockJobInfo): ...here, and rename... (qemuMonitorJSONBlockJobInfo): ...and export. (qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics. * src/qemu/qemu_driver.c (qemuDomainBlockPivot) (qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust callers. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror) (qemuMigrationCancelDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 11 +++----- src/qemu/qemu_migration.c | 7 +++-- src/qemu/qemu_monitor.c | 41 ++++++++++++++++++++-------- src/qemu/qemu_monitor.h | 7 +++-- src/qemu/qemu_monitor_json.c | 63 ++++++++++++++++++++++++-------------------- src/qemu/qemu_monitor_json.h | 6 ++++- 6 files changed, 81 insertions(+), 54 deletions(-)
...
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 4fd6f01..947ca34 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -695,11 +694,15 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, const char *base, const char *backingName, unsigned long bandwidth, - virDomainBlockJobInfoPtr info, qemuMonitorBlockJobCmd mode, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
+int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
Implementation of this function doesn't check for presence of the @info structure and dereferences it always. You should mark argument 3 nonnull too.
+ int qemuMonitorOpenGraphics(qemuMonitorPtr mon, const char *protocol, int fd,
...
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 62e7d5d..68ba084 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c
@@ -3733,54 +3735,66 @@ static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, _("entry was missing 'len'")); return -1; } - return 0; + return 1; }
-/** qemuMonitorJSONGetBlockJobInfo: - * Parse Block Job information. - * The reply is a JSON array of objects, one per active job. +/** + * qemuMonitorJSONBlockJobInfo: + * Parse Block Job information, and populate info for the named device. + * Return 1 if info available, 0 if device has no block job, and -1 on error. */ -static int qemuMonitorJSONGetBlockJobInfo(virJSONValuePtr reply, - const char *device, - virDomainBlockJobInfoPtr info) +int +qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) { + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; virJSONValuePtr data; int nr_results; size_t i; + int ret;
- if (!info) + cmd = qemuMonitorJSONMakeCommand("query-block-jobs", NULL); + if (!cmd) return -1; + ret = qemuMonitorJSONCommand(mon, cmd, &reply); + if (ret < 0) + goto cleanup;
if ((data = virJSONValueObjectGet(reply, "return")) == NULL) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("reply was missing return data")); - return -1; + goto cleanup; }
if (data->type != VIR_JSON_TYPE_ARRAY) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("unrecognized format of block job information")); - return -1; + goto cleanup; }
if ((nr_results = virJSONValueArraySize(data)) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("unable to determine array size")); - return -1; + goto cleanup; }
- for (i = 0; i < nr_results; i++) { + for (i = 0, ret = 0; i < nr_results && ret == 0; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("missing array element")); - return -1; + ret = -1; + goto cleanup; } - if (qemuMonitorJSONGetBlockJobInfoOne(entry, device, info) == 0) - return 1; + ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info);
This doesn't break the loop, thus if the device info isn't last in the structure you'd overwrite the return code. I presume the clients don't check that far but you've documented that semantics.
}
- return 0; + cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; }
...
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index d8c9308..d3f6f37 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -285,11 +285,15 @@ int qemuMonitorJSONBlockJob(qemuMonitorPtr mon, const char *base, const char *backingName, unsigned long long speed, - virDomainBlockJobInfoPtr info, qemuMonitorBlockJobCmd mode, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
+int qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
Again, @info is required, thus should be marked ATTRIBUTE_NONNULL.
+ int qemuMonitorJSONSetLink(qemuMonitorPtr mon, const char *name, virDomainNetInterfaceLinkState state);
Peter

On 09/04/2014 09:39 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
Another layer of overly-multiplexed code that deserves to be split into obviously separate paths for query vs. modify. This continues the cleanup started in the previous patch.
In the process, make some tweaks to simplify the logic when parsing the JSON reply. There should be no user-visible semantic changes.
+int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
Implementation of this function doesn't check for presence of the @info structure and dereferences it always. You should mark argument 3 nonnull too.
Good catch.
- for (i = 0; i < nr_results; i++) { + for (i = 0, ret = 0; i < nr_results && ret == 0; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("missing array element")); - return -1; + ret = -1; + goto cleanup; } - if (qemuMonitorJSONGetBlockJobInfoOne(entry, device, info) == 0) - return 1; + ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info);
This doesn't break the loop, thus if the device info isn't last in the structure you'd overwrite the return code. I presume the clients don't check that far but you've documented that semantics.
Actually, the loop DOES break, just on the next iteration. I added '&& ret == 0' to the loop continuation condition. So the semantics are: InfoOne fails and returns -1, the next iteration is aborted and overall ret is -1 InfoOne returns 0 because the requested device didn't match, so the loop continues to look at the next array member. If all array members are exhausted, ret remains 0 and overall return is 0 for job not found. InfoOne succeeds and returns 1, the next iteration is aborted and the overall ret is 1 for info populated. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 09/04/14 21:30, Eric Blake wrote:
On 09/04/2014 09:39 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
Another layer of overly-multiplexed code that deserves to be split into obviously separate paths for query vs. modify. This continues the cleanup started in the previous patch.
In the process, make some tweaks to simplify the logic when parsing the JSON reply. There should be no user-visible semantic changes.
+int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
Implementation of this function doesn't check for presence of the @info structure and dereferences it always. You should mark argument 3 nonnull too.
Good catch.
- for (i = 0; i < nr_results; i++) { + for (i = 0, ret = 0; i < nr_results && ret == 0; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("missing array element")); - return -1; + ret = -1; + goto cleanup; } - if (qemuMonitorJSONGetBlockJobInfoOne(entry, device, info) == 0) - return 1; + ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info);
This doesn't break the loop, thus if the device info isn't last in the structure you'd overwrite the return code. I presume the clients don't check that far but you've documented that semantics.
Actually, the loop DOES break, just on the next iteration. I added '&& ret == 0' to the loop continuation condition. So the semantics are:
InfoOne fails and returns -1, the next iteration is aborted and overall ret is -1
InfoOne returns 0 because the requested device didn't match, so the loop continues to look at the next array member. If all array members are exhausted, ret remains 0 and overall return is 0 for job not found.
InfoOne succeeds and returns 1, the next iteration is aborted and the overall ret is 1 for info populated.
I actually figured out that you probably changed the loop condition when I walked out of the office :D Unfortunately I wasn't around a computer to verify and correct myself. ACK to the original patch if you add ATTRIBUTE_NONNUL to the correct places. Peter

On 08/30/2014 10:02 PM, Eric Blake wrote:
Another layer of overly-multiplexed code that deserves to be split into obviously separate paths for query vs. modify. This continues the cleanup started in the previous patch.
In the process, make some tweaks to simplify the logic when parsing the JSON reply. There should be no user-visible semantic changes.
In addition to the ATTRIBUTE_NONNULL addition, I found a bug that needs fixing:
+int +qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, + const char *device, + virDomainBlockJobInfoPtr info) { + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; virJSONValuePtr data; int nr_results; size_t i; + int ret;
- if (!info) + cmd = qemuMonitorJSONMakeCommand("query-block-jobs", NULL); + if (!cmd) return -1; + ret = qemuMonitorJSONCommand(mon, cmd, &reply); + if (ret < 0) + goto cleanup;
If we haven't errored out yet, then ret is now 0...
if ((data = virJSONValueObjectGet(reply, "return")) == NULL) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("reply was missing return data")); - return -1; + goto cleanup;
...and we have changed the return code from -1 to 0 here. Oops. I'm squashing this in, before pushing: diff --git i/src/qemu/qemu_monitor_json.c w/src/qemu/qemu_monitor_json.c index 68ba084..2b58c78 100644 --- i/src/qemu/qemu_monitor_json.c +++ w/src/qemu/qemu_monitor_json.c @@ -3753,13 +3753,12 @@ qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, virJSONValuePtr data; int nr_results; size_t i; - int ret; + int ret = -1; cmd = qemuMonitorJSONMakeCommand("query-block-jobs", NULL); if (!cmd) return -1; - ret = qemuMonitorJSONCommand(mon, cmd, &reply); - if (ret < 0) + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) goto cleanup; if ((data = virJSONValueObjectGet(reply, "return")) == NULL) { @@ -3780,7 +3779,7 @@ qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, goto cleanup; } - for (i = 0, ret = 0; i < nr_results && ret == 0; i++) { + for (i = ret = 0; i < nr_results && ret == 0; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

qemu treats blockjob bandwidth as a 64-bit number, in the units of bytes/second. But we stupidly modeled block job bandwidth after migration bandwidth, which in turn was an 'unsigned long' and therefore subject to 32-bit vs. 64-bit interpretations, and with a scale of MiB/s. Our code already has to convert between the two scales, and report overflow as appropriate; although this conversion currently lives in the monitor code. On the bright side, our use of MiB/s means that even with a 32-bit unsigned long, we still have no problem representing a bandwidth of 2GiB/s, which is starting to be more feasible as 10-gigabit or even faster interfaces are used. And once you get past the physical speeds of existing interfaces, any larger bandwidth number behaves the same - effectively unlimited. But on the low side, the granularity of 1MiB/s tuning is rather coarse. So the new virDomainBlockJob API decided to go with a direct 64-bit bytes/sec number instead of the scaled number that prior blockjob APIs had used. But there is no point in rounding this number to MiB/s just to scale it back to bytes/s for handing to qemu. In order to make future code sharing possible between the old virDomainBlockRebase and the new virDomainBlockCopy, this patch moves the scaling and overflow detection into the driver code. Several of the block job calls that can set speed are fed through a common interface, so it was easier to adjust all block jobs at once, for consistency. * src/qemu/qemu_monitor.h (qemuMonitorBlockJob) (qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change parameter type and scale. * src/qemu/qemu_monitor.c (qemuMonitorBlockJob) (qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling and overflow detection... * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl) (qemuDomainBlockRebase, qemuDomainBlockCommit): ...here. (qemuDomainBlockCopy): Use bytes/sec. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 40 +++++++++++++++++++++++++++++--- src/qemu/qemu_monitor.c | 61 +++++++++++-------------------------------------- src/qemu/qemu_monitor.h | 6 ++--- 3 files changed, 53 insertions(+), 54 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f208ba0..e86ebf5 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14989,6 +14989,8 @@ qemuDomainBlockPivot(virConnectPtr conn, return ret; } + +/* bandwidth in MiB/s per public API */ static int qemuDomainBlockJobImpl(virDomainObjPtr vm, virConnectPtr conn, @@ -15011,6 +15013,7 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, char *backingPath = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); bool save = false; + unsigned long long speed = bandwidth; if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", @@ -15122,9 +15125,18 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, } } + /* Convert bandwidth MiB to bytes */ + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto endjob; + } + speed <<= 20; + qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, - bandwidth, mode, async); + speed, mode, async); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { if (mode == BLOCK_JOB_ABORT && disk->mirror) @@ -15324,12 +15336,14 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, BLOCK_JOB_SPEED, flags); } + +/* bandwidth in bytes/s */ static int qemuDomainBlockCopy(virDomainObjPtr vm, virConnectPtr conn, const char *path, const char *dest, const char *format, - unsigned long bandwidth, unsigned int flags) + unsigned long long bandwidth, unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; qemuDomainObjPrivatePtr priv; @@ -15512,6 +15526,7 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, virDomainObjPtr vm; const char *format = NULL; int ret = -1; + unsigned long long speed = bandwidth; virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15535,6 +15550,15 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) format = "raw"; + /* Convert bandwidth MiB to bytes */ + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto cleanup; + } + speed <<= 20; + /* XXX: If we are doing a shallow copy but not reusing an external * file, we should attempt to pre-create the destination with a * relative backing chain instead of qemu's default of absolute */ @@ -15600,6 +15624,7 @@ qemuDomainBlockCommit(virDomainPtr dom, char *basePath = NULL; char *backingPath = NULL; virStorageSourcePtr mirror = NULL; + unsigned long long speed = bandwidth; /* XXX Add support for COMMIT_DELETE */ virCheckFlags(VIR_DOMAIN_BLOCK_COMMIT_SHALLOW | @@ -15630,6 +15655,15 @@ qemuDomainBlockCommit(virDomainPtr dom, goto endjob; } + /* Convert bandwidth MiB to bytes */ + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto endjob; + } + speed <<= 20; + device = qemuDiskPathToAlias(vm, path, &idx); if (!device) goto endjob; @@ -15764,7 +15798,7 @@ qemuDomainBlockCommit(virDomainPtr dom, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorBlockCommit(priv->mon, device, topPath, basePath, backingPath, - bandwidth); + speed); qemuDomainObjExitMonitor(driver, vm); if (mirror) { diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 4493051..ef35e6a 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3178,33 +3178,21 @@ qemuMonitorDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, return ret; } -/* Start a drive-mirror block job. bandwidth is in MiB/sec. */ +/* Start a drive-mirror block job. bandwidth is in bytes/sec. */ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *device, const char *file, - const char *format, unsigned long bandwidth, + const char *format, unsigned long long bandwidth, unsigned int flags) { int ret = -1; - unsigned long long speed; - VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s, bandwidth=%ld, " + VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s, bandwidth=%lld, " "flags=%x", mon, device, file, NULLSTR(format), bandwidth, flags); - /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - return -1; - } - speed <<= 20; - if (mon->json) - ret = qemuMonitorJSONDriveMirror(mon, device, file, format, speed, + ret = qemuMonitorJSONDriveMirror(mon, device, file, format, bandwidth, flags); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", @@ -3228,33 +3216,22 @@ qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) return ret; } -/* Start a block-commit block job. bandwidth is in MiB/sec. */ +/* Start a block-commit block job. bandwidth is in bytes/sec. */ int qemuMonitorBlockCommit(qemuMonitorPtr mon, const char *device, const char *top, const char *base, const char *backingName, - unsigned long bandwidth) + unsigned long long bandwidth) { int ret = -1; - unsigned long long speed; - VIR_DEBUG("mon=%p, device=%s, top=%s, base=%s, backingName=%s, bandwidth=%lu", + VIR_DEBUG("mon=%p, device=%s, top=%s, base=%s, backingName=%s, " + "bandwidth=%llu", mon, device, top, base, NULLSTR(backingName), bandwidth); - /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - return -1; - } - speed <<= 20; - if (mon->json) ret = qemuMonitorJSONBlockCommit(mon, device, top, base, - backingName, speed); + backingName, bandwidth); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block-commit requires JSON monitor")); @@ -3359,38 +3336,26 @@ int qemuMonitorScreendump(qemuMonitorPtr mon, return ret; } -/* bandwidth is in MiB/sec */ +/* bandwidth is in bytes/sec */ int qemuMonitorBlockJob(qemuMonitorPtr mon, const char *device, const char *base, const char *backingName, - unsigned long bandwidth, + unsigned long long bandwidth, qemuMonitorBlockJobCmd mode, bool modern) { int ret = -1; - unsigned long long speed; - VIR_DEBUG("mon=%p, device=%s, base=%s, backingName=%s, bandwidth=%luM, " + VIR_DEBUG("mon=%p, device=%s, base=%s, backingName=%s, bandwidth=%lluB, " "mode=%o, modern=%d", mon, device, NULLSTR(base), NULLSTR(backingName), bandwidth, mode, modern); - /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - return -1; - } - speed <<= 20; - if (mon->json) ret = qemuMonitorJSONBlockJob(mon, device, base, backingName, - speed, mode, modern); + bandwidth, mode, modern); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block jobs require JSON monitor")); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 947ca34..35d9546 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -649,7 +649,7 @@ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *device, const char *file, const char *format, - unsigned long bandwidth, + unsigned long long bandwidth, unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorDrivePivot(qemuMonitorPtr mon, @@ -663,7 +663,7 @@ int qemuMonitorBlockCommit(qemuMonitorPtr mon, const char *top, const char *base, const char *backingName, - unsigned long bandwidth) + unsigned long long bandwidth) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4); bool qemuMonitorSupportsActiveCommit(qemuMonitorPtr mon); @@ -693,7 +693,7 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, const char *device, const char *base, const char *backingName, - unsigned long bandwidth, + unsigned long long bandwidth, qemuMonitorBlockJobCmd mode, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
qemu treats blockjob bandwidth as a 64-bit number, in the units of bytes/second. But we stupidly modeled block job bandwidth after migration bandwidth, which in turn was an 'unsigned long' and therefore subject to 32-bit vs. 64-bit interpretations, and with a scale of MiB/s. Our code already has to convert between the two scales, and report overflow as appropriate; although this conversion currently lives in the monitor code.
On the bright side, our use of MiB/s means that even with a 32-bit unsigned long, we still have no problem representing a bandwidth of 2GiB/s, which is starting to be more feasible as 10-gigabit or even faster interfaces are used. And once you get past the physical speeds of existing interfaces, any larger bandwidth number behaves the same - effectively unlimited. But on the low side, the granularity of 1MiB/s tuning is rather coarse. So the new virDomainBlockJob API decided to go with a direct 64-bit bytes/sec number instead of the scaled number that prior blockjob APIs had used. But there is no point in rounding this number to MiB/s just to scale it back to bytes/s for handing to qemu.
In order to make future code sharing possible between the old virDomainBlockRebase and the new virDomainBlockCopy, this patch moves the scaling and overflow detection into the driver code. Several of the block job calls that can set speed are fed through a common interface, so it was easier to adjust all block jobs at once, for consistency.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob) (qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change parameter type and scale. * src/qemu/qemu_monitor.c (qemuMonitorBlockJob) (qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling and overflow detection... * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl) (qemuDomainBlockRebase, qemuDomainBlockCommit): ...here. (qemuDomainBlockCopy): Use bytes/sec.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 40 +++++++++++++++++++++++++++++--- src/qemu/qemu_monitor.c | 61 +++++++++++-------------------------------------- src/qemu/qemu_monitor.h | 6 ++--- 3 files changed, 53 insertions(+), 54 deletions(-)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 4493051..ef35e6a 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3178,33 +3178,21 @@ qemuMonitorDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, return ret; }
-/* Start a drive-mirror block job. bandwidth is in MiB/sec. */ +/* Start a drive-mirror block job. bandwidth is in bytes/sec. */ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *device, const char *file, - const char *format, unsigned long bandwidth, + const char *format, unsigned long long bandwidth, unsigned int flags) { int ret = -1; - unsigned long long speed;
- VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s, bandwidth=%ld, " + VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s, bandwidth=%lld, " "flags=%x", mon, device, file, NULLSTR(format), bandwidth, flags);
- /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - return -1;
I started from bottom, so see at the end for a common comment ...
- } - speed <<= 20; - if (mon->json) - ret = qemuMonitorJSONDriveMirror(mon, device, file, format, speed, + ret = qemuMonitorJSONDriveMirror(mon, device, file, format, bandwidth, flags); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", @@ -3228,33 +3216,22 @@ qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) return ret; }
-/* Start a block-commit block job. bandwidth is in MiB/sec. */ +/* Start a block-commit block job. bandwidth is in bytes/sec. */ int qemuMonitorBlockCommit(qemuMonitorPtr mon, const char *device, const char *top, const char *base, const char *backingName, - unsigned long bandwidth) + unsigned long long bandwidth) { int ret = -1; - unsigned long long speed;
- VIR_DEBUG("mon=%p, device=%s, top=%s, base=%s, backingName=%s, bandwidth=%lu", + VIR_DEBUG("mon=%p, device=%s, top=%s, base=%s, backingName=%s, " + "bandwidth=%llu", mon, device, top, base, NULLSTR(backingName), bandwidth);
- /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20);
See below ...
- return -1; - } - speed <<= 20; - if (mon->json) ret = qemuMonitorJSONBlockCommit(mon, device, top, base, - backingName, speed); + backingName, bandwidth); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block-commit requires JSON monitor")); @@ -3359,38 +3336,26 @@ int qemuMonitorScreendump(qemuMonitorPtr mon, return ret; }
-/* bandwidth is in MiB/sec */ +/* bandwidth is in bytes/sec */ int qemuMonitorBlockJob(qemuMonitorPtr mon, const char *device, const char *base, const char *backingName, - unsigned long bandwidth, + unsigned long long bandwidth, qemuMonitorBlockJobCmd mode, bool modern) { int ret = -1; - unsigned long long speed;
- VIR_DEBUG("mon=%p, device=%s, base=%s, backingName=%s, bandwidth=%luM, " + VIR_DEBUG("mon=%p, device=%s, base=%s, backingName=%s, bandwidth=%lluB, " "mode=%o, modern=%d", mon, device, NULLSTR(base), NULLSTR(backingName), bandwidth, mode, modern);
- /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20);
I'd keep the check for if (speed > LLONG_MAX) here to be sure that we don't pass something between LLONG_MAX and ULLONG_MAX to qemu as it would be converted to signed by the monitor code.
- return -1; - } - speed <<= 20;
Of course this has to be dropped.
- if (mon->json) ret = qemuMonitorJSONBlockJob(mon, device, base, backingName, - speed, mode, modern); + bandwidth, mode, modern); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block jobs require JSON monitor"));
Possibly we could add a new conversion option to qemuMonitorJSONMakeCommand that would check and reject numbers between LLONG_MAX and ULLONG_MAX rather than converting them to signed silently... If you decide against the option above I ACK this with the bandwidth check added in the monitor APIs. Peter

On 09/04/2014 09:54 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
qemu treats blockjob bandwidth as a 64-bit number, in the units of bytes/second. But we stupidly modeled block job bandwidth after migration bandwidth, which in turn was an 'unsigned long' and therefore subject to 32-bit vs. 64-bit interpretations, and with a scale of MiB/s. Our code already has to convert between the two scales, and report overflow as appropriate; although this conversion currently lives in the monitor code.
- /* Convert bandwidth MiB to bytes - unfortunately the JSON QMP protocol is - * limited to LLONG_MAX also for unsigned values */ - speed = bandwidth; - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20);
I'd keep the check for if (speed > LLONG_MAX) here to be sure that we don't pass something between LLONG_MAX and ULLONG_MAX to qemu as it would be converted to signed by the monitor code.
The callers in qemu_driver.c are making the same check, so by this point, it is just a redundant safety check. I'm not sure it adds anything, since it is not arbitrary user input so much as protection against developer botches.
Possibly we could add a new conversion option to qemuMonitorJSONMakeCommand that would check and reject numbers between LLONG_MAX and ULLONG_MAX rather than converting them to signed silently...
That actually sounds interesting - a new code that requires a positive value within a signed value. I'll see what it looks like for v4. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The previous patch hoisted some bounds checks to the callers; but someone that is not aware of the hoisted check could now try passing an integer between LLONG_MAX and ULLONG_MAX. As a safety measure, add new json conversion modes that let libvirt error out early instead of pass bad numbers to qemu, if the caller ever makes a mistake due to later refactoring. Convert the various blockjob QMP calls to use the new modes, and switch some of them to be optional (QMP has always supported an omitted "speed" the same as "speed":0, for everything except block-job-set-speed). * src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw): Add 'j'/'y' and 'J'/'Y' to error out on negative input. (qemuMonitorJSONDriveMirror, qemuMonitorJSONBlockCommit) (qemuMonitorJSONBlockJob): Use it. Signed-off-by: Eric Blake <eblake@redhat.com> --- This should address Peter's review concerns on 5/18; I could push it either first or second (although I worded the commit message to push it second). src/qemu/qemu_monitor_json.c | 34 ++++++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 68ba084..13274ea 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -457,10 +457,14 @@ qemuMonitorJSONMakeCommandRaw(bool wrap, const char *cmdname, ...) * S: string value, omitted if null * * i: signed integer value + * j: signed integer value, error if negative * z: signed integer value, omitted if zero + * y: signed integer value, omitted if zero, error if negative * * I: signed long integer value + * J: signed long integer value, error if negative * Z: signed long integer value, omitted if zero + * Y: signed long integer value, omitted if zero, error if negative * * u: unsigned integer value * p: unsigned integer value, omitted if zero @@ -500,10 +504,19 @@ qemuMonitorJSONMakeCommandRaw(bool wrap, const char *cmdname, ...) } break; case 'z': + case 'y': + case 'j': case 'i': { int val = va_arg(args, int); - if (!val && type == 'z') + if (val < 0 && (type == 'j' || type == 'y')) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("argument key '%s' must not be negative"), + key); + goto error; + } + + if (!val && (type == 'z' || type == 'y')) continue; ret = virJSONValueObjectAppendNumberInt(jargs, key, val); @@ -520,10 +533,19 @@ qemuMonitorJSONMakeCommandRaw(bool wrap, const char *cmdname, ...) } break; case 'Z': + case 'Y': + case 'J': case 'I': { long long val = va_arg(args, long long); - if (!val && type == 'Z') + if (val < 0 && (type == 'J' || type == 'Y')) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("argument key '%s' must not be negative"), + key); + goto error; + } + + if (!val && (type == 'Z' || type == 'Y')) continue; ret = virJSONValueObjectAppendNumberLong(jargs, key, val); @@ -3408,7 +3430,7 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, cmd = qemuMonitorJSONMakeCommand("drive-mirror", "s:device", device, "s:target", file, - "U:speed", speed, + "Y:speed", speed, "s:sync", shallow ? "top" : "full", "s:mode", reuse ? "existing" : "absolute-paths", "S:format", format, @@ -3474,7 +3496,7 @@ qemuMonitorJSONBlockCommit(qemuMonitorPtr mon, const char *device, cmd = qemuMonitorJSONMakeCommand("block-commit", "s:device", device, - "U:speed", speed, + "Y:speed", speed, "S:top", top, "S:base", base, "S:backing-file", backingName, @@ -3850,7 +3872,7 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, cmd_name = modern ? "block-job-set-speed" : "block_job_set_speed"; cmd = qemuMonitorJSONMakeCommand(cmd_name, "s:device", device, - modern ? "U:speed" : "U:value", speed, + modern ? "J:speed" : "J:value", speed, NULL); break; @@ -3858,7 +3880,7 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, cmd_name = modern ? "block-stream" : "block_stream"; cmd = qemuMonitorJSONMakeCommand(cmd_name, "s:device", device, - "P:speed", speed, + "Y:speed", speed, "S:base", base, "S:backing-file", backingName, NULL); -- 1.9.3

On 09/05/14 13:36, Eric Blake wrote:
The previous patch hoisted some bounds checks to the callers; but someone that is not aware of the hoisted check could now try passing an integer between LLONG_MAX and ULLONG_MAX. As a safety measure, add new json conversion modes that let libvirt error out early instead of pass bad numbers to qemu, if the caller ever makes a mistake due to later refactoring.
Convert the various blockjob QMP calls to use the new modes, and switch some of them to be optional (QMP has always supported an omitted "speed" the same as "speed":0, for everything except block-job-set-speed).
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw): Add 'j'/'y' and 'J'/'Y' to error out on negative input. (qemuMonitorJSONDriveMirror, qemuMonitorJSONBlockCommit) (qemuMonitorJSONBlockJob): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com> ---
This should address Peter's review concerns on 5/18; I could push it either first or second (although I worded the commit message to push it second).
Either way I'm fine with it. I just wanted to make sure that we fail rather than pass garbage to qemu.
src/qemu/qemu_monitor_json.c | 34 ++++++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-)
Exactly what I had in mind. ACK. Peter

On 09/05/2014 06:00 AM, Peter Krempa wrote:
On 09/05/14 13:36, Eric Blake wrote:
The previous patch hoisted some bounds checks to the callers; but someone that is not aware of the hoisted check could now try passing an integer between LLONG_MAX and ULLONG_MAX. As a safety measure, add new json conversion modes that let libvirt error out early instead of pass bad numbers to qemu, if the caller ever makes a mistake due to later refactoring.
Exactly what I had in mind. ACK.
Thanks; I've pushed 4, 5, and 5.5 now (still double-checking the rest of the series, and will push as I go). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

While reviewing the new virDomainBlockCopy API, Peter Krempa pointed out that our existing design of using MiB/s for block job bandwidth is rather coarse, especially since qemu tracks it in bytes/s; so virDomainBlockCopy only accepts bytes/s. But once the new API is implemented for qemu, we will be in the situation where it is possible to set a value that cannot be accurately reflected back to the user, because the existing virDomainGetBlockJobInfo defaults to the coarser units. Fortunately, we have an escape hatch; and one that has already served us well in the past: we can use the flags argument to specify which scale to use (see virDomainBlockResize for prior art). This patch fixes the query side of the API; made easier by previous patches that split the query side out from the modification code. Later patches will address the virsh interface, as well retrofitting all other blockjob APIs to also accept a flag for toggling bandwidth units. * include/libvirt/libvirt.h.in (_virDomainBlockJobInfo) (VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues. (virDomainBlockJobInfoFlags): New enum. * src/libvirt.c (virDomainGetBlockJobInfo): Document new flag. * src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter. * src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo): Likewise. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo) (qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update callers. * src/qemu/qemu_driver.c (qemuDomainBlockPivot) (qemuDomainBlockJobImpl): Likewise. (qemuDomainGetBlockJobInfo): Likewise, and support new flag. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 38 +++++++++++++++++++++++++------------- src/libvirt.c | 8 ++++++-- src/qemu/qemu_driver.c | 20 ++++++++++++++++---- src/qemu/qemu_migration.c | 3 ++- src/qemu/qemu_monitor.c | 8 +++++--- src/qemu/qemu_monitor.h | 3 ++- src/qemu/qemu_monitor_json.c | 19 ++++++++++++------- src/qemu/qemu_monitor_json.h | 3 ++- 8 files changed, 70 insertions(+), 32 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index a64f597..6d0c042 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2585,13 +2585,23 @@ typedef enum { VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT = 1 << 1, } virDomainBlockJobAbortFlags; +int virDomainBlockJobAbort(virDomainPtr dom, const char *disk, + unsigned int flags); + +/* Flags for use with virDomainGetBlockJobInfo */ +typedef enum { + VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES = 1 << 0, /* bandwidth in bytes/s + instead of MiB/s */ +} virDomainBlockJobInfoFlags; + /* An iterator for monitoring block job operations */ typedef unsigned long long virDomainBlockJobCursor; typedef struct _virDomainBlockJobInfo virDomainBlockJobInfo; struct _virDomainBlockJobInfo { int type; /* virDomainBlockJobType */ - unsigned long bandwidth; + unsigned long bandwidth; /* either bytes/s or MiB/s, according to flags */ + /* * The following fields provide an indication of block job progress. @cur * indicates the current position and will be between 0 and @end. @end is @@ -2603,11 +2613,10 @@ struct _virDomainBlockJobInfo { }; typedef virDomainBlockJobInfo *virDomainBlockJobInfoPtr; -int virDomainBlockJobAbort(virDomainPtr dom, const char *disk, - unsigned int flags); -int virDomainGetBlockJobInfo(virDomainPtr dom, const char *disk, - virDomainBlockJobInfoPtr info, - unsigned int flags); +int virDomainGetBlockJobInfo(virDomainPtr dom, const char *disk, + virDomainBlockJobInfoPtr info, + unsigned int flags); + int virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, unsigned long bandwidth, unsigned int flags); @@ -2653,13 +2662,16 @@ typedef enum { * the maximum bandwidth in bytes/s, and is used while getting the * copy operation into the mirrored phase, with a type of ullong. For * compatibility with virDomainBlockJobSetSpeed(), values larger than - * 2^52 bytes/sec (a 32-bit MiB/s value) may be rejected due to - * overflow considerations, and hypervisors may further restrict the - * set of valid values. Specifying 0 is the same as omitting this - * parameter, to request no bandwidth limiting. Some hypervisors may - * lack support for this parameter, while still allowing a subsequent - * change of bandwidth via virDomainBlockJobSetSpeed(). The actual - * speed can be determined with virDomainGetBlockJobInfo(). + * 2^52 bytes/sec (a 32-bit MiB/s value) may be rejected on input due + * to overflow considerations (but do you really have an interface + * with that much bandwidth?), and values larger than 2^31 bytes/sec + * may cause overflow problems if queried in bytes/sec. Hypervisors + * may further restrict the set of valid values. Specifying 0 is the + * same as omitting this parameter, to request no bandwidth limiting. + * Some hypervisors may lack support for this parameter, while still + * allowing a subsequent change of bandwidth via + * virDomainBlockJobSetSpeed(). The actual speed can be determined + * with virDomainGetBlockJobInfo(). */ #define VIR_DOMAIN_BLOCK_COPY_BANDWIDTH "bandwidth" diff --git a/src/libvirt.c b/src/libvirt.c index 5d8f01c..b00ee16 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -19667,10 +19667,14 @@ virDomainBlockJobAbort(virDomainPtr dom, const char *disk, * @dom: pointer to domain object * @disk: path to the block device, or device shorthand * @info: pointer to a virDomainBlockJobInfo structure - * @flags: extra flags; not used yet, so callers should always pass 0 + * @flags: bitwise-OR of virDomainBlockJobInfoFlags * * Request block job information for the given disk. If an operation is active - * @info will be updated with the current progress. + * @info will be updated with the current progress. The units used for the + * bandwidth field of @info depends on @flags. If @flags includes + * VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES, bandwidth is in bytes/second + * (although this mode can risk failure due to overflow, depending on both + * client and server word size); otherwise, the value is rounded up to MiB/s. * * The @disk parameter is either an unambiguous source name of the * block device (the <source file='...'/> sub-element, such as diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e86ebf5..1ad682f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14857,7 +14857,7 @@ qemuDomainBlockPivot(virConnectPtr conn, /* Probe the status, if needed. */ if (!disk->mirrorState) { qemuDomainObjEnterMonitor(driver, vm); - rc = qemuMonitorBlockJobInfo(priv->mon, device, &info); + rc = qemuMonitorBlockJobInfo(priv->mon, device, &info, NULL); qemuDomainObjExitMonitor(driver, vm); if (rc < 0) goto cleanup; @@ -15180,7 +15180,7 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, virDomainBlockJobInfo dummy; qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorBlockJobInfo(priv->mon, device, &dummy); + ret = qemuMonitorBlockJobInfo(priv->mon, device, &dummy, NULL); qemuDomainObjExitMonitor(driver, vm); if (ret <= 0) @@ -15253,8 +15253,9 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, int idx; virDomainDiskDefPtr disk; int ret = -1; + unsigned long long bandwidth; - virCheckFlags(0, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES, -1); if (!(vm = qemuDomObjFromDomain(dom))) return -1; @@ -15287,7 +15288,7 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, disk = vm->def->disks[idx]; qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorBlockJobInfo(priv->mon, device, info); + ret = qemuMonitorBlockJobInfo(priv->mon, device, info, &bandwidth); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) goto endjob; @@ -15295,6 +15296,17 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) info->type = disk->mirrorJob; + if (bandwidth) { + if (!(flags & VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES)) + bandwidth = VIR_DIV_UP(bandwidth, 1024 * 1024); + info->bandwidth = bandwidth; + if (info->bandwidth != bandwidth) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth %llu cannot be represented in result"), + bandwidth); + goto endjob; + } + } /* Snoop block copy operations, so future cancel operations can * avoid checking if pivot is safe. Save the change to XML, but diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9885a16..6bdee4e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1307,7 +1307,8 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, _("canceled by client")); goto error; } - mon_ret = qemuMonitorBlockJobInfo(priv->mon, diskAlias, &info); + mon_ret = qemuMonitorBlockJobInfo(priv->mon, diskAlias, &info, + NULL); qemuDomainObjExitMonitor(driver, vm); if (mon_ret < 0) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index ef35e6a..d96b1b8 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3366,14 +3366,16 @@ qemuMonitorBlockJob(qemuMonitorPtr mon, int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, const char *device, - virDomainBlockJobInfoPtr info) + virDomainBlockJobInfoPtr info, + unsigned long long *bandwidth) { int ret = -1; - VIR_DEBUG("mon=%p, device=%s, info=%p", mon, device, info); + VIR_DEBUG("mon=%p, device=%s, info=%p, bandwidth=%p", + mon, device, info, bandwidth); if (mon->json) - ret = qemuMonitorJSONBlockJobInfo(mon, device, info); + ret = qemuMonitorJSONBlockJobInfo(mon, device, info, bandwidth); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("block jobs require JSON monitor")); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 35d9546..9012ad4 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -700,7 +700,8 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, const char *device, - virDomainBlockJobInfoPtr info) + virDomainBlockJobInfoPtr info, + unsigned long long *bandwidth) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); int qemuMonitorOpenGraphics(qemuMonitorPtr mon, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 68ba084..8130f92 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3684,15 +3684,18 @@ int qemuMonitorJSONScreendump(qemuMonitorPtr mon, return ret; } -/* Returns -1 on error, 0 if not the right device, 1 if info was populated. */ +/* Returns -1 on error, 0 if not the right device, 1 if info was + * populated. However, rather than populate info->bandwidth (which + * might overflow on 32-bit machines), bandwidth is tracked optionally + * on the side. */ static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, const char *device, - virDomainBlockJobInfoPtr info) + virDomainBlockJobInfoPtr info, + unsigned long long *bandwidth) { const char *this_dev; const char *type; - unsigned long long speed_bytes; if ((this_dev = virJSONValueObjectGetString(entry, "device")) == NULL) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -3717,12 +3720,12 @@ qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, else info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; - if (virJSONValueObjectGetNumberUlong(entry, "speed", &speed_bytes) < 0) { + if (bandwidth && + virJSONValueObjectGetNumberUlong(entry, "speed", bandwidth) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("entry was missing 'speed'")); return -1; } - info->bandwidth = speed_bytes / 1024ULL / 1024ULL; if (virJSONValueObjectGetNumberUlong(entry, "offset", &info->cur) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -3746,7 +3749,8 @@ qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, int qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, const char *device, - virDomainBlockJobInfoPtr info) + virDomainBlockJobInfoPtr info, + unsigned long long *bandwidth) { virJSONValuePtr cmd = NULL; virJSONValuePtr reply = NULL; @@ -3788,7 +3792,8 @@ qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, ret = -1; goto cleanup; } - ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info); + ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info, + bandwidth); } cleanup: diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index d3f6f37..4595eae 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -291,7 +291,8 @@ int qemuMonitorJSONBlockJob(qemuMonitorPtr mon, int qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, const char *device, - virDomainBlockJobInfoPtr info) + virDomainBlockJobInfoPtr info, + unsigned long long *bandwidth) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); int qemuMonitorJSONSetLink(qemuMonitorPtr mon, -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
While reviewing the new virDomainBlockCopy API, Peter Krempa pointed out that our existing design of using MiB/s for block job bandwidth is rather coarse, especially since qemu tracks it in bytes/s; so virDomainBlockCopy only accepts bytes/s. But once the new API is implemented for qemu, we will be in the situation where it is possible to set a value that cannot be accurately reflected back to the user, because the existing virDomainGetBlockJobInfo defaults to the coarser units.
Fortunately, we have an escape hatch; and one that has already served us well in the past: we can use the flags argument to specify which scale to use (see virDomainBlockResize for prior art). This patch fixes the query side of the API; made easier by previous patches that split the query side out from the modification code. Later patches will address the virsh interface, as well retrofitting all other blockjob APIs to also accept a flag for toggling bandwidth units.
* include/libvirt/libvirt.h.in (_virDomainBlockJobInfo) (VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues. (virDomainBlockJobInfoFlags): New enum. * src/libvirt.c (virDomainGetBlockJobInfo): Document new flag. * src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter. * src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo): Likewise. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo) (qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update callers. * src/qemu/qemu_driver.c (qemuDomainBlockPivot) (qemuDomainBlockJobImpl): Likewise. (qemuDomainGetBlockJobInfo): Likewise, and support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 38 +++++++++++++++++++++++++------------- src/libvirt.c | 8 ++++++-- src/qemu/qemu_driver.c | 20 ++++++++++++++++---- src/qemu/qemu_migration.c | 3 ++- src/qemu/qemu_monitor.c | 8 +++++--- src/qemu/qemu_monitor.h | 3 ++- src/qemu/qemu_monitor_json.c | 19 ++++++++++++------- src/qemu/qemu_monitor_json.h | 3 ++- 8 files changed, 70 insertions(+), 32 deletions(-)
ACK. I presume virsh will be modified later in the series. Peter

On 09/04/2014 10:11 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
While reviewing the new virDomainBlockCopy API, Peter Krempa pointed out that our existing design of using MiB/s for block job bandwidth is rather coarse, especially since qemu tracks it in bytes/s; so virDomainBlockCopy only accepts bytes/s. But once the new API is implemented for qemu, we will be in the situation where it is possible to set a value that cannot be accurately reflected back to the user, because the existing virDomainGetBlockJobInfo defaults to the coarser units.
ACK. I presume virsh will be modified later in the series.
Indeed. I'm still working on the virsh modifications for setting bandwidth in a different scale, but 9/18 was the modification for getting in smarter units. This one is now pushed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

I have plans to make future enhancements to the job list mode, which will be easier to do if the common blockJobImpl function is not mixing a query command with multiple modify commands. Besides, it just feels weird that all callers to blockJobImpl had to supply both a bandwidth input argument (unused for info mode) and an info output argument (unused for all other modes); not to mention I just made similar cleanups on the libvirtd side. The only reason blockJobImpl returned int was because of info mode returning -1/0/1 (all other job API are -1/0), so that can also be cleaned up. * tools/virsh-domain.c (blockJobImpl): Change signature and return value. Drop info handling. (cmdBlockJob): Handle info here. (cmdBlockCommit, cmdBlockCopy, cmdBlockPull): Adjust callers. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 97 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 58 insertions(+), 39 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index c75cd73..d1297ad 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1461,23 +1461,21 @@ cmdBlkiotune(vshControl * ctl, const vshCmd * cmd) } typedef enum { - VSH_CMD_BLOCK_JOB_ABORT = 0, - VSH_CMD_BLOCK_JOB_INFO = 1, - VSH_CMD_BLOCK_JOB_SPEED = 2, - VSH_CMD_BLOCK_JOB_PULL = 3, - VSH_CMD_BLOCK_JOB_COPY = 4, - VSH_CMD_BLOCK_JOB_COMMIT = 5, + VSH_CMD_BLOCK_JOB_ABORT, + VSH_CMD_BLOCK_JOB_SPEED, + VSH_CMD_BLOCK_JOB_PULL, + VSH_CMD_BLOCK_JOB_COPY, + VSH_CMD_BLOCK_JOB_COMMIT, } vshCmdBlockJobMode; -static int +static bool blockJobImpl(vshControl *ctl, const vshCmd *cmd, - virDomainBlockJobInfoPtr info, int mode, - virDomainPtr *pdom) + vshCmdBlockJobMode mode, virDomainPtr *pdom) { virDomainPtr dom = NULL; const char *path; unsigned long bandwidth = 0; - int ret = -1; + bool ret = false; const char *base = NULL; const char *top = NULL; unsigned int flags = 0; @@ -1493,19 +1491,18 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, goto cleanup; } - switch ((vshCmdBlockJobMode) mode) { + switch (mode) { case VSH_CMD_BLOCK_JOB_ABORT: if (vshCommandOptBool(cmd, "async")) flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; if (vshCommandOptBool(cmd, "pivot")) flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT; - ret = virDomainBlockJobAbort(dom, path, flags); - break; - case VSH_CMD_BLOCK_JOB_INFO: - ret = virDomainGetBlockJobInfo(dom, path, info, 0); + if (virDomainBlockJobAbort(dom, path, flags) < 0) + goto cleanup; break; case VSH_CMD_BLOCK_JOB_SPEED: - ret = virDomainBlockJobSetSpeed(dom, path, bandwidth, 0); + if (virDomainBlockJobSetSpeed(dom, path, bandwidth, 0) < 0) + goto cleanup; break; case VSH_CMD_BLOCK_JOB_PULL: if (vshCommandOptStringReq(ctl, cmd, "base", &base) < 0) @@ -1513,10 +1510,13 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, if (vshCommandOptBool(cmd, "keep-relative")) flags |= VIR_DOMAIN_BLOCK_REBASE_RELATIVE; - if (base || flags) - ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); - else - ret = virDomainBlockPull(dom, path, bandwidth, 0); + if (base || flags) { + if (virDomainBlockRebase(dom, path, base, bandwidth, flags) < 0) + goto cleanup; + } else { + if (virDomainBlockPull(dom, path, bandwidth, 0) < 0) + goto cleanup; + } break; case VSH_CMD_BLOCK_JOB_COMMIT: @@ -1533,7 +1533,8 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, flags |= VIR_DOMAIN_BLOCK_COMMIT_ACTIVE; if (vshCommandOptBool(cmd, "keep-relative")) flags |= VIR_DOMAIN_BLOCK_COMMIT_RELATIVE; - ret = virDomainBlockCommit(dom, path, base, top, bandwidth, flags); + if (virDomainBlockCommit(dom, path, base, top, bandwidth, flags) < 0) + goto cleanup; break; case VSH_CMD_BLOCK_JOB_COPY: flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; @@ -1545,11 +1546,15 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; if (vshCommandOptStringReq(ctl, cmd, "dest", &base) < 0) goto cleanup; - ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); + if (virDomainBlockRebase(dom, path, base, bandwidth, flags) < 0) + goto cleanup; + break; } + ret = true; + cleanup: - if (pdom && ret == 0) + if (pdom && ret) *pdom = dom; else if (dom) virDomainFree(dom); @@ -1721,7 +1726,7 @@ cmdBlockCommit(vshControl *ctl, const vshCmd *cmd) return false; } - if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_COMMIT, &dom) < 0) + if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COMMIT, &dom)) goto cleanup; if (!blocking) { @@ -1924,7 +1929,7 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; } - if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_COPY, &dom) < 0) + if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COPY, &dom)) goto cleanup; if (!blocking) { @@ -2069,14 +2074,17 @@ vshDomainBlockJobToString(int type) static bool cmdBlockJob(vshControl *ctl, const vshCmd *cmd) { - int mode; virDomainBlockJobInfo info; - int ret; + bool ret = false; + int rc; bool abortMode = (vshCommandOptBool(cmd, "abort") || vshCommandOptBool(cmd, "async") || vshCommandOptBool(cmd, "pivot")); bool infoMode = vshCommandOptBool(cmd, "info"); bool bandwidth = vshCommandOptBool(cmd, "bandwidth"); + virDomainPtr dom = NULL; + const char *path; + unsigned int flags = 0; if (abortMode + infoMode + bandwidth > 1) { vshError(ctl, "%s", @@ -2085,24 +2093,35 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) } if (abortMode) - mode = VSH_CMD_BLOCK_JOB_ABORT; - else if (bandwidth) - mode = VSH_CMD_BLOCK_JOB_SPEED; - else - mode = VSH_CMD_BLOCK_JOB_INFO; + return blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_ABORT, NULL); + if (bandwidth) + return blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_SPEED, NULL); - ret = blockJobImpl(ctl, cmd, &info, mode, NULL); - if (ret < 0) - return false; + /* Everything below here is for --info mode */ + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; - if (ret == 0 || mode != VSH_CMD_BLOCK_JOB_INFO) - return true; + /* XXX Allow path to be optional to list info on all devices at once */ + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + goto cleanup; + + rc = virDomainGetBlockJobInfo(dom, path, &info, flags); + if (rc < 0) + goto cleanup; + if (rc == 0) { + ret = true; + goto cleanup; + } vshPrintJobProgress(vshDomainBlockJobToString(info.type), info.end - info.cur, info.end); if (info.bandwidth != 0) vshPrint(ctl, _(" Bandwidth limit: %lu MiB/s\n"), info.bandwidth); - return true; + ret = true; + cleanup: + if (dom) + virDomainFree(dom); + return ret; } /* @@ -2201,7 +2220,7 @@ cmdBlockPull(vshControl *ctl, const vshCmd *cmd) return false; } - if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_PULL, &dom) < 0) + if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_PULL, &dom)) goto cleanup; if (!blocking) { -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
I have plans to make future enhancements to the job list mode, which will be easier to do if the common blockJobImpl function is not mixing a query command with multiple modify commands. Besides, it just feels weird that all callers to blockJobImpl had to supply both a bandwidth input argument (unused for info mode) and an info output argument (unused for all other modes); not to mention I just made similar cleanups on the libvirtd side.
The only reason blockJobImpl returned int was because of info mode returning -1/0/1 (all other job API are -1/0), so that can also be cleaned up.
* tools/virsh-domain.c (blockJobImpl): Change signature and return value. Drop info handling. (cmdBlockJob): Handle info here. (cmdBlockCommit, cmdBlockCopy, cmdBlockPull): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 97 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 58 insertions(+), 39 deletions(-)
ACK, Peter

The current output of 'blockjob [--info]' is a single line designed for human consumption; it's not very nice for machine parsing. Furthermore, I have plans to modify the line in response to the new flag for controlling bandwidth units. Solve that by adding a --raw parameter, which outputs information closer to the C struct. $ virsh blockjob testvm1 vda --raw type=Block Copy bandwidth=1 cur=197120 end=197120 The information is indented, because I'd like for a later patch to add a mode that iterates over all the vm's disks with status for each; in that mode, each block name would be listed unindented before information (if any) about that block. Now that we have a raw mode, we can guarantee that it won't change format over time. Any app that cares about parsing the output can try --raw, and if it fails, know that it was talking to an older virsh and fall back to parsing the human-readable format which had not changed until now; meanwhile, when not using --raw, we have freed future virsh to change the output to whatever makes sense. My first change to human mode: this command now guarantees a line is printed on successful use of the API, even when the API did not find a current block job (consistent with the rest of virsh). Bonus: https://bugzilla.redhat.com/show_bug.cgi?id=1135441 complained that this message was confusing: $ virsh blockjob test1 hda --async --bandwidth 10 error: conflict between --abort, --info, and --bandwidth modes even though the man page already documents that --async implies abort mode, all because '--abort' wasn't present in the command line. Since I'm adding another case where options are tied to or imply a mode, I changed that error to: error: conflict between abort, info, and bandwidth modes * tools/virsh-domain.c (cmdBlockJob): Add --raw parameter; tweak error wording. * tools/virsh.pod (blockjob): Document it. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 31 +++++++++++++++++++++++-------- tools/virsh.pod | 19 ++++++++++++------- 2 files changed, 35 insertions(+), 15 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index d1297ad..b95e971 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -2038,16 +2038,20 @@ static const vshCmdOptDef opts_block_job[] = { }, {.name = "async", .type = VSH_OT_BOOL, - .help = N_("don't wait for --abort to complete") + .help = N_("implies --abort; request but don't wait for job end") }, {.name = "pivot", .type = VSH_OT_BOOL, - .help = N_("conclude and pivot a copy job") + .help = N_("implies --abort; conclude and pivot a copy job") }, {.name = "info", .type = VSH_OT_BOOL, .help = N_("get active job information for the specified disk") }, + {.name = "raw", + .type = VSH_OT_BOOL, + .help = N_("implies --info; output details rather than human summary") + }, {.name = "bandwidth", .type = VSH_OT_DATA, .help = N_("set the Bandwidth limit in MiB/s") @@ -2077,10 +2081,11 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) virDomainBlockJobInfo info; bool ret = false; int rc; + bool raw = vshCommandOptBool(cmd, "raw"); bool abortMode = (vshCommandOptBool(cmd, "abort") || vshCommandOptBool(cmd, "async") || vshCommandOptBool(cmd, "pivot")); - bool infoMode = vshCommandOptBool(cmd, "info"); + bool infoMode = vshCommandOptBool(cmd, "info") || raw; bool bandwidth = vshCommandOptBool(cmd, "bandwidth"); virDomainPtr dom = NULL; const char *path; @@ -2088,7 +2093,7 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (abortMode + infoMode + bandwidth > 1) { vshError(ctl, "%s", - _("conflict between --abort, --info, and --bandwidth modes")); + _("conflict between abort, info, and bandwidth modes")); return false; } @@ -2109,14 +2114,24 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (rc < 0) goto cleanup; if (rc == 0) { + if (!raw) + vshPrint(ctl, _("No current block job for %s"), path); ret = true; goto cleanup; } - vshPrintJobProgress(vshDomainBlockJobToString(info.type), - info.end - info.cur, info.end); - if (info.bandwidth != 0) - vshPrint(ctl, _(" Bandwidth limit: %lu MiB/s\n"), info.bandwidth); + if (raw) { + vshPrint(ctl, _(" type=%s\n bandwidth=%lu\n cur=%llu\n end=%llu\n"), + vshDomainBlockJobTypeToString(info.type), + info.bandwidth, info.cur, info.end); + } else { + vshPrintJobProgress(vshDomainBlockJobToString(info.type), + info.end - info.cur, info.end); + if (info.bandwidth != 0) + vshPrint(ctl, _(" Bandwidth limit: %lu MiB/s"), + info.bandwidth); + vshPrint(ctl, "\n"); + } ret = true; cleanup: if (dom) diff --git a/tools/virsh.pod b/tools/virsh.pod index ea9267e..3f1bf7e 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1013,24 +1013,29 @@ exclusive. If no flag is specified, behavior is different depending on hypervisor. =item B<blockjob> I<domain> I<path> { [I<--abort>] [I<--async>] [I<--pivot>] | -[I<--info>] | [I<bandwidth>] } +[I<--info>] [I<--raw>] | [I<bandwidth>] } -Manage active block operations. There are three modes: I<--info>, -I<bandwidth>, and I<--abort>; I<--info> is default except that I<--async> -or I<--pivot> implies I<--abort>. +Manage active block operations. There are three mutually-exclusive modes: +I<--info>, I<bandwidth>, and I<--abort>. I<--async> and I<--pivot> imply +abort mode; I<--raw> implies info mode; and if no mode was given, I<--info> +mode is assumed. I<path> specifies fully-qualified path of the disk; it corresponds to a unique target name (<target dev='name'/>) or source file (<source file='name'/>) for one of the disk devices attached to I<domain> (see also B<domblklist> for listing these names). -If I<--abort> is specified, the active job on the specified disk will +In I<--abort> mode, the active job on the specified disk will be aborted. If I<--async> is also specified, this command will return immediately, rather than waiting for the cancellation to complete. If I<--pivot> is specified, this requests that an active copy or active commit job be pivoted over to the new image. -If I<--info> is specified, the active job information on the specified -disk will be printed. + +In I<--info> mode, the active job information on the specified +disk will be printed. By default, the output is a single human-readable +summary line; this format may change in future versions. Adding +I<--raw> lists each field of the struct, in a stable format. + I<bandwidth> can be used to set bandwidth limit for the active job. Specifying a negative value is interpreted as an unsigned long long value or essentially unlimited. The hypervisor can choose whether to -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
The current output of 'blockjob [--info]' is a single line designed for human consumption; it's not very nice for machine parsing. Furthermore, I have plans to modify the line in response to the new flag for controlling bandwidth units. Solve that by adding a --raw parameter, which outputs information closer to the C struct.
$ virsh blockjob testvm1 vda --raw type=Block Copy bandwidth=1 cur=197120 end=197120
The information is indented, because I'd like for a later patch to add a mode that iterates over all the vm's disks with status for each; in that mode, each block name would be listed unindented before information (if any) about that block.
Now that we have a raw mode, we can guarantee that it won't change format over time. Any app that cares about parsing the output can try --raw, and if it fails, know that it was talking to an older virsh and fall back to parsing the human-readable format which had not changed until now; meanwhile, when not using --raw, we have freed future virsh to change the output to whatever makes sense.
My first change to human mode: this command now guarantees a line is printed on successful use of the API, even when the API did not find a current block job (consistent with the rest of virsh).
Bonus: https://bugzilla.redhat.com/show_bug.cgi?id=1135441 complained that this message was confusing:
$ virsh blockjob test1 hda --async --bandwidth 10 error: conflict between --abort, --info, and --bandwidth modes
even though the man page already documents that --async implies abort mode, all because '--abort' wasn't present in the command line. Since I'm adding another case where options are tied to or imply a mode, I changed that error to:
error: conflict between abort, info, and bandwidth modes
* tools/virsh-domain.c (cmdBlockJob): Add --raw parameter; tweak error wording. * tools/virsh.pod (blockjob): Document it.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 31 +++++++++++++++++++++++-------- tools/virsh.pod | 19 ++++++++++++------- 2 files changed, 35 insertions(+), 15 deletions(-)
ACK, Peter

On 09/04/2014 10:23 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
The current output of 'blockjob [--info]' is a single line designed for human consumption; it's not very nice for machine parsing. Furthermore, I have plans to modify the line in response to the new flag for controlling bandwidth units. Solve that by adding a --raw parameter, which outputs information closer to the C struct.
$ virsh blockjob testvm1 vda --raw type=Block Copy bandwidth=1 cur=197120 end=197120
The information is indented, because I'd like for a later patch to add a mode that iterates over all the vm's disks with status for each; in that mode, each block name would be listed unindented before information (if any) about that block.
I haven't actually finished adding a loop operation yet, but hope to have it in v4.
* tools/virsh-domain.c (cmdBlockJob): Add --raw parameter; tweak error wording. * tools/virsh.pod (blockjob): Document it.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 31 +++++++++++++++++++++++-------- tools/virsh.pod | 19 ++++++++++++------- 2 files changed, 35 insertions(+), 15 deletions(-)
ACK,
Pushed after fixing an accidental double space. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Expose the new flag just added to virDomainGetBlockJobInfo. With --raw, the presence or absence of --bytes determines which flag to use in the single API call. Without --raw, the use of --bytes forces an error if the server doesn't support it, otherwise, the code tries to silently fall back to scaling the MiB/s value. My goal is to eventually also support --bytes in bandwidth mode; but that's a bit further down the road (and needs a new API flag added in libvirt.h first). This changes the human output, but the previous patch added raw output precisely so that we can have flexibility with the human output. For this commit, I used qemu-monitor-command to force an unusual bandwidth, but the same will be possible once qemu implements virDomainBlockCopy: Before: Block Copy: [100 %] Bandwidth limit: 2 MiB/s After: Block Copy: [100 %] Bandwidth limit: 1048577 bytes/s (1.000 MiB/s) The cache avoids having to repeatedly attempt a flag probe when talking to an older server, when multiple blockjob commands are issued during a batch session and the user is manually polling for job completion. * tools/virsh.h (_vshControl): Add a cache. * tools/virsh.c (cmdConnect, vshReconnect): Initialize the cache. * tools/virsh-domain.c (opts_block_job): Add --bytes. * tools/virsh.pod (blockjob): Document this. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++------ tools/virsh.c | 2 ++ tools/virsh.h | 2 ++ tools/virsh.pod | 8 +++++-- 4 files changed, 67 insertions(+), 9 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index b95e971..eb3ed15 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -2048,6 +2048,10 @@ static const vshCmdOptDef opts_block_job[] = { .type = VSH_OT_BOOL, .help = N_("get active job information for the specified disk") }, + {.name = "bytes", + .type = VSH_OT_BOOL, + .help = N_("with --info, get bandwidth in bytes rather than MiB/s") + }, {.name = "raw", .type = VSH_OT_BOOL, .help = N_("implies --info; output details rather than human summary") @@ -2080,8 +2084,9 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) { virDomainBlockJobInfo info; bool ret = false; - int rc; + int rc = -1; bool raw = vshCommandOptBool(cmd, "raw"); + bool bytes = vshCommandOptBool(cmd, "bytes"); bool abortMode = (vshCommandOptBool(cmd, "abort") || vshCommandOptBool(cmd, "async") || vshCommandOptBool(cmd, "pivot")); @@ -2090,12 +2095,18 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) virDomainPtr dom = NULL; const char *path; unsigned int flags = 0; + unsigned long long speed; if (abortMode + infoMode + bandwidth > 1) { vshError(ctl, "%s", _("conflict between abort, info, and bandwidth modes")); return false; } + /* XXX also support --bytes with bandwidth mode */ + if (bytes && (abortMode || bandwidth)) { + vshError(ctl, "%s", _("--bytes requires info mode")); + return false; + } if (abortMode) return blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_ABORT, NULL); @@ -2110,9 +2121,45 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) goto cleanup; - rc = virDomainGetBlockJobInfo(dom, path, &info, flags); - if (rc < 0) - goto cleanup; + /* If bytes were requested, or if raw mode is not forcing MiB/s + * query and cache can't prove failure, then query bytes/sec. */ + if (bytes || !(raw || ctl->blockJobNoBytes)) { + flags |= VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES; + rc = virDomainGetBlockJobInfo(dom, path, &info, flags); + if (rc < 0) { + /* Check for particular errors, let all the rest be fatal. */ + switch (last_error->code) { + case VIR_ERR_INVALID_ARG: + ctl->blockJobNoBytes = true; + /* fallthrough */ + case VIR_ERR_OVERFLOW: + if (!bytes && !raw) { + /* try again with MiB/s, unless forcing bytes */ + vshResetLibvirtError(); + break; + } + /* fallthrough */ + default: + goto cleanup; + } + } + speed = info.bandwidth; + } + /* If we don't already have a query result, query for MiB/s */ + if (rc < 0) { + flags &= ~VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES; + if ((rc = virDomainGetBlockJobInfo(dom, path, &info, flags)) < 0) + goto cleanup; + speed = info.bandwidth; + /* Scale to bytes/s unless in raw mode */ + if (!raw) { + speed <<= 20; + if (speed >> 20 != info.bandwidth) + vshError(ctl, _("overflow in converting %ld MiB/s to bytes\n"), + info.bandwidth); + } + } + if (rc == 0) { if (!raw) vshPrint(ctl, _("No current block job for %s"), path); @@ -2127,9 +2174,12 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) } else { vshPrintJobProgress(vshDomainBlockJobToString(info.type), info.end - info.cur, info.end); - if (info.bandwidth != 0) - vshPrint(ctl, _(" Bandwidth limit: %lu MiB/s"), - info.bandwidth); + if (speed) { + const char *unit; + double val = vshPrettyCapacity(speed, &unit); + vshPrint(ctl, _(" Bandwidth limit: %llu bytes/s (%-.3lf %s/s)"), + speed, val, unit); + } vshPrint(ctl, "\n"); } ret = true; diff --git a/tools/virsh.c b/tools/virsh.c index 713c9a5..9706acc 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -397,6 +397,7 @@ vshReconnect(vshControl *ctl) disconnected = 0; ctl->useGetInfo = false; ctl->useSnapshotOld = false; + ctl->blockJobNoBytes = false; } @@ -454,6 +455,7 @@ cmdConnect(vshControl *ctl, const vshCmd *cmd) ctl->useGetInfo = false; ctl->useSnapshotOld = false; + ctl->blockJobNoBytes = false; ctl->readonly = ro; ctl->conn = vshConnect(ctl, ctl->name, ctl->readonly); diff --git a/tools/virsh.h b/tools/virsh.h index b4df24b..7d5d8a2 100644 --- a/tools/virsh.h +++ b/tools/virsh.h @@ -238,6 +238,8 @@ struct _vshControl { virDomainGetState is not supported */ bool useSnapshotOld; /* cannot use virDomainSnapshotGetParent or virDomainSnapshotNumChildren */ + bool blockJobNoBytes; /* true if _BANDWIDTH_BYTE blockjob flags + are missing */ virThread eventLoop; virMutex lock; bool eventLoopStarted; diff --git a/tools/virsh.pod b/tools/virsh.pod index 3f1bf7e..406de5c 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1013,7 +1013,7 @@ exclusive. If no flag is specified, behavior is different depending on hypervisor. =item B<blockjob> I<domain> I<path> { [I<--abort>] [I<--async>] [I<--pivot>] | -[I<--info>] [I<--raw>] | [I<bandwidth>] } +[I<--info>] [I<--raw>] [I<--bytes>] | [I<bandwidth>] } Manage active block operations. There are three mutually-exclusive modes: I<--info>, I<bandwidth>, and I<--abort>. I<--async> and I<--pivot> imply @@ -1034,7 +1034,11 @@ commit job be pivoted over to the new image. In I<--info> mode, the active job information on the specified disk will be printed. By default, the output is a single human-readable summary line; this format may change in future versions. Adding -I<--raw> lists each field of the struct, in a stable format. +I<--raw> lists each field of the struct, in a stable format. If the +I<--bytes> flag is set, then the command errors out if the server could +not supply bytes/s resolution; when omitting the flag, raw output is +listed in MiB/s and human-readable output automatically selects the +best resolution supported by the server. I<bandwidth> can be used to set bandwidth limit for the active job. Specifying a negative value is interpreted as an unsigned long long -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
Expose the new flag just added to virDomainGetBlockJobInfo. With --raw, the presence or absence of --bytes determines which flag to use in the single API call. Without --raw, the use of --bytes forces an error if the server doesn't support it, otherwise, the code tries to silently fall back to scaling the MiB/s value.
My goal is to eventually also support --bytes in bandwidth mode; but that's a bit further down the road (and needs a new API flag added in libvirt.h first).
This changes the human output, but the previous patch added raw output precisely so that we can have flexibility with the human output. For this commit, I used qemu-monitor-command to force an unusual bandwidth, but the same will be possible once qemu implements virDomainBlockCopy:
Before: Block Copy: [100 %] Bandwidth limit: 2 MiB/s After: Block Copy: [100 %] Bandwidth limit: 1048577 bytes/s (1.000 MiB/s)
The cache avoids having to repeatedly attempt a flag probe when talking to an older server, when multiple blockjob commands are issued during a batch session and the user is manually polling for job completion.
* tools/virsh.h (_vshControl): Add a cache. * tools/virsh.c (cmdConnect, vshReconnect): Initialize the cache. * tools/virsh-domain.c (opts_block_job): Add --bytes. * tools/virsh.pod (blockjob): Document this.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++------ tools/virsh.c | 2 ++ tools/virsh.h | 2 ++ tools/virsh.pod | 8 +++++-- 4 files changed, 67 insertions(+), 9 deletions(-)
ACK Peter

On 09/05/2014 03:05 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
Expose the new flag just added to virDomainGetBlockJobInfo. With --raw, the presence or absence of --bytes determines which flag to use in the single API call. Without --raw, the use of --bytes forces an error if the server doesn't support it, otherwise, the code tries to silently fall back to scaling the MiB/s value.
My goal is to eventually also support --bytes in bandwidth mode; but that's a bit further down the road (and needs a new API flag added in libvirt.h first).
the new flag hinted at is patch 14/14; I'm still working on the code for a v4 patch for actually setting in byte mode.
ACK
Pushed with a tweak I found during self-review: diff --git i/tools/virsh-domain.c w/tools/virsh-domain.c index 749fa67..2bb3bbc 100644 --- i/tools/virsh-domain.c +++ w/tools/virsh-domain.c @@ -2121,7 +2121,7 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) goto cleanup; - /* If bytes were requested, or if raw mode is not forcing MiB/s + /* If bytes were requested, or if raw mode is not forcing a MiB/s * query and cache can't prove failure, then query bytes/sec. */ if (bytes || !(raw || ctl->blockJobNoBytes)) { flags |= VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES; @@ -2154,9 +2154,11 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) /* Scale to bytes/s unless in raw mode */ if (!raw) { speed <<= 20; - if (speed >> 20 != info.bandwidth) + if (speed >> 20 != info.bandwidth) { vshError(ctl, _("overflow in converting %ld MiB/s to bytes\n"), info.bandwidth); + goto cleanup; + } } } -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

To date, anyone performing a block copy and pivot ends up with the destination being treated as <disk type='file'>. While this works for data access for a block device, it has at least one noticeable shortcoming: virDomainGetBlockInfo() reports allocation differently for block devices visited as files (the size of the device) than for block devices visited as <disk type='block'> (the maximum sector used, as reported by qemu); and this difference is significant when trying to manage qcow2 format on block devices that can be grown as needed. Of course, the more powerful virDomainBlockCopy() API can already express the ability to set the <disk> type. But a new API can't be backported, while a new flag to an existing API can; and it is also rather inconvenient to have to resort to the full power of generating XML when just adding a flag to the older call will do the trick. So this patch enhances blockcopy to let the user flag when the resulting XML after the copy must list the device as type='block'. * include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV): New flag. * src/libvirt.c (virDomainBlockRebase): Document it. * tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add --blockdev option. * tools/virsh.pod (blockcopy): Document it. * src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag. (qemuDomainBlockCopy): Remember the flag, and make sure it is only used on actual block devices. * tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 2 ++ src/libvirt.c | 8 +++-- src/qemu/qemu_driver.c | 36 ++++++++++++++-------- .../qemuxml2argvdata/qemuxml2argv-disk-mirror.xml | 4 +-- tools/virsh-domain.c | 6 ++++ tools/virsh.pod | 7 +++-- 6 files changed, 45 insertions(+), 18 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 6d0c042..aced31c 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2638,6 +2638,8 @@ typedef enum { VIR_DOMAIN_BLOCK_REBASE_RELATIVE = 1 << 4, /* Keep backing chain referenced using relative names */ + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV = 1 << 5, /* Treat destination as block + device instead of file */ } virDomainBlockRebaseFlags; int virDomainBlockRebase(virDomainPtr dom, const char *disk, diff --git a/src/libvirt.c b/src/libvirt.c index b00ee16..4806535 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -19885,7 +19885,10 @@ virDomainBlockPull(virDomainPtr dom, const char *disk, * pre-create files with relative backing file names, rather than the default * of absolute backing file names; as a security precaution, you should * generally only use reuse_ext with the shallow flag and a non-raw - * destination file. + * destination file. By default, the copy destination will be treated as + * type='file', but using VIR_DOMAIN_BLOCK_REBASE_COPY_DEV treats the + * destination as type='block' (affecting how virDomainGetBlockInfo() will + * report allocation after pivoting). * * A copy job has two parts; in the first phase, the @bandwidth parameter * affects how fast the source is pulled into the destination, and the job @@ -19960,7 +19963,8 @@ virDomainBlockRebase(virDomainPtr dom, const char *disk, virCheckNonNullArgGoto(base, error); } else if (flags & (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)) { + VIR_DOMAIN_BLOCK_REBASE_COPY_RAW | + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV)) { virReportInvalidArg(flags, _("use of flags in %s requires a copy job"), __FUNCTION__); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1ad682f..946c384 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15370,7 +15370,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, /* Preliminaries: find the disk we are editing, sanity checks */ virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, -1); + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV, -1); priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); @@ -15431,25 +15432,34 @@ qemuDomainBlockCopy(virDomainObjPtr vm, virReportSystemError(errno, _("unable to stat for disk %s: %s"), disk->dst, dest); goto endjob; - } else if (flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) { + } else if (flags & (VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV)) { virReportSystemError(errno, _("missing destination file for disk %s: %s"), disk->dst, dest); goto endjob; } - } else if (!S_ISBLK(st.st_mode) && st.st_size && - !(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("external destination file for disk %s already " - "exists and is not a block device: %s"), - disk->dst, dest); - goto endjob; + } else if (!S_ISBLK(st.st_mode)) { + if (st.st_size && !(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("external destination file for disk %s already " + "exists and is not a block device: %s"), + disk->dst, dest); + goto endjob; + } + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV) { + virReportError(VIR_ERR_INVALID_ARG, + _("blockdev flag requested for disk %s, but file " + "'%s' is not a block device"), disk->dst, dest); + goto endjob; + } } if (VIR_ALLOC(mirror) < 0) goto endjob; /* XXX Allow non-file mirror destinations */ - mirror->type = VIR_STORAGE_TYPE_FILE; + mirror->type = flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV ? + VIR_STORAGE_TYPE_BLOCK : VIR_STORAGE_TYPE_FILE; if (format) { if ((mirror->format = virStorageFileFormatTypeFromString(format)) <= 0) { @@ -15544,7 +15554,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | VIR_DOMAIN_BLOCK_REBASE_COPY | VIR_DOMAIN_BLOCK_REBASE_COPY_RAW | - VIR_DOMAIN_BLOCK_REBASE_RELATIVE, -1); + VIR_DOMAIN_BLOCK_REBASE_RELATIVE | + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV, -1); if (!(vm = qemuDomObjFromDomain(dom))) return -1; @@ -15581,7 +15592,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, } flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV); ret = qemuDomainBlockCopy(vm, dom->conn, path, base, format, bandwidth, flags); vm = NULL; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml b/tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml index 46f2a3e..7495a45 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml +++ b/tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml @@ -17,8 +17,8 @@ <disk type='block' device='disk'> <source dev='/dev/HostVG/QEMUGuest1'/> <backingStore/> - <mirror type='file' file='/dev/HostVG/QEMUGuest1Copy' job='copy' ready='yes'> - <source file='/dev/HostVG/QEMUGuest1Copy'/> + <mirror type='block' job='copy' ready='yes'> + <source dev='/dev/HostVG/QEMUGuest1Copy'/> </mirror> <target dev='hda' bus='ide'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index eb3ed15..813d8e0 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1544,6 +1544,8 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; if (vshCommandOptBool(cmd, "raw")) flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; if (vshCommandOptStringReq(ctl, cmd, "dest", &base) < 0) goto cleanup; if (virDomainBlockRebase(dom, path, base, bandwidth, flags) < 0) @@ -1855,6 +1857,10 @@ static const vshCmdOptDef opts_block_copy[] = { .type = VSH_OT_BOOL, .help = N_("use raw destination file") }, + {.name = "blockdev", + .type = VSH_OT_BOOL, + .help = N_("copy destination is block device instead of regular file") + }, {.name = "wait", .type = VSH_OT_BOOL, .help = N_("wait for job to reach mirroring phase") diff --git a/tools/virsh.pod b/tools/virsh.pod index 406de5c..2923fd9 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -899,7 +899,8 @@ unlimited. The hypervisor can choose whether to reject the value or convert it to the maximum value allowed. =item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] [I<--raw>] [I<--wait> [I<--async>] [I<--verbose>]] +[I<--reuse-external>] [I<--raw>] [I<--blockdev>] +[I<--wait> [I<--async>] [I<--verbose>]] [{I<--pivot> | I<--finish>}] [I<--timeout> B<seconds>] Copy a disk backing image chain to I<dest>. By default, this command @@ -917,7 +918,9 @@ The format of the destination is determined by the first match in the following list: if I<--raw> is specified, it will be raw; if I<--reuse-external> is specified, the existing destination is probed for a format; and in all other cases, the destination format will -match the source format. +match the source format. The destination is treated as a regular +file unless I<--blockdev> is used to signal that it is a block +device. By default, the copy job runs in the background, and consists of two phases. Initially, the job must copy all data from the source, and -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
To date, anyone performing a block copy and pivot ends up with the destination being treated as <disk type='file'>. While this works for data access for a block device, it has at least one noticeable shortcoming: virDomainGetBlockInfo() reports allocation differently for block devices visited as files (the size of the device) than for block devices visited as <disk type='block'> (the maximum sector used, as reported by qemu); and this difference is significant when trying to manage qcow2 format on block devices that can be grown as needed.
Of course, the more powerful virDomainBlockCopy() API can already express the ability to set the <disk> type. But a new API can't be backported, while a new flag to an existing API can; and it is also rather inconvenient to have to resort to the full power of generating XML when just adding a flag to the older call will do the trick. So this patch enhances blockcopy to let the user flag when the resulting XML after the copy must list the device as type='block'.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV): New flag. * src/libvirt.c (virDomainBlockRebase): Document it. * tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add --blockdev option. * tools/virsh.pod (blockcopy): Document it. * src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag. (qemuDomainBlockCopy): Remember the flag, and make sure it is only used on actual block devices. * tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 2 ++ src/libvirt.c | 8 +++-- src/qemu/qemu_driver.c | 36 ++++++++++++++-------- .../qemuxml2argvdata/qemuxml2argv-disk-mirror.xml | 4 +-- tools/virsh-domain.c | 6 ++++ tools/virsh.pod | 7 +++-- 6 files changed, 45 insertions(+), 18 deletions(-)
ACK Peter

On 09/05/2014 03:23 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
To date, anyone performing a block copy and pivot ends up with the destination being treated as <disk type='file'>. While this works for data access for a block device, it has at least one noticeable shortcoming: virDomainGetBlockInfo() reports allocation differently for block devices visited as files (the size of the device) than for block devices visited as <disk type='block'> (the maximum sector used, as reported by qemu); and this difference is significant when trying to manage qcow2 format on block devices that can be grown as needed.
ACK
Pushed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

I'm about to extend the capabilities of blockcopy. Hiding a few common lines of implementation gets in the way of the new required logic, and putting the new logic in the common implementation won't benefit any of the other blockjob operations. Therefore, it is simpler to just do the work inline. There should be no semantic change in this patch. * tools/virsh-domain.c (blockJobImpl): Drop unused variable. Move block copy guts... (cmdBlockCopy): ...into their lone caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 813d8e0..b92b2eb 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1464,7 +1464,6 @@ typedef enum { VSH_CMD_BLOCK_JOB_ABORT, VSH_CMD_BLOCK_JOB_SPEED, VSH_CMD_BLOCK_JOB_PULL, - VSH_CMD_BLOCK_JOB_COPY, VSH_CMD_BLOCK_JOB_COMMIT, } vshCmdBlockJobMode; @@ -1536,21 +1535,6 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, if (virDomainBlockCommit(dom, path, base, top, bandwidth, flags) < 0) goto cleanup; break; - case VSH_CMD_BLOCK_JOB_COPY: - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; - if (vshCommandOptBool(cmd, "shallow")) - flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; - if (vshCommandOptBool(cmd, "reuse-external")) - flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; - if (vshCommandOptBool(cmd, "blockdev")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; - if (vshCommandOptStringReq(ctl, cmd, "dest", &base) < 0) - goto cleanup; - if (virDomainBlockRebase(dom, path, base, bandwidth, flags) < 0) - goto cleanup; - break; } ret = true; @@ -1892,6 +1876,9 @@ static bool cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom = NULL; + const char *dest; + unsigned long bandwidth = 0; + unsigned int flags = VIR_DOMAIN_BLOCK_REBASE_COPY; bool ret = false; bool blocking = vshCommandOptBool(cmd, "wait"); bool verbose = vshCommandOptBool(cmd, "verbose"); @@ -1907,6 +1894,9 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) bool quit = false; int abort_flags = 0; + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + blocking |= vshCommandOptBool(cmd, "timeout") || pivot || finish; if (blocking) { if (pivot && finish) { @@ -1915,8 +1905,6 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) } if (vshCommandOptTimeoutToMs(ctl, cmd, &timeout) < 0) return false; - if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) - return false; if (vshCommandOptBool(cmd, "async")) abort_flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; @@ -1935,7 +1923,26 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; } - if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COPY, &dom)) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { + vshError(ctl, "%s", _("bandwidth must be a number")); + goto cleanup; + } + + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) + goto cleanup; + + if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) goto cleanup; if (!blocking) { -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
I'm about to extend the capabilities of blockcopy. Hiding a few common lines of implementation gets in the way of the new required logic, and putting the new logic in the common implementation won't benefit any of the other blockjob operations. Therefore, it is simpler to just do the work inline. There should be no semantic change in this patch.
* tools/virsh-domain.c (blockJobImpl): Drop unused variable. Move block copy guts... (cmdBlockCopy): ...into their lone caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 813d8e0..b92b2eb 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c
@@ -1907,6 +1894,9 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) bool quit = false; int abort_flags = 0;
+ if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + blocking |= vshCommandOptBool(cmd, "timeout") || pivot || finish; if (blocking) { if (pivot && finish) {
...
@@ -1935,7 +1923,26 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; }
- if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COPY, &dom)) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { + vshError(ctl, "%s", _("bandwidth must be a number")); + goto cleanup; + } + + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) + goto cleanup; +
You could extract the flags and string at the top where you extract @path to be consistent, but that's more bikeshedding than useful.
+ if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) goto cleanup;
if (!blocking) {
ACK as-is. Peter

On 09/05/14 11:29, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
I'm about to extend the capabilities of blockcopy. Hiding a few common lines of implementation gets in the way of the new required logic, and putting the new logic in the common implementation won't benefit any of the other blockjob operations. Therefore, it is simpler to just do the work inline. There should be no semantic change in this patch.
* tools/virsh-domain.c (blockJobImpl): Drop unused variable. Move block copy guts... (cmdBlockCopy): ...into their lone caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 813d8e0..b92b2eb 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c
@@ -1907,6 +1894,9 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) bool quit = false; int abort_flags = 0;
+ if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + blocking |= vshCommandOptBool(cmd, "timeout") || pivot || finish; if (blocking) { if (pivot && finish) {
...
@@ -1935,7 +1923,26 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; }
- if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COPY, &dom)) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { + vshError(ctl, "%s", _("bandwidth must be a number")); + goto cleanup; + } + + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) + goto cleanup; +
You could extract the flags and string at the top where you extract @path to be consistent, but that's more bikeshedding than useful.
Hmmm, next patch is moving it to the destination I've suggested. It'd be better to move it to the correct position right away then ...
+ if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) goto cleanup;
if (!blocking) {
ACK as-is.
Peter
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 09/05/14 11:30, Peter Krempa wrote:
On 09/05/14 11:29, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
I'm about to extend the capabilities of blockcopy. Hiding a few common lines of implementation gets in the way of the new required logic, and putting the new logic in the common implementation won't benefit any of the other blockjob operations. Therefore, it is simpler to just do the work inline. There should be no semantic change in this patch.
* tools/virsh-domain.c (blockJobImpl): Drop unused variable. Move block copy guts... (cmdBlockCopy): ...into their lone caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 813d8e0..b92b2eb 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c
@@ -1907,6 +1894,9 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) bool quit = false; int abort_flags = 0;
+ if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + blocking |= vshCommandOptBool(cmd, "timeout") || pivot || finish; if (blocking) { if (pivot && finish) {
...
@@ -1935,7 +1923,26 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) return false; }
- if (!blockJobImpl(ctl, cmd, VSH_CMD_BLOCK_JOB_COPY, &dom)) + if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { + vshError(ctl, "%s", _("bandwidth must be a number")); + goto cleanup; + } + + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) + goto cleanup; +
You could extract the flags and string at the top where you extract @path to be consistent, but that's more bikeshedding than useful.
Hmmm, next patch is moving it to the destination I've suggested. It'd be better to move it to the correct position right away then ...
Actually the code is not moved but refactored. Never mind the comment above :) sorry for the noise. Peter

Expose the new power of virDomainBlockCopy through virsh (well, all but the finer-grained bandwidth, as that is its own can of worms for a later patch). Continue to use the older API where possible, for maximum compatibility. The command now requires either --dest (with optional --format and --blockdev), to directly describe the file destination, or --xml, to name a file that contains an XML description such as: <disk type='network'> <driver type='raw'/> <source protocol='gluster' name='vol1/img'> <host name='red'/> </source> </disk> Non-zero option parameters are converted into virTypedParameters, and if anything requires the new API, the command can synthesize appropriate XML even if the --dest option was used instead of --xml. The existing --raw flag remains for back-compat, but the preferred spelling is now --format=raw, since the new API now allows us to specify all formats rather than just a boolean raw to suppress probing. I hope I did justice in describing the effects of granularity and buf-size on how they get passed through to qemu. * tools/virsh-domain.c (cmdBlockCopy): Add new options --xml, --granularity, --buf-size, --format. Make --raw an alias for --format=raw. Call new API if new parameters are in use. * tools/virsh.pod (blockcopy): Document new options. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 145 +++++++++++++++++++++++++++++++++++++++++++++------ tools/virsh.pod | 45 +++++++++------- 2 files changed, 156 insertions(+), 34 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index b92b2eb..16d7f05 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1818,11 +1818,10 @@ static const vshCmdOptDef opts_block_copy[] = { {.name = "path", .type = VSH_OT_DATA, .flags = VSH_OFLAG_REQ, - .help = N_("fully-qualified path of disk") + .help = N_("fully-qualified path of source disk") }, {.name = "dest", .type = VSH_OT_DATA, - .flags = VSH_OFLAG_REQ, .help = N_("path of the copy to create") }, {.name = "bandwidth", @@ -1838,8 +1837,8 @@ static const vshCmdOptDef opts_block_copy[] = { .help = N_("reuse existing destination") }, {.name = "raw", - .type = VSH_OT_BOOL, - .help = N_("use raw destination file") + .type = VSH_OT_ALIAS, + .help = "format=raw" }, {.name = "blockdev", .type = VSH_OT_BOOL, @@ -1869,6 +1868,22 @@ static const vshCmdOptDef opts_block_copy[] = { .type = VSH_OT_BOOL, .help = N_("with --wait, don't wait for cancel to finish") }, + {.name = "xml", + .type = VSH_OT_DATA, + .help = N_("filename containing XML description of the copy destination") + }, + {.name = "format", + .type = VSH_OT_DATA, + .help = N_("format of the destination file") + }, + {.name = "granularity", + .type = VSH_OT_INT, + .help = N_("power-of-two granularity to use during the copy") + }, + {.name = "buf-size", + .type = VSH_OT_INT, + .help = N_("maximum amount of in-flight data during the copy") + }, {.name = NULL} }; @@ -1876,14 +1891,18 @@ static bool cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom = NULL; - const char *dest; + const char *dest = NULL; + const char *format = NULL; unsigned long bandwidth = 0; - unsigned int flags = VIR_DOMAIN_BLOCK_REBASE_COPY; + unsigned int granularity = 0; + unsigned long long buf_size = 0; + unsigned int flags = 0; bool ret = false; bool blocking = vshCommandOptBool(cmd, "wait"); bool verbose = vshCommandOptBool(cmd, "verbose"); bool pivot = vshCommandOptBool(cmd, "pivot"); bool finish = vshCommandOptBool(cmd, "finish"); + bool blockdev = vshCommandOptBool(cmd, "blockdev"); int timeout = 0; struct sigaction sig_action; struct sigaction old_sig_action; @@ -1893,6 +1912,23 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) const char *path = NULL; bool quit = false; int abort_flags = 0; + const char *xml = NULL; + char *xmlstr = NULL; + virTypedParameterPtr params = NULL; + int nparams = 0; + + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + if (vshCommandOptString(cmd, "dest", &dest) < 0) + return false; + if (vshCommandOptString(cmd, "xml", &xml) < 0) + return false; + if (vshCommandOptString(cmd, "format", &format) < 0) + return false; + + VSH_EXCLUSIVE_OPTIONS_VAR(dest, xml); + VSH_EXCLUSIVE_OPTIONS_VAR(format, xml); + VSH_EXCLUSIVE_OPTIONS_VAR(blockdev, xml); if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) return false; @@ -1926,24 +1962,101 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) goto cleanup; + /* XXX: Parse bandwidth as scaled input, rather than forcing + * MiB/s, and either reject negative input or treat it as 0 rather + * than trying to guess which value will work well across both + * APIs with their different sizes and scales. */ if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { vshError(ctl, "%s", _("bandwidth must be a number")); goto cleanup; } + if (vshCommandOptUInt(cmd, "granularity", &granularity) < 0) { + vshError(ctl, "%s", _("granularity must be a number")); + goto cleanup; + } + if (vshCommandOptULongLong(cmd, "buf-size", &buf_size) < 0) { + vshError(ctl, "%s", _("buf-size must be a number")); + goto cleanup; + } + if (xml) { + if (virFileReadAll(xml, 8192, &xmlstr) < 0) { + vshReportError(ctl); + goto cleanup; + } + } else if (!dest) { + vshError(ctl, "%s", _("need either --dest or --xml")); + goto cleanup; + } + + /* Exploit that VIR_DOMAIN_BLOCK_REBASE_* and + * VIR_DOMAIN_BLOCK_COPY_* flags have the same values. */ if (vshCommandOptBool(cmd, "shallow")) flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; if (vshCommandOptBool(cmd, "reuse-external")) flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; - if (vshCommandOptBool(cmd, "blockdev")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; - if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) - goto cleanup; - - if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) - goto cleanup; + + if (granularity || buf_size || (format && STRNEQ(format, "raw")) || xml) { + /* New API */ + if (bandwidth || granularity || buf_size) { + params = vshCalloc(ctl, 3, sizeof(*params)); + if (bandwidth) { + /* bandwidth is ulong MiB/s, but the typed parameter is + * ullong bytes/s; make sure we don't overflow */ + if (bandwidth > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + } + if (virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BANDWIDTH, + VIR_TYPED_PARAM_ULLONG, + bandwidth << 20ULL) < 0) + goto cleanup; + } + if (granularity && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_GRANULARITY, + VIR_TYPED_PARAM_UINT, + granularity) < 0) + goto cleanup; + if (buf_size && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BUF_SIZE, + VIR_TYPED_PARAM_ULLONG, + buf_size) < 0) + goto cleanup; + } + + if (!xmlstr) { + virBuffer buf = VIR_BUFFER_INITIALIZER; + virBufferEscapeString(&buf, "<disk type='%s'>\n", + blockdev ? "block" : "file"); + virBufferAdjustIndent(&buf, 2); + virBufferEscapeString(&buf, "<source %s", + blockdev ? "dev" : "file"); + virBufferEscapeString(&buf, "='%s'/>\n", dest); + virBufferEscapeString(&buf, "<driver type='%s'/>\n", format); + virBufferAdjustIndent(&buf, -2); + virBufferAddLit(&buf, "</disk>\n"); + if (virBufferCheckError(&buf) < 0) + goto cleanup; + xmlstr = virBufferContentAndReset(&buf); + } + + if (virDomainBlockCopy(dom, path, xmlstr, params, nparams, flags) < 0) + goto cleanup; + } else { + /* Old API */ + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (STREQ_NULLABLE(format, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + + if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) + goto cleanup; + } if (!blocking) { vshPrint(ctl, "%s", _("Block Copy started")); @@ -2014,6 +2127,8 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) ret = true; cleanup: + VIR_FREE(xmlstr); + virTypedParamsFree(params, nparams); if (dom) virDomainFree(dom); if (blocking) diff --git a/tools/virsh.pod b/tools/virsh.pod index 2923fd9..c3a76dc 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -898,30 +898,31 @@ value is interpreted as an unsigned long long value or essentially unlimited. The hypervisor can choose whether to reject the value or convert it to the maximum value allowed. -=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] [I<--raw>] [I<--blockdev>] -[I<--wait> [I<--async>] [I<--verbose>]] -[{I<--pivot> | I<--finish>}] [I<--timeout> B<seconds>] +=item B<blockcopy> I<domain> I<path> { I<dest> [I<format>] [I<--blockdev>] +| I<xml> } [I<--shallow>] [I<--reuse-external>] [I<bandwidth>] +[I<--wait> [I<--async>] [I<--verbose>]] [{I<--pivot> | I<--finish>}] +[I<--timeout> B<seconds>] [I<granularity>] [I<buf-size>] -Copy a disk backing image chain to I<dest>. By default, this command -flattens the entire chain; but if I<--shallow> is specified, the copy -shares the backing chain. +Copy a disk backing image chain to a destination. Either I<dest> as +the destination file name, or I<xml> as the name of an XML file containing +a top-level <disk> element describing the destination, must be present. +Additionally, if I<dest> is given, I<format> should be specified to declare +the format of the destination (if I<format> is omitted, then libvirt +will reuse the format of the source, or with I<--reuse-external> will +be forced to probe the destination format, which could be a potential +security hole). The command supports I<--raw> as a boolean flag synonym for +I<--format=raw>. When using I<dest>, the destination is treated as a regular +file unless I<--blockdev> is used to signal that it is a block device. By +default, this command flattens the entire chain; but if I<--shallow> is +specified, the copy shares the backing chain. -If I<--reuse-external> is specified, then I<dest> must exist and have +If I<--reuse-external> is specified, then the destination must exist and have sufficient space to hold the copy. If I<--shallow> is used in conjunction with I<--reuse-external> then the pre-created image must have guest visible contents identical to guest visible contents of the backing file of the original image. This may be used to modify the backing file names on the destination. -The format of the destination is determined by the first match in the -following list: if I<--raw> is specified, it will be raw; if -I<--reuse-external> is specified, the existing destination is probed -for a format; and in all other cases, the destination format will -match the source format. The destination is treated as a regular -file unless I<--blockdev> is used to signal that it is a block -device. - By default, the copy job runs in the background, and consists of two phases. Initially, the job must copy all data from the source, and during this phase, the job can only be canceled to revert back to the @@ -943,9 +944,15 @@ while longer until the job has actually cancelled. I<path> specifies fully-qualified path of the disk. I<bandwidth> specifies copying bandwidth limit in MiB/s. Specifying a negative -value is interpreted as an unsigned long long value or essentially -unlimited. The hypervisor can choose whether to reject the value or -convert it to the maximum value allowed. +value is interpreted as an unsigned long long value that might be essentially +unlimited, but more likely would overflow; it is safer to use 0 for that +purpose. Specifying I<granularity> allows fine-tuning of the granularity that +will be copied when a dirty region is detected; larger values trigger less +I/O overhead but may end up copying more data overall (the default value is +usually correct); this value must be a power of two. Specifying I<buf-size> +will control how much data can be simultaneously in-flight during the copy; +larger values use more memory but may allow faster completion (the default +value is usually correct). =item B<blockpull> I<domain> I<path> [I<bandwidth>] [I<base>] [I<--wait> [I<--verbose>] [I<--timeout> B<seconds>] [I<--async>]] -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
Expose the new power of virDomainBlockCopy through virsh (well, all but the finer-grained bandwidth, as that is its own can of worms for a later patch). Continue to use the older API where possible, for maximum compatibility.
The command now requires either --dest (with optional --format and --blockdev), to directly describe the file destination, or --xml, to name a file that contains an XML description such as:
<disk type='network'> <driver type='raw'/> <source protocol='gluster' name='vol1/img'> <host name='red'/> </source> </disk>
Non-zero option parameters are converted into virTypedParameters, and if anything requires the new API, the command can synthesize appropriate XML even if the --dest option was used instead of --xml.
The existing --raw flag remains for back-compat, but the preferred spelling is now --format=raw, since the new API now allows us to specify all formats rather than just a boolean raw to suppress probing.
I hope I did justice in describing the effects of granularity and buf-size on how they get passed through to qemu.
* tools/virsh-domain.c (cmdBlockCopy): Add new options --xml, --granularity, --buf-size, --format. Make --raw an alias for --format=raw. Call new API if new parameters are in use. * tools/virsh.pod (blockcopy): Document new options.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 145 +++++++++++++++++++++++++++++++++++++++++++++------ tools/virsh.pod | 45 +++++++++------- 2 files changed, 156 insertions(+), 34 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index b92b2eb..16d7f05 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1893,6 +1912,23 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) const char *path = NULL; bool quit = false; int abort_flags = 0; + const char *xml = NULL; + char *xmlstr = NULL; + virTypedParameterPtr params = NULL; + int nparams = 0; + + if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false; + if (vshCommandOptString(cmd, "dest", &dest) < 0) + return false; + if (vshCommandOptString(cmd, "xml", &xml) < 0) + return false; + if (vshCommandOptString(cmd, "format", &format) < 0) + return false; + + VSH_EXCLUSIVE_OPTIONS_VAR(dest, xml);
These were designed to work on booleans, but pointers are fortunately sharing the semantics :)
+ VSH_EXCLUSIVE_OPTIONS_VAR(format, xml); + VSH_EXCLUSIVE_OPTIONS_VAR(blockdev, xml);
if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) return false; @@ -1926,24 +1962,101 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) goto cleanup;
+ /* XXX: Parse bandwidth as scaled input, rather than forcing + * MiB/s, and either reject negative input or treat it as 0 rather + * than trying to guess which value will work well across both + * APIs with their different sizes and scales. */ if (vshCommandOptULWrap(cmd, "bandwidth", &bandwidth) < 0) { vshError(ctl, "%s", _("bandwidth must be a number")); goto cleanup; } + if (vshCommandOptUInt(cmd, "granularity", &granularity) < 0) { + vshError(ctl, "%s", _("granularity must be a number")); + goto cleanup; + } + if (vshCommandOptULongLong(cmd, "buf-size", &buf_size) < 0) { + vshError(ctl, "%s", _("buf-size must be a number")); + goto cleanup; + }
+ if (xml) { + if (virFileReadAll(xml, 8192, &xmlstr) < 0) { + vshReportError(ctl); + goto cleanup; + } + } else if (!dest) { + vshError(ctl, "%s", _("need either --dest or --xml")); + goto cleanup; + } + + /* Exploit that VIR_DOMAIN_BLOCK_REBASE_* and + * VIR_DOMAIN_BLOCK_COPY_* flags have the same values. */ if (vshCommandOptBool(cmd, "shallow")) flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; if (vshCommandOptBool(cmd, "reuse-external")) flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; - if (vshCommandOptBool(cmd, "blockdev")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; - if (vshCommandOptStringReq(ctl, cmd, "dest", &dest) < 0) - goto cleanup; - - if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) - goto cleanup; + + if (granularity || buf_size || (format && STRNEQ(format, "raw")) || xml) { + /* New API */ + if (bandwidth || granularity || buf_size) { + params = vshCalloc(ctl, 3, sizeof(*params)); + if (bandwidth) { + /* bandwidth is ulong MiB/s, but the typed parameter is + * ullong bytes/s; make sure we don't overflow */ + if (bandwidth > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + } + if (virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BANDWIDTH, + VIR_TYPED_PARAM_ULLONG, + bandwidth << 20ULL) < 0) + goto cleanup; + } + if (granularity && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_GRANULARITY, + VIR_TYPED_PARAM_UINT, + granularity) < 0) + goto cleanup; + if (buf_size && + virTypedParameterAssign(¶ms[nparams++], + VIR_DOMAIN_BLOCK_COPY_BUF_SIZE, + VIR_TYPED_PARAM_ULLONG, + buf_size) < 0) + goto cleanup; + } + + if (!xmlstr) { + virBuffer buf = VIR_BUFFER_INITIALIZER; + virBufferEscapeString(&buf, "<disk type='%s'>\n", + blockdev ? "block" : "file");
This one doesn't really need escaping
+ virBufferAdjustIndent(&buf, 2); + virBufferEscapeString(&buf, "<source %s", + blockdev ? "dev" : "file");
and this either
+ virBufferEscapeString(&buf, "='%s'/>\n", dest); + virBufferEscapeString(&buf, "<driver type='%s'/>\n", format); + virBufferAdjustIndent(&buf, -2); + virBufferAddLit(&buf, "</disk>\n"); + if (virBufferCheckError(&buf) < 0) + goto cleanup; + xmlstr = virBufferContentAndReset(&buf); + } + + if (virDomainBlockCopy(dom, path, xmlstr, params, nparams, flags) < 0) + goto cleanup; + } else { + /* Old API */ + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + if (vshCommandOptBool(cmd, "blockdev")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_DEV; + if (STREQ_NULLABLE(format, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + + if (virDomainBlockRebase(dom, path, dest, bandwidth, flags) < 0) + goto cleanup; + }
if (!blocking) { vshPrint(ctl, "%s", _("Block Copy started")); @@ -2014,6 +2127,8 @@ cmdBlockCopy(vshControl *ctl, const vshCmd *cmd)
ret = true; cleanup: + VIR_FREE(xmlstr); + virTypedParamsFree(params, nparams); if (dom) virDomainFree(dom); if (blocking) diff --git a/tools/virsh.pod b/tools/virsh.pod index 2923fd9..c3a76dc 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -898,30 +898,31 @@ value is interpreted as an unsigned long long value or essentially unlimited. The hypervisor can choose whether to reject the value or convert it to the maximum value allowed.
-=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] [I<--raw>] [I<--blockdev>] -[I<--wait> [I<--async>] [I<--verbose>]] -[{I<--pivot> | I<--finish>}] [I<--timeout> B<seconds>] +=item B<blockcopy> I<domain> I<path> { I<dest> [I<format>] [I<--blockdev>] +| I<xml> } [I<--shallow>] [I<--reuse-external>] [I<bandwidth>] +[I<--wait> [I<--async>] [I<--verbose>]] [{I<--pivot> | I<--finish>}] +[I<--timeout> B<seconds>] [I<granularity>] [I<buf-size>]
-Copy a disk backing image chain to I<dest>. By default, this command -flattens the entire chain; but if I<--shallow> is specified, the copy -shares the backing chain. +Copy a disk backing image chain to a destination. Either I<dest> as +the destination file name, or I<xml> as the name of an XML file containing +a top-level <disk> element describing the destination, must be present. +Additionally, if I<dest> is given, I<format> should be specified to declare +the format of the destination (if I<format> is omitted, then libvirt +will reuse the format of the source, or with I<--reuse-external> will +be forced to probe the destination format, which could be a potential +security hole). The command supports I<--raw> as a boolean flag synonym for +I<--format=raw>. When using I<dest>, the destination is treated as a regular +file unless I<--blockdev> is used to signal that it is a block device. By +default, this command flattens the entire chain; but if I<--shallow> is +specified, the copy shares the backing chain.
-If I<--reuse-external> is specified, then I<dest> must exist and have +If I<--reuse-external> is specified, then the destination must exist and have sufficient space to hold the copy. If I<--shallow> is used in conjunction with I<--reuse-external> then the pre-created image must have guest visible contents identical to guest visible contents of the backing file of the original image. This may be used to modify the backing file names on the destination.
-The format of the destination is determined by the first match in the -following list: if I<--raw> is specified, it will be raw; if -I<--reuse-external> is specified, the existing destination is probed -for a format; and in all other cases, the destination format will -match the source format. The destination is treated as a regular -file unless I<--blockdev> is used to signal that it is a block -device. - By default, the copy job runs in the background, and consists of two phases. Initially, the job must copy all data from the source, and during this phase, the job can only be canceled to revert back to the @@ -943,9 +944,15 @@ while longer until the job has actually cancelled.
I<path> specifies fully-qualified path of the disk. I<bandwidth> specifies copying bandwidth limit in MiB/s. Specifying a negative -value is interpreted as an unsigned long long value or essentially -unlimited. The hypervisor can choose whether to reject the value or -convert it to the maximum value allowed. +value is interpreted as an unsigned long long value that might be essentially +unlimited, but more likely would overflow; it is safer to use 0 for that +purpose. Specifying I<granularity> allows fine-tuning of the granularity that +will be copied when a dirty region is detected; larger values trigger less +I/O overhead but may end up copying more data overall (the default value is +usually correct); this value must be a power of two. Specifying I<buf-size> +will control how much data can be simultaneously in-flight during the copy; +larger values use more memory but may allow faster completion (the default +value is usually correct).
=item B<blockpull> I<domain> I<path> [I<bandwidth>] [I<base>] [I<--wait> [I<--verbose>] [I<--timeout> B<seconds>] [I<--async>]]
ACK as is, all comments are pure cosmetic. Peter

On 09/05/2014 03:41 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
Expose the new power of virDomainBlockCopy through virsh (well, all but the finer-grained bandwidth, as that is its own can of worms for a later patch). Continue to use the older API where possible, for maximum compatibility.
+ if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) + return false;
Oops, in all my refactoring, I missed that this statement...[1]
+ if (vshCommandOptString(cmd, "dest", &dest) < 0) + return false; + if (vshCommandOptString(cmd, "xml", &xml) < 0) + return false; + if (vshCommandOptString(cmd, "format", &format) < 0) + return false; + + VSH_EXCLUSIVE_OPTIONS_VAR(dest, xml);
These were designed to work on booleans, but pointers are fortunately sharing the semantics
Yep, convenient :)
+ VSH_EXCLUSIVE_OPTIONS_VAR(format, xml); + VSH_EXCLUSIVE_OPTIONS_VAR(blockdev, xml);
if (vshCommandOptStringReq(ctl, cmd, "path", &path) < 0) return false;
[1]...was already present.
+ + if (!xmlstr) { + virBuffer buf = VIR_BUFFER_INITIALIZER; + virBufferEscapeString(&buf, "<disk type='%s'>\n", + blockdev ? "block" : "file");
This one doesn't really need escaping
+ virBufferAdjustIndent(&buf, 2); + virBufferEscapeString(&buf, "<source %s", + blockdev ? "dev" : "file");
and this either
Okay, I switched them to virBufferAsprintf.
ACK as is, all comments are pure cosmetic.
Now pushed with those fixes. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Fairly straightforward - I got lucky that the generated functions worked out of the box :) * src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. (REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC. * src/remote/remote_driver.c (remote_driver): Wire it up. * src/remote_protocol-structs: Regenerate. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 +++++++++++++++++- src/remote_protocol-structs | 11 +++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-) diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index fda27f7..21da894 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -8108,6 +8108,7 @@ static virDriver remote_driver = { .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = remoteDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = remoteDomainBlockCopy, /* 1.2.8 */ .domainBlockCommit = remoteDomainBlockCommit, /* 0.10.2 */ .connectSetKeepAlive = remoteConnectSetKeepAlive, /* 0.9.8 */ .connectIsAlive = remoteConnectIsAlive, /* 0.9.8 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 8fc552f..a4ca0c3 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -124,6 +124,9 @@ const REMOTE_DOMAIN_BLOCK_IO_TUNE_PARAMETERS_MAX = 16; /* Upper limit on list of numa parameters. */ const REMOTE_DOMAIN_NUMA_PARAMETERS_MAX = 16; +/* Upper limit on block copy tunable parameters. */ +const REMOTE_DOMAIN_BLOCK_COPY_PARAMETERS_MAX = 16; + /* Upper limit on list of node cpu stats. */ const REMOTE_NODE_CPU_STATS_MAX = 16; @@ -1284,6 +1287,13 @@ struct remote_domain_block_rebase_args { unsigned hyper bandwidth; unsigned int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_nonnull_string destxml; + remote_typed_param params<REMOTE_DOMAIN_BLOCK_COPY_PARAMETERS_MAX>; + unsigned int flags; +}; struct remote_domain_block_commit_args { remote_nonnull_domain dom; remote_nonnull_string disk; @@ -5456,5 +5466,11 @@ enum remote_procedure { * @acl: connect:search_domains * @aclfilter: domain:read */ - REMOTE_PROC_CONNECT_GET_ALL_DOMAIN_STATS = 344 + REMOTE_PROC_CONNECT_GET_ALL_DOMAIN_STATS = 344, + + /** + * @generate: both + * @acl: domain:block_write + */ + REMOTE_PROC_DOMAIN_BLOCK_COPY = 345 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 899f1cc..e960415 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -916,6 +916,16 @@ struct remote_domain_block_rebase_args { uint64_t bandwidth; u_int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_nonnull_string destxml; + struct { + u_int params_len; + remote_typed_param * params_val; + } params; + u_int flags; +}; struct remote_domain_block_commit_args { remote_nonnull_domain dom; remote_nonnull_string disk; @@ -2890,4 +2900,5 @@ enum remote_procedure { REMOTE_PROC_CONNECT_GET_DOMAIN_CAPABILITIES = 342, REMOTE_PROC_DOMAIN_OPEN_GRAPHICS_FD = 343, REMOTE_PROC_CONNECT_GET_ALL_DOMAIN_STATS = 344, + REMOTE_PROC_DOMAIN_BLOCK_COPY = 345, }; -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
Fairly straightforward - I got lucky that the generated functions worked out of the box :)
* src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. (REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC. * src/remote/remote_driver.c (remote_driver): Wire it up. * src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 +++++++++++++++++- src/remote_protocol-structs | 11 +++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-)
ACK, Peter

On 09/05/2014 03:42 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
Fairly straightforward - I got lucky that the generated functions worked out of the box :)
* src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. (REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC. * src/remote/remote_driver.c (remote_driver): Wire it up. * src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 18 +++++++++++++++++- src/remote_protocol-structs | 11 +++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-)
ACK,
Pushed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

In order to implement the new virDomainBlockCopy, the existing block copy internal implementation needs to be adjusted. The new function will parse XML into a storage source, and parse typed parameters into integers, then call into the same common backend. For now, it's easier to keep the same implementation limits that only local file destinations are suported, but now the check needs to be explicit. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename... (qemuDomainBlockCopyCommon): ...and adjust parameters. (qemuDomainBlockRebase): Adjust caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 108 ++++++++++++++++++++++++++----------------------- 1 file changed, 58 insertions(+), 50 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 946c384..d3f1042 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15351,11 +15351,12 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, /* bandwidth in bytes/s */ static int -qemuDomainBlockCopy(virDomainObjPtr vm, - virConnectPtr conn, - const char *path, - const char *dest, const char *format, - unsigned long long bandwidth, unsigned int flags) +qemuDomainBlockCopyCommon(virDomainObjPtr vm, + virConnectPtr conn, + const char *path, + virStorageSourcePtr dest, + unsigned long long bandwidth, + unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; qemuDomainObjPrivatePtr priv; @@ -15367,11 +15368,12 @@ qemuDomainBlockCopy(virDomainObjPtr vm, bool need_unlink = false; virStorageSourcePtr mirror = NULL; virQEMUDriverConfigPtr cfg = NULL; + const char *format = NULL; + int desttype = virStorageSourceGetActualType(dest); /* Preliminaries: find the disk we are editing, sanity checks */ - virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, -1); priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); @@ -15416,8 +15418,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, if (qemuDomainDetermineDiskChain(driver, vm, disk, false) < 0) goto endjob; - if ((flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) && - STREQ_NULLABLE(format, "raw") && + if ((flags & VIR_DOMAIN_BLOCK_COPY_SHALLOW) && + dest->format == VIR_STORAGE_FILE_RAW && disk->src->backingStore->path) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk '%s' has backing file, so raw shallow copy " @@ -15427,72 +15429,68 @@ qemuDomainBlockCopy(virDomainObjPtr vm, } /* Prepare the destination file. */ - if (stat(dest, &st) < 0) { + /* XXX Allow non-file mirror destinations */ + if (!virStorageSourceIsLocalStorage(dest)) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("non-file destination not supported yet")); + } + if (stat(dest->path, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("unable to stat for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; - } else if (flags & (VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV)) { + } else if (flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT || + desttype == VIR_STORAGE_TYPE_BLOCK) { virReportSystemError(errno, _("missing destination file for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } } else if (!S_ISBLK(st.st_mode)) { - if (st.st_size && !(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { + if (st.st_size && !(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("external destination file for disk %s already " "exists and is not a block device: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV) { + if (desttype == VIR_STORAGE_TYPE_BLOCK) { virReportError(VIR_ERR_INVALID_ARG, _("blockdev flag requested for disk %s, but file " - "'%s' is not a block device"), disk->dst, dest); + "'%s' is not a block device"), + disk->dst, dest->path); goto endjob; } } - if (VIR_ALLOC(mirror) < 0) + if (!(mirror = virStorageSourceCopy(dest, false))) goto endjob; - /* XXX Allow non-file mirror destinations */ - mirror->type = flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV ? - VIR_STORAGE_TYPE_BLOCK : VIR_STORAGE_TYPE_FILE; - if (format) { - if ((mirror->format = virStorageFileFormatTypeFromString(format)) <= 0) { - virReportError(VIR_ERR_INVALID_ARG, _("unrecognized format '%s'"), - format); - goto endjob; - } - } else { - if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { + if (!dest->format) { + if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { mirror->format = disk->src->format; } else { /* If the user passed the REUSE_EXT flag, then either they - * also passed the RAW flag (and format is non-NULL), or it is - * safe for us to probe the format from the file that we will - * be using. */ - mirror->format = virStorageFileProbeFormat(dest, cfg->user, + * can also pass the RAW flag or use XML to tell us the format. + * So if we get here, we assume it is safe for us to probe the + * format from the file that we will be using. */ + mirror->format = virStorageFileProbeFormat(dest->path, cfg->user, cfg->group); } } /* pre-create the image file */ - if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { - int fd = qemuOpenFile(driver, vm, dest, O_WRONLY | O_TRUNC | O_CREAT, + if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { + int fd = qemuOpenFile(driver, vm, dest->path, + O_WRONLY | O_TRUNC | O_CREAT, &need_unlink, NULL); if (fd < 0) goto endjob; VIR_FORCE_CLOSE(fd); } - if (!format && mirror->format > 0) + if (mirror->format > 0) format = virStorageFileFormatTypeToString(mirror->format); - if (VIR_STRDUP(mirror->path, dest) < 0) - goto endjob; if (virStorageSourceInitChainElement(mirror, disk->src, false) < 0) goto endjob; @@ -15506,8 +15504,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, /* Actually start the mirroring */ qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorDriveMirror(priv->mon, device, dest, format, bandwidth, - flags); + ret = qemuMonitorDriveMirror(priv->mon, device, dest->path, format, + bandwidth, flags); virDomainAuditDisk(vm, NULL, mirror, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { @@ -15527,8 +15525,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, vm->def->name); endjob: - if (need_unlink && unlink(dest)) - VIR_WARN("unable to unlink just-created %s", dest); + if (need_unlink && unlink(dest->path)) + VIR_WARN("unable to unlink just-created %s", dest->path); virStorageSourceFree(mirror); if (!qemuDomainObjEndJob(driver, vm)) vm = NULL; @@ -15546,9 +15544,9 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; - const char *format = NULL; int ret = -1; unsigned long long speed = bandwidth; + virStorageSourcePtr dest = NULL; virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15570,8 +15568,14 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, BLOCK_JOB_PULL, flags); /* If we got here, we are doing a block copy rebase. */ + if (VIR_ALLOC(dest) < 0) + goto cleanup; + dest->type = (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV) ? + VIR_STORAGE_TYPE_BLOCK : VIR_STORAGE_TYPE_FILE; + if (VIR_STRDUP(dest->path, base) < 0) + goto cleanup; if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) - format = "raw"; + dest->format = VIR_STORAGE_FILE_RAW; /* Convert bandwidth MiB to bytes */ if (speed > LLONG_MAX >> 20) { @@ -15591,15 +15595,19 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, goto cleanup; } + /* We rely on the fact that VIR_DOMAIN_BLOCK_REBASE_SHALLOW + * and VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT map to the same values + * as for block copy. */ flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV); - ret = qemuDomainBlockCopy(vm, dom->conn, path, base, format, - bandwidth, flags); + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); + ret = qemuDomainBlockCopyCommon(vm, dom->conn, path, dest, + bandwidth, flags); vm = NULL; + cleanup: if (vm) virObjectUnlock(vm); + virStorageSourceFree(dest); return ret; } -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
In order to implement the new virDomainBlockCopy, the existing block copy internal implementation needs to be adjusted. The new function will parse XML into a storage source, and parse typed parameters into integers, then call into the same common backend. For now, it's easier to keep the same implementation limits that only local file destinations are suported, but now the check needs to be explicit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename... (qemuDomainBlockCopyCommon): ...and adjust parameters. (qemuDomainBlockRebase): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 108 ++++++++++++++++++++++++++----------------------- 1 file changed, 58 insertions(+), 50 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 946c384..d3f1042 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15351,11 +15351,12 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path,
/* bandwidth in bytes/s */ static int -qemuDomainBlockCopy(virDomainObjPtr vm, - virConnectPtr conn, - const char *path, - const char *dest, const char *format, - unsigned long long bandwidth, unsigned int flags)
You should mention that this function consumes @dest.
+qemuDomainBlockCopyCommon(virDomainObjPtr vm, + virConnectPtr conn, + const char *path, + virStorageSourcePtr dest, + unsigned long long bandwidth, + unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; qemuDomainObjPrivatePtr priv; @@ -15367,11 +15368,12 @@ qemuDomainBlockCopy(virDomainObjPtr vm, bool need_unlink = false; virStorageSourcePtr mirror = NULL; virQEMUDriverConfigPtr cfg = NULL; + const char *format = NULL; + int desttype = virStorageSourceGetActualType(dest);
/* Preliminaries: find the disk we are editing, sanity checks */ - virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, -1);
priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); @@ -15416,8 +15418,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, if (qemuDomainDetermineDiskChain(driver, vm, disk, false) < 0) goto endjob;
- if ((flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) && - STREQ_NULLABLE(format, "raw") && + if ((flags & VIR_DOMAIN_BLOCK_COPY_SHALLOW) && + dest->format == VIR_STORAGE_FILE_RAW && disk->src->backingStore->path) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk '%s' has backing file, so raw shallow copy " @@ -15427,72 +15429,68 @@ qemuDomainBlockCopy(virDomainObjPtr vm, }
/* Prepare the destination file. */ - if (stat(dest, &st) < 0) { + /* XXX Allow non-file mirror destinations */ + if (!virStorageSourceIsLocalStorage(dest)) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("non-file destination not supported yet")); + } + if (stat(dest->path, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("unable to stat for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; - } else if (flags & (VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV)) { + } else if (flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT || + desttype == VIR_STORAGE_TYPE_BLOCK) { virReportSystemError(errno, _("missing destination file for disk %s: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } } else if (!S_ISBLK(st.st_mode)) { - if (st.st_size && !(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { + if (st.st_size && !(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("external destination file for disk %s already " "exists and is not a block device: %s"), - disk->dst, dest); + disk->dst, dest->path); goto endjob; } - if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV) { + if (desttype == VIR_STORAGE_TYPE_BLOCK) { virReportError(VIR_ERR_INVALID_ARG, _("blockdev flag requested for disk %s, but file " - "'%s' is not a block device"), disk->dst, dest); + "'%s' is not a block device"), + disk->dst, dest->path); goto endjob; } }
- if (VIR_ALLOC(mirror) < 0) + if (!(mirror = virStorageSourceCopy(dest, false)))
Also you won't need to copy it then.
goto endjob; - /* XXX Allow non-file mirror destinations */ - mirror->type = flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV ? - VIR_STORAGE_TYPE_BLOCK : VIR_STORAGE_TYPE_FILE;
- if (format) { - if ((mirror->format = virStorageFileFormatTypeFromString(format)) <= 0) { - virReportError(VIR_ERR_INVALID_ARG, _("unrecognized format '%s'"), - format); - goto endjob; - } - } else { - if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { + if (!dest->format) { + if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { mirror->format = disk->src->format; } else { /* If the user passed the REUSE_EXT flag, then either they - * also passed the RAW flag (and format is non-NULL), or it is - * safe for us to probe the format from the file that we will - * be using. */ - mirror->format = virStorageFileProbeFormat(dest, cfg->user, + * can also pass the RAW flag or use XML to tell us the format. + * So if we get here, we assume it is safe for us to probe the + * format from the file that we will be using. */ + mirror->format = virStorageFileProbeFormat(dest->path, cfg->user, cfg->group); } }
/* pre-create the image file */ - if (!(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) { - int fd = qemuOpenFile(driver, vm, dest, O_WRONLY | O_TRUNC | O_CREAT, + if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { + int fd = qemuOpenFile(driver, vm, dest->path, + O_WRONLY | O_TRUNC | O_CREAT, &need_unlink, NULL); if (fd < 0) goto endjob; VIR_FORCE_CLOSE(fd); }
- if (!format && mirror->format > 0) + if (mirror->format > 0) format = virStorageFileFormatTypeToString(mirror->format); - if (VIR_STRDUP(mirror->path, dest) < 0) - goto endjob;
if (virStorageSourceInitChainElement(mirror, disk->src, false) < 0) goto endjob; @@ -15506,8 +15504,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm,
/* Actually start the mirroring */ qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorDriveMirror(priv->mon, device, dest, format, bandwidth, - flags); + ret = qemuMonitorDriveMirror(priv->mon, device, dest->path, format, + bandwidth, flags); virDomainAuditDisk(vm, NULL, mirror, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { @@ -15527,8 +15525,8 @@ qemuDomainBlockCopy(virDomainObjPtr vm, vm->def->name);
endjob: - if (need_unlink && unlink(dest)) - VIR_WARN("unable to unlink just-created %s", dest); + if (need_unlink && unlink(dest->path)) + VIR_WARN("unable to unlink just-created %s", dest->path); virStorageSourceFree(mirror); if (!qemuDomainObjEndJob(driver, vm)) vm = NULL; @@ -15546,9 +15544,9 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; - const char *format = NULL; int ret = -1; unsigned long long speed = bandwidth; + virStorageSourcePtr dest = NULL;
virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | @@ -15570,8 +15568,14 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, BLOCK_JOB_PULL, flags);
/* If we got here, we are doing a block copy rebase. */ + if (VIR_ALLOC(dest) < 0) + goto cleanup; + dest->type = (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_DEV) ? + VIR_STORAGE_TYPE_BLOCK : VIR_STORAGE_TYPE_FILE; + if (VIR_STRDUP(dest->path, base) < 0) + goto cleanup; if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) - format = "raw"; + dest->format = VIR_STORAGE_FILE_RAW;
/* Convert bandwidth MiB to bytes */ if (speed > LLONG_MAX >> 20) { @@ -15591,15 +15595,19 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, goto cleanup; }
+ /* We rely on the fact that VIR_DOMAIN_BLOCK_REBASE_SHALLOW + * and VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT map to the same values + * as for block copy. */ flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | - VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV); - ret = qemuDomainBlockCopy(vm, dom->conn, path, base, format, - bandwidth, flags); + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); + ret = qemuDomainBlockCopyCommon(vm, dom->conn, path, dest, + bandwidth, flags); vm = NULL; + cleanup: if (vm) virObjectUnlock(vm); + virStorageSourceFree(dest);
If you don't free it afterwards.
return ret; }
ACK with the docs added. The tweaks of the @dest handling are not necessary. Peter

On 09/05/2014 08:09 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
In order to implement the new virDomainBlockCopy, the existing block copy internal implementation needs to be adjusted. The new function will parse XML into a storage source, and parse typed parameters into integers, then call into the same common backend. For now, it's easier to keep the same implementation limits that only local file destinations are suported, but now the check needs to be explicit.
/* bandwidth in bytes/s */ static int -qemuDomainBlockCopy(virDomainObjPtr vm, - virConnectPtr conn, - const char *path, - const char *dest, const char *format, - unsigned long long bandwidth, unsigned int flags)
You should mention that this function consumes @dest.
+qemuDomainBlockCopyCommon(virDomainObjPtr vm,
- if (VIR_ALLOC(mirror) < 0) + if (!(mirror = virStorageSourceCopy(dest, false)))
Also you won't need to copy it then.
ACK with the docs added. The tweaks of the @dest handling are not necessary.
How about the following interdiff? diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index 7026a18..26e0237 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -15349,12 +15349,13 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, /* bandwidth in bytes/s. Caller must lock vm beforehand, and not - * access it afterwards. */ + * access it afterwards; likewise, caller must not access mirror + * afterwards. */ static int qemuDomainBlockCopyCommon(virDomainObjPtr vm, virConnectPtr conn, const char *path, - virStorageSourcePtr dest, + virStorageSourcePtr mirror, unsigned long long bandwidth, unsigned int flags) { @@ -15366,10 +15367,9 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, int idx; struct stat st; bool need_unlink = false; - virStorageSourcePtr mirror = NULL; virQEMUDriverConfigPtr cfg = NULL; const char *format = NULL; - int desttype = virStorageSourceGetActualType(dest); + int desttype = virStorageSourceGetActualType(mirror); /* Preliminaries: find the disk we are editing, sanity checks */ virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | @@ -15419,7 +15419,7 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, goto endjob; if ((flags & VIR_DOMAIN_BLOCK_COPY_SHALLOW) && - dest->format == VIR_STORAGE_FILE_RAW && + mirror->format == VIR_STORAGE_FILE_RAW && disk->src->backingStore->path) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk '%s' has backing file, so raw shallow copy " @@ -15430,20 +15430,20 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, /* Prepare the destination file. */ /* XXX Allow non-file mirror destinations */ - if (!virStorageSourceIsLocalStorage(dest)) { + if (!virStorageSourceIsLocalStorage(mirror)) { virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", _("non-file destination not supported yet")); } - if (stat(dest->path, &st) < 0) { + if (stat(mirror->path, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("unable to stat for disk %s: %s"), - disk->dst, dest->path); + disk->dst, mirror->path); goto endjob; } else if (flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT || desttype == VIR_STORAGE_TYPE_BLOCK) { virReportSystemError(errno, _("missing destination file for disk %s: %s"), - disk->dst, dest->path); + disk->dst, mirror->path); goto endjob; } } else if (!S_ISBLK(st.st_mode)) { @@ -15451,22 +15451,19 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("external destination file for disk %s already " "exists and is not a block device: %s"), - disk->dst, dest->path); + disk->dst, mirror->path); goto endjob; } if (desttype == VIR_STORAGE_TYPE_BLOCK) { virReportError(VIR_ERR_INVALID_ARG, _("blockdev flag requested for disk %s, but file " "'%s' is not a block device"), - disk->dst, dest->path); + disk->dst, mirror->path); goto endjob; } } - if (!(mirror = virStorageSourceCopy(dest, false))) - goto endjob; - - if (!dest->format) { + if (!mirror->format) { if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { mirror->format = disk->src->format; } else { @@ -15474,14 +15471,14 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, * can also pass the RAW flag or use XML to tell us the format. * So if we get here, we assume it is safe for us to probe the * format from the file that we will be using. */ - mirror->format = virStorageFileProbeFormat(dest->path, cfg->user, + mirror->format = virStorageFileProbeFormat(mirror->path, cfg->user, cfg->group); } } /* pre-create the image file */ if (!(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)) { - int fd = qemuOpenFile(driver, vm, dest->path, + int fd = qemuOpenFile(driver, vm, mirror->path, O_WRONLY | O_TRUNC | O_CREAT, &need_unlink, NULL); if (fd < 0) @@ -15504,7 +15501,7 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, /* Actually start the mirroring */ qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorDriveMirror(priv->mon, device, dest->path, format, + ret = qemuMonitorDriveMirror(priv->mon, device, mirror->path, format, bandwidth, flags); virDomainAuditDisk(vm, NULL, mirror, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); @@ -15525,8 +15522,8 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, vm->def->name); endjob: - if (need_unlink && unlink(dest->path)) - VIR_WARN("unable to unlink just-created %s", dest->path); + if (need_unlink && unlink(mirror->path)) + VIR_WARN("unable to unlink just-created %s", mirror->path); virStorageSourceFree(mirror); if (!qemuDomainObjEndJob(driver, vm)) vm = NULL; @@ -15603,6 +15600,7 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, ret = qemuDomainBlockCopyCommon(vm, dom->conn, path, dest, bandwidth, flags); vm = NULL; + dest = NULL; cleanup: if (vm) -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The new blockcopy API wants to reuse only a subset of the disk hotplug parser - namely, we only care about the embedded virStorageSourcePtr inside a <disk> XML. Strange as it may seem, it was easier to just parse an entire disk definition, then throw away everything but the embedded source, than it was to disentangle the source parsing code from the rest of the overall disk parsing function. All that I needed was a couple of tweaks and a new internal flag that determines whether the normally-mandatory target element can be gracefully skipped, since everything else was already optional. * src/conf/domain_conf.h (virDomainDiskSourceParse): New prototype. * src/conf/domain_conf.c (VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE): New flag. (virDomainDiskDefParseXML): Honor flag to make target optional. (virDomainDiskSourceParse): New function. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/domain_conf.c | 107 ++++++++++++++++++++++++++++++++++------------- src/conf/domain_conf.h | 4 ++ src/libvirt_private.syms | 1 + 3 files changed, 82 insertions(+), 30 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 53ef694..8723008 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -100,6 +100,8 @@ typedef enum { VIR_DOMAIN_XML_INTERNAL_ALLOW_ROM = 1 << 19, VIR_DOMAIN_XML_INTERNAL_ALLOW_BOOT = 1 << 20, VIR_DOMAIN_XML_INTERNAL_CLOCK_ADJUST = 1 << 21, + /* parse only source half of <disk> */ + VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE = 1 << 22, } virDomainXMLInternalFlags; #define DUMPXML_FLAGS \ @@ -113,7 +115,8 @@ verify(((VIR_DOMAIN_XML_INTERNAL_STATUS | VIR_DOMAIN_XML_INTERNAL_PCI_ORIG_STATES | VIR_DOMAIN_XML_INTERNAL_ALLOW_ROM | VIR_DOMAIN_XML_INTERNAL_ALLOW_BOOT | - VIR_DOMAIN_XML_INTERNAL_CLOCK_ADJUST) + VIR_DOMAIN_XML_INTERNAL_CLOCK_ADJUST | + VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE) & DUMPXML_FLAGS) == 0); VIR_ENUM_IMPL(virDomainTaint, VIR_DOMAIN_TAINT_LAST, @@ -5840,9 +5843,8 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, * to indicate no media present. LUN is for raw access CD-ROMs * that are not attached to a physical device presently */ if (source == NULL && def->src->hosts == NULL && !def->src->srcpool && - def->device != VIR_DOMAIN_DISK_DEVICE_CDROM && - def->device != VIR_DOMAIN_DISK_DEVICE_LUN && - def->device != VIR_DOMAIN_DISK_DEVICE_FLOPPY) { + (def->device == VIR_DOMAIN_DISK_DEVICE_DISK || + (flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE))) { virReportError(VIR_ERR_NO_SOURCE, target ? "%s" : NULL, target); goto error; @@ -5862,7 +5864,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, ctxt->node = saved_node; } - if (target == NULL) { + if (!target && !(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { if (def->src->srcpool) { char *tmp; if (virAsprintf(&tmp, "pool = '%s', volume = '%s'", @@ -5877,27 +5879,29 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, goto error; } - if (def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY && - !STRPREFIX(target, "fd")) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Invalid floppy device name: %s"), target); - goto error; - } + if (!(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { + if (def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY && + !STRPREFIX(target, "fd")) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid floppy device name: %s"), target); + goto error; + } - /* Force CDROM to be listed as read only */ - if (def->device == VIR_DOMAIN_DISK_DEVICE_CDROM) - def->src->readonly = true; + /* Force CDROM to be listed as read only */ + if (def->device == VIR_DOMAIN_DISK_DEVICE_CDROM) + def->src->readonly = true; - if ((def->device == VIR_DOMAIN_DISK_DEVICE_DISK || - def->device == VIR_DOMAIN_DISK_DEVICE_LUN) && - !STRPREFIX((const char *)target, "hd") && - !STRPREFIX((const char *)target, "sd") && - !STRPREFIX((const char *)target, "vd") && - !STRPREFIX((const char *)target, "xvd") && - !STRPREFIX((const char *)target, "ubd")) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Invalid harddisk device name: %s"), target); - goto error; + if ((def->device == VIR_DOMAIN_DISK_DEVICE_DISK || + def->device == VIR_DOMAIN_DISK_DEVICE_LUN) && + !STRPREFIX((const char *)target, "hd") && + !STRPREFIX((const char *)target, "sd") && + !STRPREFIX((const char *)target, "vd") && + !STRPREFIX((const char *)target, "xvd") && + !STRPREFIX((const char *)target, "ubd")) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid harddisk device name: %s"), target); + goto error; + } } if (snapshot) { @@ -5951,7 +5955,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } else { if (def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY) { def->bus = VIR_DOMAIN_DISK_BUS_FDC; - } else { + } else if (!(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { if (STRPREFIX(target, "hd")) def->bus = VIR_DOMAIN_DISK_BUS_IDE; else if (STRPREFIX(target, "sd")) @@ -6186,12 +6190,14 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } } - if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE - && virDomainDiskDefAssignAddress(xmlopt, def) < 0) - goto error; + if (!(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { + if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE + && virDomainDiskDefAssignAddress(xmlopt, def) < 0) + goto error; - if (virDomainDiskBackingStoreParse(ctxt, def->src) < 0) - goto error; + if (virDomainDiskBackingStoreParse(ctxt, def->src) < 0) + goto error; + } cleanup: VIR_FREE(bus); @@ -10405,6 +10411,47 @@ virDomainDeviceDefParse(const char *xmlStr, } +virStorageSourcePtr +virDomainDiskDefSourceParse(const char *xmlStr, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt, + unsigned int flags) +{ + xmlDocPtr xml; + xmlNodePtr node; + xmlXPathContextPtr ctxt = NULL; + virDomainDiskDefPtr disk = NULL; + virStorageSourcePtr ret = NULL; + + if (!(xml = virXMLParseStringCtxt(xmlStr, _("(disk_definition)"), &ctxt))) + goto cleanup; + node = ctxt->node; + + if (!xmlStrEqual(node->name, BAD_CAST "disk")) { + virReportError(VIR_ERR_XML_ERROR, + _("expecting root element of 'disk', not '%s'"), + node->name); + goto cleanup; + } + + flags |= VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE; + if (!(disk = virDomainDiskDefParseXML(xmlopt, node, ctxt, + NULL, def->seclabels, + def->nseclabels, + flags))) + goto cleanup; + + ret = disk->src; + disk->src = NULL; + + cleanup: + virDomainDiskDefFree(disk); + xmlFreeDoc(xml); + xmlXPathFreeContext(ctxt); + return ret; +} + + static const char * virDomainChrTargetTypeToString(int deviceType, int targetType) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 9586c3b..1107fa8 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2297,6 +2297,10 @@ virDomainDeviceDefPtr virDomainDeviceDefParse(const char *xmlStr, virCapsPtr caps, virDomainXMLOptionPtr xmlopt, unsigned int flags); +virStorageSourcePtr virDomainDiskDefSourceParse(const char *xmlStr, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt, + unsigned int flags); virDomainDefPtr virDomainDefParseString(const char *xmlStr, virCapsPtr caps, virDomainXMLOptionPtr xmlopt, diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 71fc063..e0d788d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -222,6 +222,7 @@ virDomainDiskDefAssignAddress; virDomainDiskDefForeachPath; virDomainDiskDefFree; virDomainDiskDefNew; +virDomainDiskDefSourceParse; virDomainDiskDeviceTypeToString; virDomainDiskDiscardTypeToString; virDomainDiskErrorPolicyTypeFromString; -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
The new blockcopy API wants to reuse only a subset of the disk hotplug parser - namely, we only care about the embedded virStorageSourcePtr inside a <disk> XML. Strange as it may seem, it was easier to just parse an entire disk definition, then throw away everything but the embedded source, than it was to disentangle the source parsing code from the rest of the overall disk parsing function. All that I needed was a couple of tweaks and a new internal flag that determines whether the normally-mandatory target element can be gracefully skipped, since everything else was already optional.
* src/conf/domain_conf.h (virDomainDiskSourceParse): New prototype. * src/conf/domain_conf.c (VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE): New flag. (virDomainDiskDefParseXML): Honor flag to make target optional. (virDomainDiskSourceParse): New function.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/domain_conf.c | 107 ++++++++++++++++++++++++++++++++++------------- src/conf/domain_conf.h | 4 ++ src/libvirt_private.syms | 1 + 3 files changed, 82 insertions(+), 30 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 53ef694..8723008 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c \ @@ -5951,7 +5955,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } else { if (def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY) { def->bus = VIR_DOMAIN_DISK_BUS_FDC; - } else { + } else if (!(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { if (STRPREFIX(target, "hd")) def->bus = VIR_DOMAIN_DISK_BUS_IDE; else if (STRPREFIX(target, "sd")) @@ -6186,12 +6190,14 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } }
- if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE - && virDomainDiskDefAssignAddress(xmlopt, def) < 0) - goto error; + if (!(flags & VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE)) { + if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE + && virDomainDiskDefAssignAddress(xmlopt, def) < 0) + goto error;
- if (virDomainDiskBackingStoreParse(ctxt, def->src) < 0) - goto error; + if (virDomainDiskBackingStoreParse(ctxt, def->src) < 0) + goto error;
This actually might be useful for source of block copy and others some time in the future.
+ }
cleanup: VIR_FREE(bus);
ACK, thanks for dropping the refactor part. Peter

On 09/05/2014 08:14 AM, Peter Krempa wrote:
On 08/31/14 06:02, Eric Blake wrote:
The new blockcopy API wants to reuse only a subset of the disk hotplug parser - namely, we only care about the embedded virStorageSourcePtr inside a <disk> XML. Strange as it may seem, it was easier to just parse an entire disk definition, then throw away everything but the embedded source, than it was to disentangle the source parsing code from the rest of the overall disk parsing function. All that I needed was a couple of tweaks and a new internal flag that determines whether the normally-mandatory target element can be gracefully skipped, since everything else was already optional.
ACK, thanks for dropping the refactor part.
I've pushed this one. I've got enough other tweaks in the remaining patches in v3 that I'll post a v4 for them. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The hard part of managing the disk copy is already coded; all this had to do was convert the XML and virTypedParameters into the internal representation. With this patch, all blockcopy operations that used the old API should also work via the new API. Additional extensions, such as supporting the granularity tunable or a network rather than file destination, will be added as later patches. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d3f1042..d93e184 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15611,6 +15611,90 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, return ret; } + +static int +qemuDomainBlockCopy(virDomainPtr dom, const char *disk, const char *destxml, + virTypedParameterPtr params, int nparams, + unsigned int flags) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + virDomainObjPtr vm; + int ret = -1; + unsigned long long bandwidth = 0; + unsigned int granularity = 0; + unsigned long long buf_size = 0; + virStorageSourcePtr dest = NULL; + size_t i; + + virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, -1); + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_BLOCK_COPY_BANDWIDTH, + VIR_TYPED_PARAM_ULLONG, + VIR_DOMAIN_BLOCK_COPY_GRANULARITY, + VIR_TYPED_PARAM_UINT, + VIR_DOMAIN_BLOCK_COPY_BUF_SIZE, + VIR_TYPED_PARAM_ULLONG, + NULL) < 0) + return -1; + + if (!(vm = qemuDomObjFromDomain(dom))) + return -1; + + if (virDomainBlockCopyEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + + /* Typed params (wisely) refused to expose unsigned long, but + * back-compat demands that we stick with a maximum of + * unsigned long bandwidth in MiB/s, while our value is + * unsigned long long in bytes/s. Hence, we have to do + * overflow detection if this is a 32-bit server handling a + * 64-bit client. */ + if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_BANDWIDTH)) { + if (sizeof(unsigned long) < sizeof(bandwidth) && + param->value.ul > ULONG_MAX * (1ULL << 20)) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu bytes"), + ULONG_MAX * (1ULL << 20)); + goto cleanup; + } + bandwidth = param->value.ul; + } else if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_GRANULARITY)) { + granularity = param->value.ui; + } else if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_BUF_SIZE)) { + buf_size = param->value.ul; + } + } + if (granularity) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("granularity tuning not supported yet")); + goto cleanup; + } + if (buf_size) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("buffer size tuning not supported yet")); + goto cleanup; + } + + if (!(dest = virDomainDiskDefSourceParse(destxml, vm->def, driver->xmlopt, + VIR_DOMAIN_XML_INACTIVE))) + goto cleanup; + + ret = qemuDomainBlockCopyCommon(vm, dom->conn, disk, dest, + bandwidth, flags); + vm = NULL; + + cleanup: + virStorageSourceFree(dest); + if (vm) + virObjectUnlock(vm); + return ret; +} + + static int qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) @@ -17683,6 +17767,7 @@ static virDriver qemuDriver = { .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = qemuDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = qemuDomainBlockCopy, /* 1.2.8 */ .domainBlockCommit = qemuDomainBlockCommit, /* 1.0.0 */ .connectIsAlive = qemuConnectIsAlive, /* 0.9.8 */ .nodeSuspendForDuration = qemuNodeSuspendForDuration, /* 0.9.8 */ -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
The hard part of managing the disk copy is already coded; all this had to do was convert the XML and virTypedParameters into the internal representation.
With this patch, all blockcopy operations that used the old API should also work via the new API. Additional extensions, such as supporting the granularity tunable or a network rather than file destination, will be added as later patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d3f1042..d93e184 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15611,6 +15611,90 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, return ret; }
+ +static int +qemuDomainBlockCopy(virDomainPtr dom, const char *disk, const char *destxml, + virTypedParameterPtr params, int nparams, + unsigned int flags) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + virDomainObjPtr vm; + int ret = -1; + unsigned long long bandwidth = 0; + unsigned int granularity = 0; + unsigned long long buf_size = 0; + virStorageSourcePtr dest = NULL; + size_t i; + + virCheckFlags(VIR_DOMAIN_BLOCK_COPY_SHALLOW | + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, -1); + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_BLOCK_COPY_BANDWIDTH, + VIR_TYPED_PARAM_ULLONG, + VIR_DOMAIN_BLOCK_COPY_GRANULARITY, + VIR_TYPED_PARAM_UINT, + VIR_DOMAIN_BLOCK_COPY_BUF_SIZE, + VIR_TYPED_PARAM_ULLONG, + NULL) < 0) + return -1; + + if (!(vm = qemuDomObjFromDomain(dom))) + return -1; + + if (virDomainBlockCopyEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + + /* Typed params (wisely) refused to expose unsigned long, but + * back-compat demands that we stick with a maximum of + * unsigned long bandwidth in MiB/s, while our value is + * unsigned long long in bytes/s. Hence, we have to do + * overflow detection if this is a 32-bit server handling a + * 64-bit client. */ + if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_BANDWIDTH)) { + if (sizeof(unsigned long) < sizeof(bandwidth) && + param->value.ul > ULONG_MAX * (1ULL << 20)) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu bytes"), + ULONG_MAX * (1ULL << 20)); + goto cleanup;
Sigh. I'd rather not see such checks, but unfortunately we don't want to break output of the blockJobInfo ...
+ } + bandwidth = param->value.ul; + } else if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_GRANULARITY)) { + granularity = param->value.ui; + } else if (STREQ(param->field, VIR_DOMAIN_BLOCK_COPY_BUF_SIZE)) { + buf_size = param->value.ul; + } + } + if (granularity) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("granularity tuning not supported yet")); + goto cleanup; + } + if (buf_size) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("buffer size tuning not supported yet")); + goto cleanup; + } + + if (!(dest = virDomainDiskDefSourceParse(destxml, vm->def, driver->xmlopt, + VIR_DOMAIN_XML_INACTIVE))) + goto cleanup; + + ret = qemuDomainBlockCopyCommon(vm, dom->conn, disk, dest, + bandwidth, flags); + vm = NULL; + + cleanup: + virStorageSourceFree(dest); + if (vm) + virObjectUnlock(vm); + return ret; +} + + static int qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) @@ -17683,6 +17767,7 @@ static virDriver qemuDriver = { .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = qemuDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = qemuDomainBlockCopy, /* 1.2.8 */ .domainBlockCommit = qemuDomainBlockCommit, /* 1.0.0 */ .connectIsAlive = qemuConnectIsAlive, /* 0.9.8 */ .nodeSuspendForDuration = qemuNodeSuspendForDuration, /* 0.9.8 */
ACK, Peter

Upstream qemu 1.4 added some drive-mirror tunables not present when it was first introduced in 1.3. Management apps may want to set these in some cases (for example, without tuning granularity down to sector size, a copy may end up occupying more bytes than the original because an entire cluster is copied even when only a sector within the cluster is dirty, although tuning it down results in more CPU time to do the copy). I haven't personally needed to use the parameters, but since they exist, and since the new API supports virTypedParams, we might as well expose them. Since the tuning parameters aren't often used, and omitted from the QMP command when unspecified, I think it is safe to rely on qemu 1.3 to issue an error about them being unsupported, rather than trying to create a new capability bit in libvirt. Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where a bad granularity (such as non-power-of-2) gives a poor message: error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0' because of abuse of QERR_INVALID_PARAMETER (which is supposed to name the parameter that was given a bad value, rather than the value passed to some other parameter). I don't see that a capability check will help, so we'll just live with it (and it has since been improved in upstream qemu). * src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add parameters. * src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror): Likewise. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror): Likewise. * src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise. (qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise. * tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 18 +++++------------- src/qemu/qemu_migration.c | 2 +- src/qemu/qemu_monitor.c | 8 +++++--- src/qemu/qemu_monitor.h | 2 ++ src/qemu/qemu_monitor_json.c | 6 +++++- src/qemu/qemu_monitor_json.h | 2 ++ tests/qemumonitorjsontest.c | 2 +- 7 files changed, 21 insertions(+), 19 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d93e184..e597d68 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15356,6 +15356,8 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, const char *path, virStorageSourcePtr dest, unsigned long long bandwidth, + unsigned int granularity, + unsigned long long buf_size, unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; @@ -15505,7 +15507,7 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, /* Actually start the mirroring */ qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorDriveMirror(priv->mon, device, dest->path, format, - bandwidth, flags); + bandwidth, granularity, buf_size, flags); virDomainAuditDisk(vm, NULL, mirror, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { @@ -15601,7 +15603,7 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, flags &= (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT); ret = qemuDomainBlockCopyCommon(vm, dom->conn, path, dest, - bandwidth, flags); + bandwidth, 0, 0, flags); vm = NULL; cleanup: @@ -15668,23 +15670,13 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *disk, const char *destxml, buf_size = param->value.ul; } } - if (granularity) { - virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", - _("granularity tuning not supported yet")); - goto cleanup; - } - if (buf_size) { - virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", - _("buffer size tuning not supported yet")); - goto cleanup; - } if (!(dest = virDomainDiskDefSourceParse(destxml, vm->def, driver->xmlopt, VIR_DOMAIN_XML_INACTIVE))) goto cleanup; ret = qemuDomainBlockCopyCommon(vm, dom->conn, disk, dest, - bandwidth, flags); + bandwidth, granularity, buf_size, flags); vm = NULL; cleanup: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6bdee4e..3a506f8 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1279,7 +1279,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto error; mon_ret = qemuMonitorDriveMirror(priv->mon, diskAlias, nbd_dest, - NULL, speed, mirror_flags); + NULL, speed, 0, 0, mirror_flags); qemuDomainObjExitMonitor(driver, vm); if (mon_ret < 0) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index d96b1b8..e481ec1 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3183,17 +3183,19 @@ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *device, const char *file, const char *format, unsigned long long bandwidth, + unsigned int granularity, unsigned long long buf_size, unsigned int flags) { int ret = -1; VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s, bandwidth=%lld, " - "flags=%x", - mon, device, file, NULLSTR(format), bandwidth, flags); + "granularity=%#x, buf_size=%lld, flags=%x", + mon, device, file, NULLSTR(format), bandwidth, granularity, + buf_size, flags); if (mon->json) ret = qemuMonitorJSONDriveMirror(mon, device, file, format, bandwidth, - flags); + granularity, buf_size, flags); else virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("drive-mirror requires JSON monitor")); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 9012ad4..cb88e6e 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -650,6 +650,8 @@ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *file, const char *format, unsigned long long bandwidth, + unsigned int granularity, + unsigned long long buf_size, unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorDrivePivot(qemuMonitorPtr mon, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 8130f92..2ccf9cb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3397,6 +3397,8 @@ int qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, const char *device, const char *file, const char *format, unsigned long long speed, + unsigned int granularity, + unsigned long long buf_size, unsigned int flags) { int ret = -1; @@ -3408,7 +3410,9 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, cmd = qemuMonitorJSONMakeCommand("drive-mirror", "s:device", device, "s:target", file, - "U:speed", speed, + "P:speed", speed, + "z:granularity", granularity, + "P:buf-size", buf_size, "s:sync", shallow ? "top" : "full", "s:mode", reuse ? "existing" : "absolute-paths", "S:format", format, diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index 4595eae..a325463 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -249,6 +249,8 @@ int qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, const char *file, const char *format, unsigned long long speed, + unsigned int granularity, + unsigned long long buf_size, unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorJSONDrivePivot(qemuMonitorPtr mon, diff --git a/tests/qemumonitorjsontest.c b/tests/qemumonitorjsontest.c index e3fb4f7..afbf13a 100644 --- a/tests/qemumonitorjsontest.c +++ b/tests/qemumonitorjsontest.c @@ -1176,7 +1176,7 @@ GEN_TEST_FUNC(qemuMonitorJSONRemoveNetdev, "net0") GEN_TEST_FUNC(qemuMonitorJSONDelDevice, "ide0") GEN_TEST_FUNC(qemuMonitorJSONAddDevice, "some_dummy_devicestr") GEN_TEST_FUNC(qemuMonitorJSONSetDrivePassphrase, "vda", "secret_passhprase") -GEN_TEST_FUNC(qemuMonitorJSONDriveMirror, "vdb", "/foo/bar", NULL, 1024, +GEN_TEST_FUNC(qemuMonitorJSONDriveMirror, "vdb", "/foo/bar", NULL, 1024, 0, 0, VIR_DOMAIN_BLOCK_REBASE_SHALLOW | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) GEN_TEST_FUNC(qemuMonitorJSONBlockCommit, "vdb", "/foo/bar1", "/foo/bar2", NULL, 1024) GEN_TEST_FUNC(qemuMonitorJSONDrivePivot, "vdb", NULL, NULL) -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
Upstream qemu 1.4 added some drive-mirror tunables not present when it was first introduced in 1.3. Management apps may want to set these in some cases (for example, without tuning granularity down to sector size, a copy may end up occupying more bytes than the original because an entire cluster is copied even when only a sector within the cluster is dirty, although tuning it down results in more CPU time to do the copy). I haven't personally needed to use the parameters, but since they exist, and since the new API supports virTypedParams, we might as well expose them.
Since the tuning parameters aren't often used, and omitted from the QMP command when unspecified, I think it is safe to rely on qemu 1.3 to issue an error about them being unsupported, rather than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where a bad granularity (such as non-power-of-2) gives a poor message: error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to name the parameter that was given a bad value, rather than the value passed to some other parameter). I don't see that a capability check will help, so we'll just live with it (and it has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add parameters. * src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror): Likewise. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror): Likewise. * src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise. (qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers. * src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise. * tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 18 +++++------------- src/qemu/qemu_migration.c | 2 +- src/qemu/qemu_monitor.c | 8 +++++--- src/qemu/qemu_monitor.h | 2 ++ src/qemu/qemu_monitor_json.c | 6 +++++- src/qemu/qemu_monitor_json.h | 2 ++ tests/qemumonitorjsontest.c | 2 +- 7 files changed, 21 insertions(+), 19 deletions(-)
ACK, Peter

We stupidly modeled block job bandwidth after migration bandwidth, which in turn was an 'unsigned long' and therefore subject to 32-bit vs. 64-bit interpretations. To work around the fact that 10-gigabit interfaces are possible but don't fit within 32 bits, the original interface took the number scaled as MiB/sec. But this scaling is rather coarse, and it might be nice to tune bandwidth finer than in megabyte chunks. Several of the block job calls that can set speed are fed through a common interface, so it was easier to adjust them all at once. Note that there is intentionally no flag for the new virDomainBlockCopy; there, since the API already uses a 64-bit type always, instead of a possible 32-bit type, and is brand new, it was easier to just avoid scaling issues. As with the previous patch that adjusted the query side, omitting the new flag preserves old behavior, and the documentation now mentions limits of what happens when a 32-bit machine is on either client or server side. * include/libvirt/libvirt.h.in (virDomainBlockJobSetSpeedFlags) (virDomainBlockPullFlags) (VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES) (VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES): New enums. * src/libvirt.c (virDomainBlockJobSetSpeed, virDomainBlockPull) (virDomainBlockRebase, virDomainBlockCommit): Document them. * src/qemu/qemu_driver.c (qemuDomainBlockJobSetSpeed) (qemuDomainBlockPull, qemuDomainBlockRebase) (qemuDomainBlockCommit, qemuDomainBlockJobImpl): Support new flag. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 31 +++++++++++++---- src/libvirt.c | 83 +++++++++++++++++++++++++++++++------------- src/qemu/qemu_driver.c | 61 +++++++++++++++++++------------- 3 files changed, 118 insertions(+), 57 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index aced31c..bffba06 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2617,11 +2617,24 @@ int virDomainGetBlockJobInfo(virDomainPtr dom, const char *disk, virDomainBlockJobInfoPtr info, unsigned int flags); -int virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, - unsigned long bandwidth, unsigned int flags); +/* Flags for use with virDomainBlockJobSetSpeed */ +typedef enum { + VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES = 1 << 0, /* bandwidth in bytes/s + instead of MiB/s */ +} virDomainBlockJobSetSpeedFlags; -int virDomainBlockPull(virDomainPtr dom, const char *disk, - unsigned long bandwidth, unsigned int flags); +int virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, + unsigned long bandwidth, unsigned int flags); + +/* Flags for use with virDomainBlockPull (values chosen to be a subset + * of the flags for virDomainBlockRebase) */ +typedef enum { + VIR_DOMAIN_BLOCK_PULL_BANDWIDTH_BYTES = 1 << 6, /* bandwidth in bytes/s + instead of MiB/s */ +} virDomainBlockPullFlags; + +int virDomainBlockPull(virDomainPtr dom, const char *disk, + unsigned long bandwidth, unsigned int flags); /** * virDomainBlockRebaseFlags: @@ -2640,11 +2653,13 @@ typedef enum { names */ VIR_DOMAIN_BLOCK_REBASE_COPY_DEV = 1 << 5, /* Treat destination as block device instead of file */ + VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES = 1 << 6, /* bandwidth in bytes/s + instead of MiB/s */ } virDomainBlockRebaseFlags; -int virDomainBlockRebase(virDomainPtr dom, const char *disk, - const char *base, unsigned long bandwidth, - unsigned int flags); +int virDomainBlockRebase(virDomainPtr dom, const char *disk, + const char *base, unsigned long bandwidth, + unsigned int flags); /** * virDomainBlockCopyFlags: @@ -2719,6 +2734,8 @@ typedef enum { VIR_DOMAIN_BLOCK_COMMIT_RELATIVE = 1 << 3, /* keep the backing chain referenced using relative names */ + VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES = 1 << 4, /* bandwidth in bytes/s + instead of MiB/s */ } virDomainBlockCommitFlags; int virDomainBlockCommit(virDomainPtr dom, const char *disk, const char *base, diff --git a/src/libvirt.c b/src/libvirt.c index 4806535..38bc1cb 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -19724,11 +19724,20 @@ virDomainGetBlockJobInfo(virDomainPtr dom, const char *disk, * virDomainBlockJobSetSpeed: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand - * @bandwidth: specify bandwidth limit in MiB/s - * @flags: extra flags; not used yet, so callers should always pass 0 + * @bandwidth: specify bandwidth limit; flags determine the unit + * @flags: bitwise-OR of virDomainBlockJobSetSpeedFlags * * Set the maximimum allowable bandwidth that a block job may consume. If - * bandwidth is 0, the limit will revert to the hypervisor default. + * bandwidth is 0, the limit will revert to the hypervisor default of + * unlimited. + * + * If @flags contains VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES, @bandwidth + * is in bytes/second; otherwise, it is in MiB/second. Values larger than + * 2^52 bytes/sec may be rejected due to overflow considerations based on + * the word size of both client and server, and values larger than 2^31 + * bytes/sec may cause overflow problems if later queried by + * virDomainGetBlockJobInfo() without scaling. Hypervisors may further + * restrict the range of valid bandwidth values. * * The @disk parameter is either an unambiguous source name of the * block device (the <source file='...'/> sub-element, such as @@ -19776,8 +19785,8 @@ virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, * virDomainBlockPull: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand - * @bandwidth: (optional) specify copy bandwidth limit in MiB/s - * @flags: extra flags; not used yet, so callers should always pass 0 + * @bandwidth: (optional) specify bandwidth limit; flags determine the unit + * @flags: bitwise-OR of virDomainBlockPullFlags * * Populate a disk image with data from its backing image. Once all data from * its backing image has been pulled, the disk no longer depends on a backing @@ -19794,12 +19803,20 @@ virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, * can be found by calling virDomainGetXMLDesc() and inspecting * elements within //domain/devices/disk. * - * The maximum bandwidth (in MiB/s) that will be used to do the copy can be - * specified with the bandwidth parameter. If set to 0, libvirt will choose a - * suitable default. Some hypervisors do not support this feature and will - * return an error if bandwidth is not 0; in this case, it might still be - * possible for a later call to virDomainBlockJobSetSpeed() to succeed. - * The actual speed can be determined with virDomainGetBlockJobInfo(). + * The maximum bandwidth that will be used to do the copy can be + * specified with the @bandwidth parameter. If set to 0, there is no + * limit. If @flags includes VIR_DOMAIN_BLOCK_PULL_BANDWIDTH_BYTES, + * @bandwidth is in bytes/second; otherwise, it is in MiB/second. + * Values larger than 2^52 bytes/sec may be rejected due to overflow + * considerations based on the word size of both client and server, + * and values larger than 2^31 bytes/sec may cause overflow problems + * if later queried by virDomainGetBlockJobInfo() without scaling. + * Hypervisors may further restrict the range of valid bandwidth + * values. Some hypervisors do not support this feature and will + * return an error if bandwidth is not 0; in this case, it might still + * be possible for a later call to virDomainBlockJobSetSpeed() to + * succeed. The actual speed can be determined with + * virDomainGetBlockJobInfo(). * * This is shorthand for virDomainBlockRebase() with a NULL base. * @@ -19844,7 +19861,7 @@ virDomainBlockPull(virDomainPtr dom, const char *disk, * @disk: path to the block device, or device shorthand * @base: path to backing file to keep, or device shorthand, * or NULL for no backing file - * @bandwidth: (optional) specify copy bandwidth limit in MiB/s + * @bandwidth: (optional) specify bandwidth limit; flags determine the unit * @flags: bitwise-OR of virDomainBlockRebaseFlags * * Populate a disk image with data from its backing image chain, and @@ -19923,12 +19940,20 @@ virDomainBlockPull(virDomainPtr dom, const char *disk, * example, "vda[3]" refers to the backing store with index equal to "3" * in the chain of disk "vda". * - * The maximum bandwidth (in MiB/s) that will be used to do the copy can be - * specified with the bandwidth parameter. If set to 0, libvirt will choose a - * suitable default. Some hypervisors do not support this feature and will - * return an error if bandwidth is not 0; in this case, it might still be - * possible for a later call to virDomainBlockJobSetSpeed() to succeed. - * The actual speed can be determined with virDomainGetBlockJobInfo(). + * The maximum bandwidth that will be used to do the copy can be + * specified with the @bandwidth parameter. If set to 0, there is no + * limit. If @flags includes VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES, + * @bandwidth is in bytes/second; otherwise, it is in MiB/second. + * Values larger than 2^52 bytes/sec may be rejected due to overflow + * considerations based on the word size of both client and server, + * and values larger than 2^31 bytes/sec may cause overflow problems + * if later queried by virDomainGetBlockJobInfo() without scaling. + * Hypervisors may further restrict the range of valid bandwidth + * values. Some hypervisors do not support this feature and will + * return an error if bandwidth is not 0; in this case, it might still + * be possible for a later call to virDomainBlockJobSetSpeed() to + * succeed. The actual speed can be determined with + * virDomainGetBlockJobInfo(). * * When @base is NULL and @flags is 0, this is identical to * virDomainBlockPull(). When @flags contains VIR_DOMAIN_BLOCK_REBASE_COPY, @@ -20110,7 +20135,7 @@ virDomainBlockCopy(virDomainPtr dom, const char *disk, * or NULL for default * @top: path to file within backing chain that contains data to be merged, * or device shorthand, or NULL to merge all possible data - * @bandwidth: (optional) specify commit bandwidth limit in MiB/s + * @bandwidth: (optional) specify bandwidth limit; flags determine the unit * @flags: bitwise-OR of virDomainBlockCommitFlags * * Commit changes that were made to temporary top-level files within a disk @@ -20188,12 +20213,20 @@ virDomainBlockCopy(virDomainPtr dom, const char *disk, * example, "vda[3]" refers to the backing store with index equal to "3" * in the chain of disk "vda". * - * The maximum bandwidth (in MiB/s) that will be used to do the commit can be - * specified with the bandwidth parameter. If set to 0, libvirt will choose a - * suitable default. Some hypervisors do not support this feature and will - * return an error if bandwidth is not 0; in this case, it might still be - * possible for a later call to virDomainBlockJobSetSpeed() to succeed. - * The actual speed can be determined with virDomainGetBlockJobInfo(). + * The maximum bandwidth that will be used to do the commit can be + * specified with the @bandwidth parameter. If set to 0, there is no + * limit. If @flags includes VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES, + * @bandwidth is in bytes/second; otherwise, it is in MiB/second. + * Values larger than 2^52 bytes/sec may be rejected due to overflow + * considerations based on the word size of both client and server, + * and values larger than 2^31 bytes/sec may cause overflow problems + * if later queried by virDomainGetBlockJobInfo() without scaling. + * Hypervisors may further restrict the range of valid bandwidth + * values. Some hypervisors do not support this feature and will + * return an error if bandwidth is not 0; in this case, it might still + * be possible for a later call to virDomainBlockJobSetSpeed() to + * succeed. The actual speed can be determined with + * virDomainGetBlockJobInfo(). * * Returns 0 if the operation has started, -1 on failure. */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e597d68..2efcbcb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15125,14 +15125,19 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, } } - /* Convert bandwidth MiB to bytes */ - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - goto endjob; + /* Convert bandwidth MiB to bytes, if needed */ + if ((mode == BLOCK_JOB_SPEED && + !(flags & VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES)) || + (mode == BLOCK_JOB_PULL && + !(flags & VIR_DOMAIN_BLOCK_PULL_BANDWIDTH_BYTES))) { + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto endjob; + } + speed <<= 20; } - speed <<= 20; qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, @@ -15334,7 +15339,7 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; - virCheckFlags(0, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES, -1); if (!(vm = qemuDomObjFromDomain(dom))) return -1; @@ -15555,7 +15560,8 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, VIR_DOMAIN_BLOCK_REBASE_COPY | VIR_DOMAIN_BLOCK_REBASE_COPY_RAW | VIR_DOMAIN_BLOCK_REBASE_RELATIVE | - VIR_DOMAIN_BLOCK_REBASE_COPY_DEV, -1); + VIR_DOMAIN_BLOCK_REBASE_COPY_DEV | + VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES, -1); if (!(vm = qemuDomObjFromDomain(dom))) return -1; @@ -15579,14 +15585,16 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) dest->format = VIR_STORAGE_FILE_RAW; - /* Convert bandwidth MiB to bytes */ - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - goto cleanup; + /* Convert bandwidth MiB to bytes, if necessary */ + if (!(flags & VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES)) { + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto cleanup; + } + speed <<= 20; } - speed <<= 20; /* XXX: If we are doing a shallow copy but not reusing an external * file, we should attempt to pre-create the destination with a @@ -15692,7 +15700,7 @@ qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) { virDomainObjPtr vm; - virCheckFlags(0, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_PULL_BANDWIDTH_BYTES, -1); if (!(vm = qemuDomObjFromDomain(dom))) return -1; @@ -15737,7 +15745,8 @@ qemuDomainBlockCommit(virDomainPtr dom, /* XXX Add support for COMMIT_DELETE */ virCheckFlags(VIR_DOMAIN_BLOCK_COMMIT_SHALLOW | VIR_DOMAIN_BLOCK_COMMIT_ACTIVE | - VIR_DOMAIN_BLOCK_COMMIT_RELATIVE, -1); + VIR_DOMAIN_BLOCK_COMMIT_RELATIVE | + VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES, -1); if (!(vm = qemuDomObjFromDomain(dom))) goto cleanup; @@ -15763,14 +15772,16 @@ qemuDomainBlockCommit(virDomainPtr dom, goto endjob; } - /* Convert bandwidth MiB to bytes */ - if (speed > LLONG_MAX >> 20) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth must be less than %llu"), - LLONG_MAX >> 20); - goto endjob; + /* Convert bandwidth MiB to bytes, if necessary */ + if (!(flags & VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES)) { + if (speed > LLONG_MAX >> 20) { + virReportError(VIR_ERR_OVERFLOW, + _("bandwidth must be less than %llu"), + LLONG_MAX >> 20); + goto endjob; + } + speed <<= 20; } - speed <<= 20; device = qemuDiskPathToAlias(vm, path, &idx); if (!device) -- 1.9.3

On 08/31/14 06:02, Eric Blake wrote:
We stupidly modeled block job bandwidth after migration bandwidth, which in turn was an 'unsigned long' and therefore subject to 32-bit vs. 64-bit interpretations. To work around the fact that 10-gigabit interfaces are possible but don't fit within 32 bits, the original interface took the number scaled as MiB/sec. But this scaling is rather coarse, and it might be nice to tune bandwidth finer than in megabyte chunks.
Several of the block job calls that can set speed are fed through a common interface, so it was easier to adjust them all at once. Note that there is intentionally no flag for the new virDomainBlockCopy; there, since the API already uses a 64-bit type always, instead of a possible 32-bit type, and is brand new, it was easier to just avoid scaling issues. As with the previous patch that adjusted the query side, omitting the new flag preserves old behavior, and the documentation now mentions limits of what happens when a 32-bit machine is on either client or server side.
* include/libvirt/libvirt.h.in (virDomainBlockJobSetSpeedFlags) (virDomainBlockPullFlags) (VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES) (VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES): New enums. * src/libvirt.c (virDomainBlockJobSetSpeed, virDomainBlockPull) (virDomainBlockRebase, virDomainBlockCommit): Document them. * src/qemu/qemu_driver.c (qemuDomainBlockJobSetSpeed) (qemuDomainBlockPull, qemuDomainBlockRebase) (qemuDomainBlockCommit, qemuDomainBlockJobImpl): Support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 31 +++++++++++++---- src/libvirt.c | 83 +++++++++++++++++++++++++++++++------------- src/qemu/qemu_driver.c | 61 +++++++++++++++++++------------- 3 files changed, 118 insertions(+), 57 deletions(-)
ACK, the better granularity may help today, while we are able to keep the huge range for future expansion. Peter

On Sat, Aug 30, 2014 at 22:02:18 -0600, Eric Blake wrote:
Took me longer than I wanted to get v3 posted. There's lots of new patches in this version, based on feedback on v2.
Among other things, the bandwidth of virDomainBlockCopy is in bytes/s, and all the remaining interfaces are updated to use a flags value to decide whether to use bytes/s instead of the back-compat MiB/s.
I really want patch 1 in 1.2.8; pasth 2-17 would be nice but it may be better to delay to later, and I still don't have virsh updated to cover the changed proposed in patch 18 so that one should probably wait.
Patch 1 is easy and safe to be included in 1.2.8 but I think it's way too late for the rest. However, since the implementation will be missing from 1.2.8, I think the best solution would be to just revert the patch adding this public API and let it slip to the next release. Jirka

On Mon, Sep 01, 2014 at 09:39:35 +0200, Jiri Denemark wrote:
On Sat, Aug 30, 2014 at 22:02:18 -0600, Eric Blake wrote:
Took me longer than I wanted to get v3 posted. There's lots of new patches in this version, based on feedback on v2.
Among other things, the bandwidth of virDomainBlockCopy is in bytes/s, and all the remaining interfaces are updated to use a flags value to decide whether to use bytes/s instead of the back-compat MiB/s.
I really want patch 1 in 1.2.8; pasth 2-17 would be nice but it may be better to delay to later, and I still don't have virsh updated to cover the changed proposed in patch 18 so that one should probably wait.
Patch 1 is easy and safe to be included in 1.2.8 but I think it's way too late for the rest. However, since the implementation will be missing from 1.2.8, I think the best solution would be to just revert the patch adding this public API and let it slip to the next release.
On the other side, we already did a mistake and pushed API when its implementation was not ready yet and given that the API doesn't seem to need any changes (except for the tiny one in 1/18), there's no strong reason to revert it (except of for it being unsupported by all drivers). It would be nice to at least have the remote driver implementation in place but it's probably too late for that too. Jirka
participants (3)
-
Eric Blake
-
Jiri Denemark
-
Peter Krempa